Defining and Collecting Data: Foundations of Business Statistics
Defining and Collecting Data
Objectives of Data Collection in Business Statistics
Understanding how to define, classify, and collect data is fundamental in business statistics. This topic introduces the key concepts and processes involved in preparing data for analysis.
Defining Variables: Recognize the importance of clearly specifying what is being measured or observed.
Measurement Scales: Understand the different ways variables can be measured and categorized.
Data Collection Methods: Learn how to gather data efficiently and accurately.
Sampling Techniques: Identify various methods for selecting representative samples.
Survey Errors: Be aware of common errors and biases in data collection.
Classifying Variables
Types of Variables
Variables are the characteristics or properties that are measured or observed in a study. They are classified as follows:
Categorical (Qualitative) Variables: Take on values that are categories, such as "yes"/"no" or colors like "blue", "brown", "green".
Numerical (Quantitative) Variables: Represent quantities that can be counted or measured.
Discrete Variables: Arise from a counting process (e.g., number of text messages sent).
Continuous Variables: Arise from a measuring process (e.g., time taken to download an app).
Examples of Variable Types
| Question | Responses | Variable Type |
|---|---|---|
| Do you have a Facebook profile? | Yes or No | Categorical |
| How many text messages have you sent in the past three days? | A whole-number count | Numerical (discrete) |
| How long did the mobile app update take to download? | A measured time | Numerical (continuous) |
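As a rough illustration of the classification above, the sketch below guesses a variable type from the Python type of one recorded response. The function name `classify_response` and the sample answers are hypothetical, and the rule is only a heuristic: a ZIP code stored as an integer, for instance, is still categorical.

```python
def classify_response(value):
    """Guess a variable type from a single recorded response (heuristic only)."""
    # bool is grouped with str because yes/no answers are categories,
    # and because bool is a subclass of int in Python.
    if isinstance(value, (str, bool)):
        return "categorical"
    if isinstance(value, int):
        return "numerical (discrete)"      # arises from counting
    if isinstance(value, float):
        return "numerical (continuous)"    # arises from measuring
    raise TypeError(f"unsupported response type: {type(value).__name__}")


# Hypothetical answers to the three survey questions in the table above.
responses = {
    "Do you have a Facebook profile?": "Yes",
    "How many text messages have you sent in the past three days?": 42,
    "How long did the mobile app update take to download?": 12.7,
}

for question, answer in responses.items():
    print(f"{question} -> {classify_response(answer)}")
```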
Measurement Scales
Types of Measurement Scales
Measurement scales determine how variables are categorized and interpreted:
Nominal Scale: Classifies data into distinct categories with no implied ranking. Example: Cellular provider (AT&T, Sprint, Verizon, Other, None).
Ordinal Scale: Categorizes data with a meaningful order but without consistent intervals. Example: Student class designation (Freshman, Sophomore, Junior, Senior).
Interval Scale: Ordered scale where differences between values are meaningful, but there is no true zero point. Example: Temperature in degrees Celsius or Fahrenheit.
Ratio Scale: Ordered scale with meaningful differences and a true zero point. Example: Height, weight, age, salary.
Interval and Ratio Scales Table
| Numerical Variable | Level of Measurement |
|---|---|
| Temperature (Celsius/Fahrenheit) | Interval |
| Standardized exam score | Interval |
| Height (inches/centimeters) | Ratio |
| Weight (pounds/kilograms) | Ratio |
| Age (years/days) | Ratio |
| Salary (dollars/yen) | Ratio |
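One way to see why the true zero point matters: on a ratio scale, "twice as much" stays true after a change of units, while on an interval scale it does not. The sketch below is a minimal check using the standard Celsius-to-Fahrenheit formula and an approximate kilogram-to-pound factor; the function names and the chosen values are illustrative assumptions.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion; note the +32 offset, so 0 degrees C is not 'no temperature'."""
    return celsius * 9 / 5 + 32


def kilograms_to_pounds(kilograms):
    """Approximate conversion; a pure rescaling, so zero stays zero."""
    return kilograms * 2.20462


# Interval scale: the ratio 20 C / 10 C is not preserved in Fahrenheit.
celsius_ratio = 20 / 10
fahrenheit_ratio = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Interval scale: equal differences stay equal after conversion.
diff_low = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_high = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20)

# Ratio scale: 80 kg is twice 40 kg, and the same holds in pounds.
kilogram_ratio = 80 / 40
pound_ratio = kilograms_to_pounds(80) / kilograms_to_pounds(40)

print(f"Temperature ratio: {celsius_ratio} in Celsius, {fahrenheit_ratio} in Fahrenheit")
print(f"Temperature differences in Fahrenheit: {diff_low} and {diff_high}")
print(f"Weight ratio: {kilogram_ratio} in kilograms, {pound_ratio:.2f} in pounds")
```

So statements such as "20 degrees is twice as hot as 10 degrees" are not meaningful for interval data, whereas differences are.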
Types of Variables: Classification
Hierarchical Structure of Variables
Variables can be organized as follows:
| Type | Subtype | Examples |
|---|---|---|
| Categorical | Nominal | Marital Status, Political Party, Eye Color |
| Categorical | Ordinal | Ratings (Good, Better, Best), Student Grades (A, B, C, D) |
| Numerical | Discrete | Number of Children, Defects per hour |
| Numerical | Continuous | Weight, Voltage |
Population and Sample
Definitions and Importance
Understanding the distinction between population and sample is crucial for statistical inference:
Population: All items or individuals of interest in a study.
Sample: A subset of the population selected for analysis.
Population vs. Sample Table
| Population | Sample |
|---|---|
| All items/individuals about which you want to reach conclusions | A portion of the population |
| Size 40 (example) | Size 4 (example) |
Parameters and Statistics
Key Concepts
Parameter: A summary value describing a characteristic of a population.
Statistic: A summary value describing a characteristic of a sample.
Example: If 30% of a sample of mall shoppers used the food court, 30% is the statistic; the true proportion in the population is the parameter.
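The distinction can be sketched in a few lines. The population below is invented for illustration: 1,000 mall shoppers, of whom exactly 300 used the food court, so the parameter is known to be 0.30. In practice the parameter is usually unknown and the statistic is used to estimate it. The seed and sample size are arbitrary assumptions.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical population: True means the shopper used the food court.
population = [True] * 300 + [False] * 700

# Parameter: computed from every member of the population.
parameter = sum(population) / len(population)

# Statistic: computed from a random sample of 100 shoppers.
sample = rng.sample(population, 100)
statistic = sum(sample) / len(sample)

print(f"Population proportion (parameter): {parameter:.2f}")
print(f"Sample proportion (statistic):     {statistic:.2f}")
```

The statistic will usually land near the parameter without matching it exactly, and a different seed gives a different statistic while the parameter stays fixed.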
Sources of Data
Data Collection Activities
Capturing data from ongoing business activities
Distributing data compiled by organizations or individuals
Compiling survey responses
Conducting designed experiments
Conducting observational studies
Examples
Business: Fraud detection from transaction records
Economics: Forecasting using search engine data
Marketing: Website effectiveness tracking
Surveys: Product satisfaction, political polls
Experiments: Product testing, material selection
Observational: Focus groups, traffic measurement
Primary vs. Secondary Data Sources
Definitions
Primary Source: Data collected directly by the analyst (e.g., surveys, experiments).
Secondary Source: Data collected by others and used for analysis (e.g., census data, published reports).
Sampling Frame and Sampling Methods
Sampling Frame
A list of items that make up the population.
Frames can be population lists, directories, or maps.
Excluding groups from the frame can lead to bias.
Types of Samples
| Sample Type | Subtype | Description |
|---|---|---|
| Nonprobability | Convenience | Easy or inexpensive to sample |
| Nonprobability | Judgment | Expert opinions |
| Probability | Simple Random | Equal chance for all items |
| Probability | Systematic | Select every k-th item |
| Probability | Stratified | Sample from subgroups (strata) |
| Probability | Cluster | Sample entire clusters |
Probability Sampling Methods
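Simple Random Sample: Every item in the frame has the same chance of selection, and every possible sample of a given size is equally likely to be chosen.
Systematic Sample: Divide the frame of N items into n groups of k = N/n items each, pick a random starting item from the first k, then select every k-th item after it.
Stratified Sample: Divide the population into strata based on a shared characteristic, then draw a simple random sample from each stratum, typically in proportion to the stratum's size.
Cluster Sample: Divide the population into clusters, randomly select clusters, and sample all or some items within the selected clusters.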
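The four probability methods can be sketched with Python's standard `random` module. Everything specific here is an assumption made for illustration: a frame of 40 numbered items, a sample of 4, two strata named `full_time` and `part_time`, and four clusters of 10 items. The function names are hypothetical, and the cluster version samples every item in each chosen cluster.

```python
import random


def simple_random_sample(frame, n, rng):
    """Draw n items without replacement; every subset of size n is equally likely."""
    return rng.sample(frame, n)


def systematic_sample(frame, n, rng):
    """Pick a random start among the first k items, then take every k-th item."""
    k = len(frame) // n
    start = rng.randrange(k)
    return [frame[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction, rng):
    """Take a simple random sample from each stratum, proportional to its size."""
    sample = []
    for items in strata.values():
        n = round(len(items) * fraction)
        sample.extend(rng.sample(items, n))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every item in them."""
    chosen = rng.sample(list(clusters), n_clusters)
    return [item for name in chosen for item in clusters[name]]


rng = random.Random(1)            # fixed seed so the run is repeatable
frame = list(range(1, 41))        # hypothetical frame: items numbered 1..40

strata = {"full_time": frame[:30], "part_time": frame[30:]}
clusters = {"A": frame[:10], "B": frame[10:20], "C": frame[20:30], "D": frame[30:]}

srs = simple_random_sample(frame, 4, rng)
systematic = systematic_sample(frame, 4, rng)
stratified = stratified_sample(strata, 0.10, rng)
clustered = cluster_sample(clusters, 1, rng)

print("Simple random:", srs)
print("Systematic:   ", systematic)
print("Stratified:   ", stratified)
print("Cluster:      ", clustered)
```

With these sizes the systematic sample always consists of items spaced 10 apart, and the stratified sample always contains three full-time items and one part-time item, whichever items the random draws happen to pick.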
Survey Errors and Ethical Issues
Types of Survey Errors
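Coverage Error: Some groups are excluded from the sampling frame and have no chance of being selected, which leads to selection bias.
Nonresponse Error: People who do not respond may differ from those who do, which leads to nonresponse bias.
Sampling Error: Chance variation from one sample to the next; it is present in every sample.
Measurement Error: Inaccurate responses caused by poorly worded questions, interviewer influence, or respondent error.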
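Two of these errors are easy to simulate. The sketch below uses an invented population of 1,000 people split into a `landline` group (40% approve) and a `mobile_only` group (70% approve); the group names, sizes, and rates are assumptions for illustration only. Repeated random samples show sampling error, and a frame that omits one group shows coverage error.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical population: 1 means "approves", 0 means "does not approve".
landline = [1] * 240 + [0] * 360       # 600 people, 40% approve
mobile_only = [1] * 280 + [0] * 120    # 400 people, 70% approve
population = landline + mobile_only

true_proportion = sum(population) / len(population)

# Sampling error: five random samples of 100 give five different estimates.
estimates = [sum(rng.sample(population, 100)) / 100 for _ in range(5)]

# Coverage error: a frame listing only landline owners misses 400 people,
# so even surveying the entire frame gives a biased answer.
frame_proportion = sum(landline) / len(landline)

print(f"True proportion:          {true_proportion:.2f}")
print(f"Estimates from 5 samples: {estimates}")
print(f"Proportion in the frame:  {frame_proportion:.2f}")
```

Sampling error shrinks as the sample grows, but the coverage gap does not: drawing more people from an incomplete frame only reproduces the frame's bias more precisely.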
Ethical Issues
Intentional bias through coverage or nonresponse error
Failure to report margin of error
Leading questions or interviewer influence
Respondent dishonesty
Summary
This chapter covers the foundational concepts of defining and collecting data in business statistics, including variable classification, measurement scales, sampling methods, sources of data, and common errors and ethical considerations in survey research.