
Foundations of Statistics: Data, Sampling, and Experimental Design

Study Guide - Smart Notes


Introduction to Statistics

Statistics is the science of planning studies and experiments, obtaining data, and organizing, summarizing, presenting, analyzing, and interpreting those data to draw conclusions. Understanding the foundational concepts of data, populations, samples, and statistical significance is essential for effective study and application of statistics.

Data and Types of Data

Definition of Data

  • Data: Collections of observations, such as measurements, genders, or survey responses.

Types of Data

  • Quantitative (Numerical) Data: Numbers representing counts or measurements. Example: heights, weights, ages.

  • Qualitative (Categorical) Data: Names or labels that represent categories. Example: gender, eye color.

Subtypes of Quantitative Data

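  • Discrete Data: The possible values can be counted; there are either finitely many of them or they can be listed as 0, 1, 2, and so on. Example: number of students in a class.

  • Continuous Data: The possible values fall on a continuous scale with no gaps, so they cannot be counted; such data usually come from measuring. Example: lengths, weights.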

Populations, Samples, and Censuses

Definitions

  • Population: The complete collection of all measurements or data that are being considered. It is the group about which we want to draw conclusions.

  • Sample: A subcollection of members selected from a population.

  • Census: The collection of data from every member of the population.

Example

  • In a survey of 1046 adults conducted by Bradley Corporation, subjects were asked how often they wash their hands when using a public restroom, and 70% of the respondents said "always." Here, the sample is the 1046 adults surveyed, and the population is all adults who use a public restroom.

Statistical Significance and Practical Significance

Definitions

  • Statistical Significance: Results are statistically significant if they are unlikely to occur by chance.

  • Practical Significance: Results have practical significance if they are large enough to be meaningful in real-world terms.

Example

  • If a program increases IQ scores by 3 points on average, and a gain of that size would be very unlikely to occur by chance, the result is statistically significant. It may still lack practical significance, because 3 points may not make a meaningful difference.
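To make "unlikely to occur by chance" concrete, here is a minimal sketch that computes an exact chance probability. The coin-flip scenario, its numbers, and the function name `prob_at_least` are illustrative assumptions and do not come from the notes.

```python
from math import comb

def prob_at_least(k_min: int, n: int, p: float = 0.5) -> float:
    """Probability of k_min or more successes in n independent trials,
    each succeeding with probability p (binomial model)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Hypothetical scenario: a fair coin is flipped 20 times and lands heads 16 times.
# How likely is a result at least that extreme if only chance is at work?
chance = prob_at_least(16, 20)
print(f"P(16 or more heads in 20 flips) = {chance:.4f}")
```

The probability is roughly 0.006, so 16 heads in 20 flips would be hard to explain by chance alone. Whether such a result matters in practice is a separate question that the calculation cannot answer.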

Common Pitfalls in Data Collection and Interpretation

  • Correlation Does Not Imply Causation: Just because two variables are correlated does not mean one causes the other.

  • Loaded Questions: Questions intentionally worded to elicit a desired response can bias results.

  • Voluntary Response Sample: A sample in which respondents decide whether to participate, often leading to bias.

  • Nonresponse/Low Response: When a significant portion of the sample does not respond, results may be biased.

  • Errors in Math: Be careful with ratios, fractions, decimals, and percentages. For example, a quantity that cannot be negative can decrease by at most 100%, which brings it to zero, so a claimed decrease of more than 100% signals an error.
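The percentage arithmetic behind these pitfalls can be checked with a short sketch. The helper names `percent_of` and `percent_change` are hypothetical; the 70% and 1046 figures come from the hand-washing survey example earlier in these notes, and the drop from 50 to 0 is a made-up illustration.

```python
def percent_of(percent: float, whole: float) -> float:
    """Return the amount that is `percent` percent of `whole`."""
    return percent / 100 * whole

def percent_change(old: float, new: float) -> float:
    """Return the change from `old` to `new` as a percentage of `old`."""
    return (new - old) / old * 100

# Survey example: 70% of 1046 adults answered "always".
always = percent_of(70, 1046)
print(round(always))            # about 732 respondents

# Made-up illustration: falling from 50 to 0 is a 100% decrease,
# the largest decrease a non-negative quantity can have.
print(percent_change(50, 0))    # -100.0
```

The survey count is not a whole number before rounding because the reported 70% is itself a rounded figure.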

Parameters and Statistics

  • Parameter: A numerical measurement describing some characteristic of a population.

  • Statistic: A numerical measurement describing some characteristic of a sample.

Example

  • If a study covers all 70,081 people who attended the Super Bowl, the average age calculated is a parameter, because it describes the entire population. If a sample of 400 babies is studied and 51% are girls, that 51% is a statistic, because it describes only the sample; the corresponding percentage for the entire population of babies would be a parameter.
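The distinction can be sketched in code. The population below is randomly generated, made-up data (not the Super Bowl figures), and the seed, the age range, and the sample size of 400 are assumptions chosen for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: ages of 10,000 people (made-up data).
population = [random.randint(18, 80) for _ in range(10_000)]

# A parameter is computed from every member of the population.
parameter = mean(population)

# A statistic is computed from a sample drawn from that population.
sample = random.sample(population, 400)
statistic = mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two values are usually close but not identical. The statistic changes from sample to sample, while the parameter is fixed for a given population.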

Levels of Measurement

  • Nominal: Categories only; cannot be arranged in order. Example: eye color, names.

  • Ordinal: Categories can be arranged in order, but differences are not meaningful. Example: course grades.

  • Interval: Data can be ordered, and differences are meaningful, but there is no natural zero. Example: temperature in Celsius.

  • Ratio: Data can be ordered, differences are meaningful, and there is a natural zero. Example: heights, weights.
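The difference between the interval and ratio levels shows up when forming ratios. The sketch below uses two made-up temperatures and converts them to kelvins, a scale with a natural zero; the Kelvin comparison and the function name are additions for illustration and are not part of the notes.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to kelvins (a scale with a true zero)."""
    return celsius + 273.15

c_low, c_high = 10.0, 20.0   # made-up temperatures in degrees Celsius

# Differences are meaningful at the interval level.
difference = c_high - c_low

# Ratios are not: 20 °C is not "twice as hot" as 10 °C,
# because 0 °C does not mean "no heat".
naive_ratio = c_high / c_low

# On the Kelvin scale, zero is a natural zero, so the ratio is meaningful.
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

print(difference)              # 10.0
print(naive_ratio)             # 2.0 (misleading)
print(round(kelvin_ratio, 3))  # 1.035
```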

Experimental Design and Observational Studies

Definitions

  • Experiment: Some treatment is applied, and the effects are observed.

  • Observational Study: Observes and measures characteristics without modifying the subjects.

  • Lurking Variable: A variable that affects the variables in the study but is not included in the study.

Design of Experiments

  • Replication: Repetition of an experiment on more than one individual to ensure reliability.

  • Single-blind: Subjects do not know whether they are receiving a treatment or placebo.

  • Double-blind: Neither subjects nor researchers know who receives the treatment or placebo.

  • Randomization: Assigning subjects to groups by random selection.

  • Placebo Effect: Improvement due to the belief in the treatment rather than the treatment itself.
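Randomization can be sketched as shuffling the subjects and splitting them into two groups. The subject labels, the equal group sizes, the seed, and the function name `randomize` are assumptions made for this example.

```python
import random

def randomize(subjects, seed=None):
    """Randomly split subjects into a treatment group and a placebo group."""
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subjects, identified only by a label.
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, placebo = randomize(subjects, seed=42)

print("treatment:", treatment)
print("placebo:  ", placebo)
```

In a double-blind design, the resulting assignment list would be kept from both the subjects and the researchers who interact with them until the data are collected.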

Sampling Methods

  • Simple Random Sample: Every possible sample of a given size has the same chance of being selected.

  • Systematic Sample: Choose a starting point, then select every k-th member of the population.

  • Convenience Sample: Use data that are easy to obtain.

  • Stratified Sample: Divide the population into subgroups (strata) whose members share a characteristic, then draw a sample from each stratum.

  • Cluster Sample: Divide the population into clusters, randomly select some clusters, and use all members from those clusters.

  • Multistage Sample: Combine several sampling methods.
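Several of these methods can be sketched with Python's standard `random` module. The population of 100 numbered IDs, the sample sizes, the odd/even strata, and the clusters of 10 consecutive IDs are all hypothetical choices made for illustration.

```python
import random

rng = random.Random(1)               # fixed seed for reproducibility
population = list(range(1, 101))     # hypothetical member IDs 1..100

# Simple random sample: every possible set of 10 members is equally likely.
simple_random = rng.sample(population, 10)

# Systematic sample: random starting point, then every k-th member.
k = 10
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sample: split into strata, then sample from each stratum.
strata = {
    "odd": [x for x in population if x % 2 == 1],
    "even": [x for x in population if x % 2 == 0],
}
stratified = [x for members in strata.values() for x in rng.sample(members, 5)]

# Cluster sample: split into clusters, randomly pick some clusters,
# and keep every member of the chosen clusters.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster_sample = [x for cluster in rng.sample(clusters, 2) for x in cluster]

print(simple_random, systematic, stratified, cluster_sample, sep="\n")
```

A convenience sample has no counterpart here because it involves no random selection. A multistage sample would chain two or more of these steps, for example choosing clusters first and then drawing a simple random sample within each.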

Summary Table: Types of Data and Levels of Measurement

Type of Data              | Definition                                   | Example
Quantitative (Numerical)  | Numbers representing counts or measurements  | Height, weight, age
Qualitative (Categorical) | Names or labels representing categories      | Gender, eye color

Level of Measurement | Definition                                     | Example
Nominal              | Categories only, no order                      | Eye color
Ordinal              | Ordered categories, differences not meaningful | Course grades (A, B, C)
Interval             | Ordered, differences meaningful, no true zero  | Temperature (°C)
Ratio                | Ordered, differences meaningful, true zero     | Height, weight

Key Formulas

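  • Percentage Calculation: $\text{percentage} = \frac{\text{part}}{\text{whole}} \times 100\%$

  • Sample Mean: $\bar{x} = \frac{\sum x}{n}$, where $n$ is the number of values in the sample.

  • Population Mean: $\mu = \frac{\sum x}{N}$, where $N$ is the number of values in the population.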

Conclusion

Understanding the basic terminology and concepts in statistics is crucial for collecting, analyzing, and interpreting data accurately. Mastery of these foundational ideas prepares students for more advanced topics in statistical inference and data analysis.
