Statistics Study Guide: Data Types, Sampling, Descriptive Statistics, and Data Visualization

Data Types and Measurement Scales

Qualitative and Quantitative Variables

Variables in statistics are classified based on their nature and the type of data they represent.

  • Qualitative (Categorical) Variables: Describe qualities or categories (e.g., brand of cell phone, color).

  • Quantitative Variables: Represent numerical values that can be counted or measured.

  • Quantitative Discrete: Countable values (e.g., number of messages sent).

  • Quantitative Continuous: Measurable values within a range (e.g., monthly cell bill).

Example: The number of text messages sent in one month is a quantitative discrete variable.

Levels of Measurement

Variables can be measured at different levels:

  • Nominal: Categories without order (e.g., cell phone brand).

  • Ordinal: Categories with a meaningful order (e.g., rating stars).

  • Interval: Ordered, equal intervals, no true zero (e.g., temperature in °C or °F).

  • Ratio: Ordered, equal intervals, true zero (e.g., weight).

Example: The actual weight of cereal in a box is a ratio variable.

Descriptive and Inferential Statistics

Parameters vs. Statistics

A parameter describes a characteristic of a population, while a statistic describes a characteristic of a sample.

  • Parameter: The average salary of all employees at a company.

  • Statistic: The average salary of a sample of employees.

Experimental and Observational Studies

Studies can be classified as:

  • Experimental Study: Researcher manipulates variables (e.g., dividing patients into treatment and placebo groups).

  • Observational Study: Researcher observes without intervention (e.g., comparing cancer rates in different populations).

Sampling Methods

Types of Sampling

Sampling is the process of selecting a subset of individuals from a population.

  • Simple Random Sampling: Every member of the population has an equal chance of selection.

  • Stratified Sampling: The population is divided into subgroups (strata), and a random sample is taken from each stratum.

  • Cluster Sampling: The population is divided into clusters, some clusters are randomly selected, and the members of the selected clusters are sampled.

  • Systematic Sampling: Every nth member is selected, usually after a random starting point.

  • Convenience Sampling: Sample is taken from easily accessible members.

Example: Selecting every 5th cereal box from a shelf is systematic sampling.
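
The sampling methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the fixed seeds, and the shelf of 20 numbered cereal boxes are hypothetical and are not part of the original notes.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n members so that each member is equally likely to be chosen."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return rng.sample(population, n)

def systematic_sample(population, step, start=0):
    """Take every step-th member, beginning at index start."""
    return population[start::step]

def stratified_sample(strata, n_per_stratum, seed=0):
    """Draw a random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters and keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical shelf of 20 cereal boxes, numbered 1 to 20.
boxes = list(range(1, 21))

# Every 5th box, starting with the 5th: boxes 5, 10, 15, 20.
every_fifth = systematic_sample(boxes, step=5, start=4)
```

Convenience sampling has no random step to code: it simply takes whichever members are easiest to reach, which is why it is prone to bias.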

Descriptive Statistics: Measures of Central Tendency and Spread

Mean, Median, and Mode

These are measures of central tendency:

  • Mean (μ for a population, x̄ for a sample): The arithmetic average of the values.

  • Median: The middle value when data is ordered.

  • Mode: The most frequently occurring value.

Formula for Mean:

$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$ (population mean)

$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ (sample mean)
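
As a quick check on the three measures, here is a minimal sketch using Python's standard `statistics` module. The list of monthly message counts is a made-up data set chosen for easy arithmetic, not data from the notes.

```python
import statistics

# Hypothetical sample: text messages sent by five people in one month.
messages = [12, 15, 15, 20, 38]

mean = statistics.mean(messages)      # (12 + 15 + 15 + 20 + 38) / 5 = 20
median = statistics.median(messages)  # middle value of the ordered list = 15
mode = statistics.mode(messages)      # most frequent value = 15
```

The single large value (38) pulls the mean above the median, which is why the median is often preferred for skewed data.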

Standard Deviation and Variance

These measure the spread of data:

  • Standard Deviation (σ for population, s for sample): Measures the typical distance of data values from the mean.

  • Variance (σ² for population, s² for sample): The square of the standard deviation.

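Formulas:

Population standard deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$

Sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

Population variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$

Sample variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$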
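
The difference between the population and sample versions (dividing by the number of values versus one less than that number) can be seen in a short sketch with Python's standard `statistics` module. The eight data values are hypothetical and chosen so the arithmetic is easy to follow by hand.

```python
import math
import statistics

# Hypothetical data: mean is 5, squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)  # 32 / 8 = 4   (divide by N)
pop_sd = statistics.pstdev(data)      # sqrt(4) = 2
samp_var = statistics.variance(data)  # 32 / 7       (divide by n - 1)
samp_sd = statistics.stdev(data)      # sqrt(32 / 7)

# In both cases the variance is the square of the standard deviation.
assert math.isclose(pop_sd ** 2, pop_var)
assert math.isclose(samp_sd ** 2, samp_var)
```

Treating the same values as a sample always gives a slightly larger spread than treating them as a full population, because the divisor is smaller.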

Range, Interquartile Range, and Outliers

  • Range: Difference between maximum and minimum values.

  • Interquartile Range (IQR): $\mathrm{IQR} = Q_3 - Q_1$

  • Outliers: Data points below $Q_1 - 1.5 \times \mathrm{IQR}$ or above $Q_3 + 1.5 \times \mathrm{IQR}$

Five Number Summary

  • Minimum

  • First Quartile ($Q_1$)

  • Median ($Q_2$)

  • Third Quartile ($Q_3$)

  • Maximum
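
The five number summary and the 1.5 × IQR outlier rule can be sketched as follows. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" convention (the overall median is excluded from both halves when the number of values is odd), so other tools may report slightly different quartiles. The function names and data are hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                               # values below the median
    upper = data[half + 1:] if n % 2 else data[half:]  # values above the median
    return (data[0], statistics.median(lower), statistics.median(data),
            statistics.median(upper), data[-1])

def outlier_fences(q1, q3):
    """Return the lower and upper cutoffs of the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical data set with one unusually large value.
data = [1, 3, 4, 5, 6, 7, 8, 9, 30]

minimum, q1, median, q3, maximum = five_number_summary(data)  # 1, 3.5, 6, 8.5, 30
low, high = outlier_fences(q1, q3)                            # -4.0, 16.0
outliers = [x for x in data if x < low or x > high]           # [30]
```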

Frequency Distributions and Data Visualization

Frequency Tables

Frequency tables summarize data by showing the number of occurrences for each category or interval.

Example Table: Frequency Distribution of Political Affiliation

Political Affiliation    Frequency
D                        5
R                        4
I                        3

Additional info: Frequencies inferred from visible data.
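
A frequency table like this one can be built by counting labels. In the sketch below, the list of raw responses is hypothetical: it is constructed only so that its counts match the table (5 D, 4 R, 3 I), since the notes do not show the underlying data.

```python
from collections import Counter

# Hypothetical raw responses that reproduce the counts in the table.
responses = ["D"] * 5 + ["R"] * 4 + ["I"] * 3

freq = Counter(responses)   # frequency of each category
total = sum(freq.values())  # 12 responses in all

# Relative frequency = frequency / total number of observations.
rel_freq = {category: count / total for category, count in freq.items()}
```

Relative frequencies always sum to 1, which makes them the natural input for a pie chart.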

Histograms, Bar Graphs, Pie Charts, and Dot Plots

  • Histogram: Displays frequency of data within intervals (useful for continuous data).

  • Bar Graph: Compares frequencies of categorical data.

  • Pie Chart: Shows proportions of categories as slices of a circle.

  • Dot Plot: Each data point is shown as a dot above its value on a number line.

Example Table: Pie Chart Data for Road Construction Funding

Response            Relative Frequency    Frequency
New Tolls           51%                   not specified
No New Roads        34%                   not specified
Increase Gas Tax    15%                   not specified

Additional info: Frequencies are not specified.

Descriptive Statistics from Grouped Data

Frequency Distribution and Histogram

Grouped data can be summarized using class intervals, midpoints, and frequencies.

  • Relative Frequency: Proportion of total observations in each class.

  • Cumulative Frequency: Running total of frequencies up to each class.

Example Table: Births by Age of Mother

Age of Mother (yrs)    Midpoint    Births (Frequency)    Relative Frequency    Cumulative Frequency
10-14.99               12.5        10                    0.0030                10
15-19.99               17.5        400                   0.1190                410
20-24.99               22.5        1050                  0.3125                1460
25-29.99               27.5        1200                  0.3571                2660
30-34.99               32.5        500                   0.1488                3160
35-39.99               37.5        100                   0.0298                3260
40-44.99               42.5        100                   0.0298                3360

Additional info: Relative frequencies are computed as frequency ÷ 3360 (the total number of births) and rounded to four decimal places.
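
The relative and cumulative columns follow directly from the frequencies: each relative frequency is the class frequency divided by the total (3360 births), and each cumulative frequency is a running sum. The sketch below uses the midpoints and frequencies from the table. The grouped mean at the end is an extra step not shown in the notes; it assumes every observation in a class sits at the class midpoint, so it is only an approximation.

```python
from itertools import accumulate

# Midpoints and frequencies taken from the births table.
midpoints = [12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5]
births = [10, 400, 1050, 1200, 500, 100, 100]

total = sum(births)                     # 3360 births in all
relative = [f / total for f in births]  # proportion of births in each class
cumulative = list(accumulate(births))   # running total up to each class

# Approximate mean age, treating each birth as occurring at its class midpoint.
grouped_mean = sum(m * f for m, f in zip(midpoints, births)) / total  # about 26.19
```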

Boxplots and Outlier Detection

Boxplot Construction

Boxplots visually display the five number summary and help identify outliers.

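  • Draw a box from $Q_1$ to $Q_3$ with a line at the median.

  • Whiskers extend to the smallest and largest values that lie within 1.5 × IQR of the box (no lower than $Q_1 - 1.5 \times \mathrm{IQR}$, no higher than $Q_3 + 1.5 \times \mathrm{IQR}$).

  • Points beyond the whiskers are plotted individually as outliers.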
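
Where the whiskers end can be worked out without drawing anything. The sketch below is an illustration only: the function name, the data, and the quartile values passed in (3.5 and 8.5) are hypothetical, and a plotting library may use a different quartile convention.

```python
def whisker_ends(values, q1, q3):
    """Return (lower whisker, upper whisker, outliers) for a boxplot."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data values still inside the fences.
    inside = [x for x in values if low_fence <= x <= high_fence]
    outliers = [x for x in values if x < low_fence or x > high_fence]
    return min(inside), max(inside), outliers

# Hypothetical data with quartiles Q1 = 3.5 and Q3 = 8.5 (IQR = 5).
data = [1, 3, 4, 5, 6, 7, 8, 9, 30]
lower, upper, flagged = whisker_ends(data, q1=3.5, q3=8.5)  # 1, 9, [30]
```

Note that the upper whisker stops at 9, the largest actual data value inside the cutoff of 16, not at the cutoff itself.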

Z-Scores and Standardization

Calculating Z-Scores

A z-score indicates how many standard deviations a value is from the mean.

Formula: $z = \frac{x - \mu}{\sigma}$

Example: For a female with weight 160 lbs, mean 155 lbs, standard deviation 50 lbs: $z = \frac{160 - 155}{50} = 0.1$
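
The same calculation can be written as a small helper. The function name and the input check are assumptions added for illustration; the numbers are the ones from the example above.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Weight 160 lbs, mean 155 lbs, standard deviation 50 lbs.
z = z_score(160, 155, 50)  # (160 - 155) / 50 = 0.1
```

A z-score of 0.1 means the weight is only one tenth of a standard deviation above the mean, so it is a very typical value.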

Data Analysis Examples

Descriptive Statistics for Egg Weights

Statistic             Value
Mean                  1.615
Median                1.6
Mode                  1.6
Standard Deviation    0.06514
Sample Variance       0.004245
Range                 0.27
Minimum               1.47
Maximum               1.74

Additional info: Skewness and kurtosis values indicate the shape of the distribution.

Summary

  • Classify variables and understand measurement scales.

  • Distinguish between parameters and statistics.

  • Apply appropriate sampling methods.

  • Calculate and interpret mean, median, mode, standard deviation, variance, range, IQR, and z-scores.

  • Construct and interpret frequency tables, histograms, bar graphs, pie charts, dot plots, and boxplots.

  • Analyze grouped data and use descriptive statistics for data interpretation.
