Foundations of Descriptive Statistics and Probability: Study Guide

Descriptive and Inferential Statistics

Overview of Statistics

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions. It is broadly divided into two branches: descriptive and inferential statistics.

  • Descriptive Statistics: Methods for summarizing and organizing data using tables, graphs, and summary measures (e.g., mean, median, mode).

  • Inferential Statistics: Techniques for making predictions or inferences about a population based on a sample of data.

  • Parameter vs. Statistic: A parameter is a numerical summary of a population, while a statistic is a numerical summary of a sample.

  • Variables: Characteristics or properties that can take on different values. Variables can be categorical (qualitative) or quantitative (numerical).

Example: The average height of all students in a university is a parameter; the average height of a sample of 50 students is a statistic.

Organizing and Displaying Data

Frequency Tables

Frequency tables are used to organize data into categories or intervals, showing how often each value occurs.

  • For Categorical Data: List each category and the frequency (count) of observations in each.

  • For Quantitative Data: Group data into intervals (classes) and record the frequency for each interval.
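The two kinds of frequency table described above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function names, the data, and the choice of equal-width classes that include the lower bound and exclude the upper bound are all assumptions made for the example.

```python
from collections import Counter


def categorical_frequencies(values):
    """Count how often each category occurs."""
    return dict(Counter(values))


def binned_frequencies(values, start, width, n_bins):
    """Count values in classes [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        k = int((v - start) // width)  # index of the class containing v
        if 0 <= k < n_bins:            # values outside all classes are ignored
            counts[k] += 1
    return counts


# Made-up data, for illustration only.
majors = ["Biology", "Math", "Biology", "Art", "Math", "Biology"]
scores = [61, 67, 70, 74, 78, 82, 85, 89, 93]

major_table = categorical_frequencies(majors)
score_table = binned_frequencies(scores, start=60, width=10, n_bins=4)

print(major_table)  # {'Biology': 3, 'Math': 2, 'Art': 1}
print(score_table)  # [2, 3, 3, 1] for classes 60-<70, 70-<80, 80-<90, 90-<100
```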

Graphical Representations

  • Bar Chart: Used for categorical data; displays frequencies of categories as bars.

  • Pareto Chart: A bar chart where categories are ordered by frequency from highest to lowest.

  • Pie Chart: Shows the proportion of each category as a slice of a circle.

  • Histogram: Used for quantitative data; displays frequencies of data intervals as adjacent bars.

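  • Stem-and-Leaf Plot: Splits each quantitative value into a stem (leading digits) and a leaf (final digit), so the individual values stay visible while the rows sketch the shape of the distribution.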

  • Boxplot: Visualizes the five-number summary (minimum, Q1, median, Q3, maximum) and identifies outliers.

Example: A histogram can show the distribution of test scores in a class, while a pie chart can show the proportion of students in different majors.

Types of Distributions

  • Symmetric: Data is evenly distributed around the center.

  • Skewed Right (Positively Skewed): Tail extends to the right; the mean is typically greater than the median.

  • Skewed Left (Negatively Skewed): Tail extends to the left; the mean is typically less than the median.

  • Uniform: All values have approximately the same frequency.

  • Bimodal: Two distinct peaks in the distribution.
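The mean-versus-median comparison for skewed data can be checked numerically. The sketch below uses made-up data sets and a hypothetical helper, `describe_skew`; the comparison is a rule of thumb for judging shape, not a formal test of skewness.

```python
import statistics


def describe_skew(values):
    """Compare mean and median as a rough indicator of skew direction."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "mean > median (suggests right skew)"
    if mean < median:
        return "mean < median (suggests left skew)"
    return "mean == median (consistent with symmetry)"


# Made-up data, for illustration only.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]          # long right tail
left_skewed = [-20, -4, -3, -3, -3, -2, -2, -1]   # mirror image, long left tail
symmetric = [1, 2, 3, 4, 5]

print(describe_skew(right_skewed))  # mean 4.75, median 3.0
print(describe_skew(left_skewed))   # mean -4.75, median -3.0
print(describe_skew(symmetric))     # mean 3, median 3
```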

Measures of Central Tendency

Definitions and Calculations

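  • Mean: The arithmetic average of a data set: the sum of the values divided by the number of values.

  • Median: The middle value when the data are ordered; with an even number of values, the average of the two middle values.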

  • Mode: The value(s) that occur most frequently.

Example: For the data set {2, 4, 4, 5, 7}, the mean is 4.4, the median is 4, and the mode is 4.
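The example above can be reproduced with Python's standard `statistics` module. This is a small sketch; the variable names are chosen here for illustration.

```python
import statistics

data = [2, 4, 4, 5, 7]

mean = statistics.mean(data)         # sum of the values divided by their count
median = statistics.median(data)     # middle value of the sorted data
modes = statistics.multimode(data)   # list of every most-frequent value

print(mean)    # 4.4
print(median)  # 4
print(modes)   # [4]
```

`multimode` returns a list because a data set can have more than one mode, which matches the "value(s)" wording in the definition.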

Measures of Variability

Definitions and Calculations

  • Range: Difference between the highest and lowest values.

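  • Standard Deviation (s): Measures the typical distance of data points from the mean $\bar{x}$; for a sample of size $n$, $s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$.

  • Variance: The square of the standard deviation, $s^2$.

  • Interquartile Range (IQR): The range of the middle 50% of the data, $\text{IQR} = Q_3 - Q_1$.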

Five-Number Summary: Minimum, Q1, Median, Q3, Maximum.
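These measures can be computed for the data set {2, 4, 4, 5, 7} used earlier. The sketch below makes two assumptions to be aware of: it uses the sample formulas (dividing by n − 1), and it finds quartiles as the medians of the lower and upper halves, leaving out the overall median when n is odd. Textbooks and software use several quartile conventions, so Q1 and Q3 may differ slightly elsewhere. The helper `five_number_summary` is a hypothetical name.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so both halves have equal size.
    upper = data[half + 1:] if n % 2 else data[half:]
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )


data = [2, 4, 4, 5, 7]

data_range = max(data) - min(data)
variance = statistics.variance(data)   # sample variance, divides by n - 1
std_dev = statistics.stdev(data)       # square root of the sample variance
minimum, q1, median, q3, maximum = five_number_summary(data)
iqr = q3 - q1

print(data_range)          # 5
print(round(variance, 2))  # 3.3
print(round(std_dev, 2))   # 1.82
print(iqr)                 # 3.0
```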

Z-Score

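  • Z-Score: Indicates how many standard deviations a value is from the mean: $z = \frac{x - \bar{x}}{s}$.

Example: If a test score is 85, the mean is 80, and the standard deviation is 5, then $z = \frac{85 - 80}{5} = 1$, so the score is one standard deviation above the mean.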
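As a quick check, the z-score is the value minus the mean, divided by the standard deviation. The function name `z_score` below is chosen for illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


z = z_score(85, mean=80, std_dev=5)
print(z)  # 1.0
```

A negative result would mean the value lies below the mean; for instance, a score of 70 with the same mean and standard deviation gives z = −2.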

Probability

Basic Probability Concepts

  • Sample Space (S): The set of all possible outcomes.

  • Event: A subset of the sample space.

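  • Probability of an Event (A): When all outcomes in S are equally likely, $P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } S}$.

  • Complement: The event that A does not occur, denoted $A^c$, with $P(A^c) = 1 - P(A)$.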

  • Mutually Exclusive Events: Events that cannot occur at the same time.

  • Independent Events: The occurrence of one event does not affect the probability of the other.

Rules of Probability

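  • Addition Rule (for mutually exclusive events): $P(A \text{ or } B) = P(A) + P(B)$

  • General Addition Rule: $P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$

  • Multiplication Rule (for independent events): $P(A \text{ and } B) = P(A) \cdot P(B)$

  • Conditional Probability: $P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$, provided $P(B) > 0$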
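The rules can be verified on a small sample space. The sketch below assumes a single roll of a fair six-sided die with equally likely outcomes; the events A, B, and C are chosen for illustration and do not come from the source.

```python
from fractions import Fraction

S = frozenset(range(1, 7))  # sample space for one roll of a fair die


def prob(event):
    """P(event) = outcomes in the event / outcomes in S (equally likely)."""
    return Fraction(len(event & S), len(S))


A = {2, 4, 6}  # roll is even
B = {5, 6}     # roll is greater than 4
C = {1}        # roll is 1 (cannot happen together with A)

p_not_a = 1 - prob(A)                             # complement rule
p_a_or_c = prob(A) + prob(C)                      # addition rule, A and C mutually exclusive
p_a_or_b = prob(A) + prob(B) - prob(A & B)        # general addition rule
p_a_given_b = prob(A & B) / prob(B)               # conditional probability
independent = prob(A & B) == prob(A) * prob(B)    # multiplication rule holds

print(p_not_a)      # 1/2
print(p_a_or_c)     # 2/3
print(p_a_or_b)     # 2/3
print(p_a_given_b)  # 1/2
print(independent)  # True
```

Here P(A | B) equals P(A), which is another way of seeing that A and B are independent for this die.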

Contingency Tables

Contingency tables display the frequency distribution of variables and are used to calculate probabilities involving two or more categorical variables.

           Category 1   Category 2   Total
Group A    a            b            a+b
Group B    c            d            c+d
Total      a+c          b+d          n

Additional info: Entries a, b, c, d represent frequencies in each category.

Tree Diagrams

Tree diagrams are used to visualize all possible outcomes of a sequence of events and their associated probabilities.
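A tree diagram can be enumerated in code by multiplying the probabilities along each path. The sketch below assumes a made-up scenario, drawing two marbles without replacement from a bag of 3 red and 2 blue; the function name `two_draw_tree` is hypothetical.

```python
from fractions import Fraction


def two_draw_tree(red, blue):
    """Path probabilities for two draws without replacement."""
    total = red + blue
    paths = {}
    for first, n_first in (("R", red), ("B", blue)):
        p_first = Fraction(n_first, total)       # first-level branch
        remaining = {"R": red, "B": blue}
        remaining[first] -= 1                    # one marble of that colour is gone
        for second in ("R", "B"):
            p_second = Fraction(remaining[second], total - 1)  # second-level branch
            paths[first + second] = p_first * p_second         # multiply along the path
    return paths


paths = two_draw_tree(red=3, blue=2)
for outcome, p in paths.items():
    print(outcome, p)  # RR 3/10, RB 3/10, BR 3/10, BB 1/10

p_exactly_one_red = paths["RB"] + paths["BR"]  # add the paths that make up the event
print(p_exactly_one_red)  # 3/5
```

The path probabilities sum to 1, which is a useful check on any tree diagram.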

Practice Problems and Applications

  • Identify whether a variable is categorical or quantitative, and if quantitative, whether it is discrete or continuous.

  • Construct and interpret frequency tables and various graphs (bar chart, histogram, pie chart, etc.).

  • Calculate measures of central tendency and variability for given data sets.

  • Interpret and compare distributions (symmetry, skewness, modality).

  • Apply probability rules to solve problems involving sample spaces, events, and conditional probability.

  • Use contingency tables and tree diagrams to solve multi-step probability problems.

Example Table: Frequency Distribution

Interval     Frequency   Relative Frequency   Percentage   Cumulative Relative Frequency
78 to <79    2           0.04                 4%           0.04
79 to <80    5           0.10                 10%          0.14
80 to <81    8           0.16                 16%          0.30
81 to <82    10          0.20                 20%          0.50
82 to <83    15          0.30                 30%          0.80
83 to <84    10          0.20                 20%          1.00

Additional info: Table values are inferred for illustration; actual data may differ.
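The relative and cumulative columns follow directly from the frequencies: each relative frequency is the class frequency divided by the total count, and each cumulative value is the running total divided by the total count. A minimal sketch, with a hypothetical helper name and rounding to two decimals assumed to match the table:

```python
def distribution_columns(frequencies):
    """Return (frequency, relative frequency, cumulative relative frequency) rows."""
    n = sum(frequencies)
    running = 0
    rows = []
    for f in frequencies:
        running += f  # running count keeps the cumulative column exact
        rows.append((f, round(f / n, 2), round(running / n, 2)))
    return rows


intervals = ["78 to <79", "79 to <80", "80 to <81", "81 to <82", "82 to <83", "83 to <84"]
frequencies = [2, 5, 8, 10, 15, 10]  # values from the table above

rows = distribution_columns(frequencies)
for label, (f, rel, cum) in zip(intervals, rows):
    print(f"{label:<10} {f:>3} {rel:>5.2f} {rel:>5.0%} {cum:>5.2f}")
```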

Example Table: Contingency Table for Cookies Sold

Rank       Lemonade   Thin Mints   Peanut Butter   Total
Daisies    142        126          130             398
Brownies   120        135          140             395
Cadettes   127        120          125             372
Total      389        381          395             1165

Additional info: Table values are inferred for illustration; actual data may differ.
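Marginal, joint, and conditional probabilities can be read off this table by choosing the right numerator and denominator. The sketch below uses the counts above; the particular events are chosen for illustration, and each probability treats one sold box picked at random as the experiment, which is an assumption of this example.

```python
from fractions import Fraction

sales = {
    "Daisies":  {"Lemonade": 142, "Thin Mints": 126, "Peanut Butter": 130},
    "Brownies": {"Lemonade": 120, "Thin Mints": 135, "Peanut Butter": 140},
    "Cadettes": {"Lemonade": 127, "Thin Mints": 120, "Peanut Butter": 125},
}

# Row, column, and grand totals recomputed from the cell counts.
row_totals = {rank: sum(row.values()) for rank, row in sales.items()}
col_totals = {}
for row in sales.values():
    for cookie, count in row.items():
        col_totals[cookie] = col_totals.get(cookie, 0) + count
n = sum(row_totals.values())

# Marginal: row total over grand total.
p_daisies = Fraction(row_totals["Daisies"], n)
# Joint: single cell over grand total.
p_brownies_and_thin_mints = Fraction(sales["Brownies"]["Thin Mints"], n)
# Conditional: single cell over the total of the given row.
p_thin_mints_given_brownies = Fraction(sales["Brownies"]["Thin Mints"], row_totals["Brownies"])

print(p_daisies)                    # 398/1165
print(p_brownies_and_thin_mints)    # 27/233
print(p_thin_mints_given_brownies)  # 27/79
```

The conditional probability uses the Brownies row total (395) as its denominator rather than the grand total, because the condition restricts attention to that row.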

Summary

  • Understand the distinction between descriptive and inferential statistics.

  • Be able to classify variables and data types.

  • Construct and interpret tables and graphs for data visualization.

  • Calculate and interpret measures of central tendency and variability.

  • Apply probability rules and solve problems using contingency tables and tree diagrams.
