Statistics Review: Probability Distributions, Binomial & Normal Distributions, Proportions, Hypothesis Testing, and Chi-Square Tests
Study Guide - Smart Notes
Probability Distributions
Theoretical and Frequentist Distributions
Probability distributions describe the relative frequencies of outcomes of a random event. They are essential models for random experiments in statistics.
All possible outcomes of a random experiment must be listed.
Probability of each outcome must be specified.
Sum of all probabilities must equal 1 (unity).
Example: Probability distribution for rolling a fair die:
| Number | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Probability | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |
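The three requirements above can be checked mechanically. The following is a minimal Python sketch for the fair-die table; `is_valid_distribution` is a hypothetical helper name introduced here for illustration, and exact fractions are used so the sum is not affected by floating-point rounding.

```python
from fractions import Fraction

def is_valid_distribution(dist):
    """Return True if every probability lies in [0, 1] and they sum to exactly 1."""
    probs = list(dist.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

# Fair six-sided die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(is_valid_distribution(die))  # True
```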
Random Variables
Discrete Random Variables
Discrete random variables take on countable values.
Examples: Number of siblings, number of contacts stored on a phone.
Continuous Random Variables
Continuous random variables can take any value within a range and are not countable.
Examples: Height of basketball players, weight of an object, duration of phone calls.
Normal Distribution and Z-Scores
Finding Probabilities Using the Standard Normal Distribution
To find probabilities for normally distributed variables:
Convert values to z-scores using the formula: $z = \frac{x - \mu}{\sigma}$, where $\mu$ is the mean and $\sigma$ is the standard deviation.
Look up the corresponding area (probability) in the standard normal table (z-table).
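Example: For a weight $X$ in grams with mean $\mu$ and standard deviation $\sigma$, the resulting probability is 0.1056, which as a one-tail area corresponds to $|z| = 1.25$.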
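The conversion and table lookup can be sketched in Python with the standard library's `statistics.NormalDist`. The mean, standard deviation, and observed value below are hypothetical, chosen only so that the z-score is -1.25; they are not the figures from the example in these notes.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standardize x: the number of standard deviations it lies from the mean."""
    return (x - mu) / sigma

# Hypothetical values (assumed for illustration, not taken from the notes).
mu, sigma, x = 500.0, 8.0, 490.0

z = z_score(x, mu, sigma)
# Area to the left of z under the standard normal curve (what a z-table gives).
left_tail = NormalDist().cdf(z)

print(z, round(left_tail, 4))  # -1.25 0.1056
```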
Using Technology (TI Calculators)
Compute z
Press 2ND VARS
Scroll to normalcdf(
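Enter limits and parameters: normalcdf(lower, upper, μ, σ)
Output is the probability (the area under the curve between the limits); in a hypothesis test this area is the p-value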
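For readers without a TI calculator, a rough Python analogue of the calculator's normalcdf( function might look like the sketch below. The Python function name and argument order are chosen here to mirror the calculator and are not part of any Python library; only `statistics.NormalDist` is a real standard-library class.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the normal curve with mean mu and sd sigma between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Area within one standard deviation of the mean on the standard normal curve.
print(round(normalcdf(-1, 1), 4))  # 0.6827

# An open-ended right tail: use infinity as the upper limit.
print(round(normalcdf(1.25, float("inf")), 4))  # 0.1056
```

Passing `mu` and `sigma` directly makes the separate z-score step unnecessary, since the standardization happens inside the distribution object.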
Binomial Distribution & Discrete Random Variables
Conditions for Binomial Distribution
Fixed number of trials ($n$)
Each trial has only two possible outcomes (success/failure)
Trials are independent
Probability of success ($p$) is constant for each trial
Binomial Probability Formula
The probability of getting $x$ successes in $n$ independent trials: $P(x) = \binom{n}{x} p^x (1 - p)^{n - x}$
Use binomial formula, binomial tables, or technology (e.g., binompdf(n, p, x) on calculators).
Notation: $n$ = number of trials, $p$ = probability of success on each trial, $x$ = number of successes.
| Binomial Syntax: binompdf(n, p, x) | x | P(x) |
|---|---|---|
| For x=0; binompdf(4, 0.3, 0) | 0 | 0.240 |
| For x=1; binompdf(4, 0.3, 1) | 1 | 0.412 |
| For x=2; binompdf(4, 0.3, 2) | 2 | 0.265 |
| For x=3; binompdf(4, 0.3, 3) | 3 | 0.076 |
| For x=4; binompdf(4, 0.3, 4) | 4 | 0.008 |
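The table can be reproduced from the binomial formula directly. Below is a minimal Python sketch; the function is named `binompdf` here only to mirror the calculator syntax, and it relies on `math.comb` from the standard library for the binomial coefficient.

```python
from math import comb

def binompdf(n, p, x):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Rebuild the table for n = 4 trials and p = 0.3.
for x in range(5):
    print(x, f"{binompdf(4, 0.3, x):.3f}")
```

Because the five outcomes cover every possibility, the printed probabilities add up to 1 (up to rounding).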
Sampling Distributions & Confidence Intervals: Proportion
Sample Proportions: Conditions
Sampling is random and independent
Sample size is large enough: $np \ge 10$ and $n(1 - p) \ge 10$
Population is at least 10 times larger than the sample (if sampling without replacement)
Sample Proportions: Conclusions
Sampling distribution of $\hat{p}$ is approximately normal
Mean: $p$
Standard deviation: $SE = \sqrt{\frac{p(1 - p)}{n}}$
Estimated standard deviation: $SE_{est} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
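As a rough illustration, the mean and standard deviation of the sampling distribution of the sample proportion can be computed as follows. The function names are hypothetical, the success/failure threshold of 10 is an assumption (some textbooks use a different cutoff), and the values p = 0.34 and n = 200 are used purely as sample inputs.

```python
from math import sqrt

def sampling_distribution(p, n, min_count=10):
    """Mean and standard deviation of the sample proportion for population proportion p.

    min_count is the assumed large-sample threshold for n*p and n*(1 - p).
    """
    if n * p < min_count or n * (1 - p) < min_count:
        raise ValueError("sample too small for the normal approximation")
    return p, sqrt(p * (1 - p) / n)

def estimated_standard_error(p_hat, n):
    """Same formula with the sample proportion standing in for the unknown p."""
    return sqrt(p_hat * (1 - p_hat) / n)

mean, sd = sampling_distribution(0.34, 200)
print(mean, round(sd, 4))  # 0.34 0.0335
```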
Hypothesis Testing for One Sample: Proportions
Steps for Hypothesis Testing
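Hypothesize: State $H_0$ and $H_a$ (e.g., $H_0: p = p_0$, $H_a: p > p_0$)
Prepare (Check CLT Conditions): Choose significance level $\alpha$, check randomness, sample size, and population size
Compute to Compare: Compute the test statistic $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$, then find the p-value
Interpret and Conclude: Compare the p-value with $\alpha$; if p-value $\le \alpha$, reject $H_0$
Example: If 40% of a sample of 200 own dogs, test whether this is higher than 34% ($H_0: p = 0.34$, $H_a: p > 0.34$). Here $z \approx 1.79$ and p-value = 0.0367, so reject $H_0$ at any $\alpha \ge 0.0367$, such as $\alpha = 0.05$.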
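The dog-ownership example can be worked through in code. This is a sketch of a right-tailed one-proportion z-test under the usual normal approximation; the function name `one_prop_z_test` is hypothetical, and the standard error uses the hypothesized proportion 0.34, as is standard for this test.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(p_hat, p0, n):
    """Right-tailed test of H0: p = p0 against Ha: p > p0.

    Returns the z statistic and the p-value (area to the right of z).
    """
    se = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value

z, p_value = one_prop_z_test(p_hat=0.40, p0=0.34, n=200)
print(round(z, 2))  # 1.79
print(p_value < 0.05)  # True
```

Computed without rounding z, the p-value comes out at about 0.0366; the 0.0367 in the example matches a table lookup at z = 1.79.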
Type I and Type II Errors
Type I Error: Rejecting $H_0$ when it is true
Type II Error: Not rejecting $H_0$ when it is false
Confidence Intervals (CI) for Proportions
Constructing a Confidence Interval
Check CLT conditions (random, large sample, large population)
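Calculate estimated standard deviation: $SE_{est} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$
Find $z^*$ for the desired confidence level (e.g., $z^* \approx 1.645$ for 90%)
Margin of error: $m = z^* \cdot SE_{est}$
CI: $\hat{p} \pm m$
Example: For a sample with $\hat{p} = 0.35$ (the midpoint of the interval), the 90% CI is (0.3222, 0.3778), a margin of error of 0.0278.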
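The construction steps can be sketched as a short Python function. The sample proportion 0.35 is the midpoint of the interval quoted in the example; the sample size n = 800 is an assumption made here for illustration, since the notes do not state it, and `proportion_ci` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.90):
    """Confidence interval for a proportion using the normal approximation."""
    # Critical value z*: leaves (1 - confidence)/2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se_est = sqrt(p_hat * (1 - p_hat) / n)  # estimated standard deviation
    margin = z_star * se_est                # margin of error
    return p_hat - margin, p_hat + margin

# n = 800 is an assumed sample size, not taken from the notes.
low, high = proportion_ci(0.35, 800, confidence=0.90)
print(round(low, 3), round(high, 3))  # 0.322 0.378
```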
Sample Size for Desired Margin of Error
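To achieve a margin of error $m$ at confidence level $C$ (with critical value $z^*$): $n = \hat{p}(1 - \hat{p}) \left(\frac{z^*}{m}\right)^2$, rounded up to the next whole number. When no estimate of $\hat{p}$ is available, $\hat{p} = 0.5$ gives the largest (most conservative) sample size.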
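A common form of this calculation solves the margin-of-error formula for n, giving n = p(1 - p)(z*/m)^2. The sketch below assumes that form; `required_sample_size` and its `p_guess` parameter are hypothetical names, and the default of 0.5 is the conservative choice when no prior estimate of the proportion exists.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(margin, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most `margin` at the given confidence."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = p_guess * (1 - p_guess) * (z_star / margin) ** 2
    return ceil(n)  # always round up: a fractional respondent is not possible

# A 3-percentage-point margin at 95% confidence with the conservative p = 0.5.
print(required_sample_size(0.03, 0.95))  # 1068
```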
Sampling Distributions & Confidence Intervals: Mean
Population and Sample Variances
Sample Variance: $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Sample Standard Deviation: $s = \sqrt{s^2}$
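To make the n - 1 divisor concrete, here is a small Python sketch that computes the sample variance by hand and compares it with the standard library's `statistics.variance`. The data values are hypothetical.

```python
from math import isclose, sqrt
from statistics import variance

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample, mean 5

s2 = sample_variance(data)
s = sqrt(s2)

print(round(s2, 3), round(s, 3))      # 4.571 2.138
print(isclose(s2, variance(data)))    # True
```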
Sample Mean: Conditions
Random sample
Normality (population is normal or the sample is large, usually $n \ge 30$; some textbooks use $n \ge 25$)
Population at least 10 times larger than sample
Hypothesis Testing for One Sample: Means
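One-Tailed vs. Two-Tailed Tests
| Two-Tailed | One-Tailed (Left) | One-Tailed (Right) |
|---|---|---|
| $H_0: \mu = \mu_0$ | $H_0: \mu = \mu_0$ | $H_0: \mu = \mu_0$ |
| $H_a: \mu \ne \mu_0$ | $H_a: \mu < \mu_0$ | $H_a: \mu > \mu_0$ |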
Steps for Hypothesis Testing (Means)
Formulate hypotheses ($H_0$, $H_a$)
Check CLT conditions (random, normality, large population)
Compute test statistic: $t = \frac{\bar{x} - \mu_0}{SE_{est}}$, where $SE_{est} = \frac{s}{\sqrt{n}}$
Find p-value and compare with $\alpha$
Interpret and conclude
Example: Testing if mean years of experience among nurses has increased (p-value = 0.0187, so reject $H_0$ at any $\alpha \ge 0.0187$, such as $\alpha = 0.05$)
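The steps for a one-sample t-test can be sketched in plain Python. The summary statistics below are hypothetical and are not the nurse data from the example, which the notes do not give. Because the standard library has no t-distribution, the tail area is approximated here by integrating the Student t density with Simpson's rule; this is a stand-in for a t-table or calculator, and all function names are illustrative.

```python
import math

def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + u * u / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area between 0 and |t|
    upper = 0.5 - area            # area to the right of |t|
    return upper if t >= 0 else 1 - upper

def one_sample_t(x_bar, mu0, s, n):
    """t statistic and degrees of freedom for H0: mu = mu0."""
    se_est = s / math.sqrt(n)
    return (x_bar - mu0) / se_est, n - 1

# Hypothetical summary statistics (assumed for illustration).
t, df = one_sample_t(x_bar=52.0, mu0=50.0, s=4.0, n=16)
p_value = t_upper_tail(t, df)  # right-tailed test, Ha: mu > mu0

print(t, df)  # 2.0 15
print(p_value < 0.05)
```

For a left-tailed test the p-value would be `1 - t_upper_tail(t, df)`, and for a two-tailed test it would be `2 * t_upper_tail(abs(t), df)`.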
Chi-Square Tests & Goodness of Fit
Chi-Square Statistic
Measures how much observed frequencies differ from expected frequencies.
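Formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$, where $O$ is an observed count and $E$ is the corresponding expected count.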
Example Table: Observed vs. Expected
| | Girls | Boys | Total |
|---|---|---|---|
| Gym (Observed) | 20 | 50 | 70 |
| No Gym (Observed) | 30 | 200 | 230 |
| Total | 50 | 250 | 300 |
| | Girls | Boys |
|---|---|---|
| Gym (Expected) | 11.67 | 58.33 |
| No Gym (Expected) | 38.33 | 191.67 |
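Expected Frequency: $E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$ (e.g., Gym and Girls: $\frac{70 \times 50}{300} = 11.67$)
Calculated $\chi^2 \approx 9.3$ from the observed and expected counts in the tables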
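The expected counts and the chi-square statistic for the gym table can be recomputed with a few lines of Python. This is a sketch with hypothetical function names. The p-value line relies on the fact that a chi-square variable with 1 degree of freedom (the case for a 2x2 table) is the square of a standard normal variable; it would not apply to larger tables.

```python
from math import sqrt
from statistics import NormalDist

# Observed counts from the table: rows are Gym / No Gym, columns are Girls / Boys.
observed = [[20, 50],
            [30, 200]]

def expected_counts(obs):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in obs]
    col_totals = [sum(col) for col in zip(*obs)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def chi_square(obs):
    """Sum over all cells of (observed - expected)^2 / expected."""
    exp = expected_counts(obs)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(obs, exp)
               for o, e in zip(obs_row, exp_row))

expected = expected_counts(observed)
chi2 = chi_square(observed)
print([[round(e, 2) for e in row] for row in expected])  # [[11.67, 58.33], [38.33, 191.67]]
print(round(chi2, 2))  # 9.32

# Valid for df = 1 only: P(chi-square > chi2) = 2 * P(Z > sqrt(chi2)).
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
print(p_value < 0.01)  # True
```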
Conditions for Chi-Square Tests
Data must be counts for categories
Random sample
Expected cell frequency at least 5
Counts are independent
Additional info:
These notes cover core topics from chapters 6-12 of a typical introductory statistics course, including probability distributions, binomial and normal distributions, sampling distributions, confidence intervals, hypothesis testing, and chi-square tests.