10. Hypothesis Testing for Two Samples

Two Means - Sigma Known Hypothesis Test - Excel

Topic summary

When the population standard deviations (σ₁, σ₂) are known, a hypothesis test for two population means uses the normal distribution and a z-test. The null hypothesis assumes equal means (μ₁ = μ₂), while the alternative states the suspected difference, for example that one mean is less than the other (μ₁ < μ₂). Calculate the z-score using the formula $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$. Use the standard normal distribution to find the p-value and compare it to the significance level (α). Rejecting the null hypothesis indicates sufficient evidence that the means differ as hypothesized.

Two Means - Sigma Known Hypothesis Test - Excel Video Summary

When conducting a hypothesis test for two population means with known population standard deviations, the normal distribution and a z test are used instead of the t distribution. This approach applies when the population standard deviations, denoted as σ₁ and σ₂, are known values. Unlike the t-test function in Excel, which can directly compute p-values for two means without known standard deviations, the z-test for two means requires manually calculating the test statistic and then finding the p-value using the standard normal distribution.

Consider a scenario where a manufacturing company wants to determine if machine A produces fewer widgets per batch on average than machine B. Given sample data from 30 batches for each machine, and known population standard deviations σ₁ = 9.73 and σ₂ = 5.91, a hypothesis test at a significance level α = 0.05 can be performed.

The null hypothesis (H₀) states that the two population means are equal: $H_0: \mu_1 = \mu_2$, where $\mu_1$ and $\mu_2$ represent the average widgets produced by machines A and B, respectively. The alternative hypothesis (H₁) reflects the company's concern that machine A produces less: $H_1: \mu_1 < \mu_2$.

To calculate the z-test statistic, first find the sample means $\bar{x}_1$ and $\bar{x}_2$ using Excel's AVERAGE function. For example, if $\bar{x}_1 = 42.43$ and $\bar{x}_2 = 45.97$, the numerator of the z formula is the difference between these means:

\[\bar{x}_1 - \bar{x}_2 = 42.43 - 45.97 = -3.54\]

The denominator is the standard error of the difference between the means, calculated using the known population standard deviations and sample sizes (\(n_1 = n_2 = 30\)):

\[\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{9.73^2}{30} + \frac{5.91^2}{30}} = \sqrt{3.156 + 1.164} = \sqrt{4.32} \approx 2.08\]

Thus, the z score is:

\[z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{-3.54}{2.08} \approx -1.70\]

Since the alternative hypothesis is one-sided (\(\mu_1 < \mu_2\)), the p-value corresponds to the cumulative probability to the left of the z score. Using Excel’s NORM.S.DIST function with the z score and cumulative set to TRUE yields a p-value of approximately 0.044.

Because the p-value of 0.044 is less than the significance level α = 0.05, the null hypothesis is rejected. This provides sufficient evidence to conclude that machine A produces fewer widgets on average per batch than machine B.
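As a cross-check outside Excel, the widget calculation can be sketched in Python. This is a minimal sketch, not part of the original lesson: the function name `two_mean_z` is a hypothetical helper, and `statistics.NormalDist().cdf` from the standard library is assumed to play the same role as NORM.S.DIST with cumulative set to TRUE.

```python
from math import sqrt
from statistics import NormalDist


def two_mean_z(xbar1, xbar2, sigma1, sigma2, n1, n2):
    """z statistic for H0: mu1 = mu2 when both population sigmas are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2) / standard_error


# Summary values quoted in the widget example (machine A vs. machine B).
z_widgets = two_mean_z(42.43, 45.97, 9.73, 5.91, 30, 30)

# Left-tailed test (H1: mu1 < mu2): p-value is the area to the left of z.
p_widgets = NormalDist().cdf(z_widgets)

alpha_widgets = 0.05
print(f"z = {z_widgets:.2f}, p = {p_widgets:.3f}, reject H0: {p_widgets < alpha_widgets}")
```

Splitting the standard error into its own variable mirrors the advice to compute the formula in smaller pieces.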

This method highlights the importance of correctly identifying when to use a z test versus a t test based on knowledge of population standard deviations. It also demonstrates how breaking down the z test formula into smaller components can reduce calculation errors and improve clarity. Excel functions such as AVERAGE and NORM.S.DIST facilitate efficient computation of sample means and p-values, making hypothesis testing more accessible and accurate.

Two Means - Sigma Known Hypothesis Test - Excel Example 1 Video Summary

When comparing the average number of volunteers between two locations, such as a local animal shelter and a food pantry, hypothesis testing can determine if one location consistently receives more volunteers than the other. Given population standard deviations (σ₁ = 5.36 for the animal shelter and σ₂ = 4.25 for the food pantry) and equal sample sizes (n₁ = n₂ = 50), a two-sample z-test is appropriate to evaluate the claim that the animal shelter receives more volunteers on average.

The first step is to establish the hypotheses. The null hypothesis (H₀) assumes no difference in means: $μ_1 = μ_2$, where $μ_1$ is the mean number of volunteers at the animal shelter and $μ_2$ is the mean at the food pantry. The alternative hypothesis (Hₐ) reflects the claim that the animal shelter has a higher average number of volunteers: $μ_1 > μ_2$.

Next, calculate the sample means from the collected data. For example, the animal shelter’s sample mean ($\bar{x}_1$) might be 17.1 volunteers per week, while the food pantry’s sample mean ($\bar{x}_2$) is 15.12 volunteers per week. These values form the basis for the test statistic.

The z-test statistic is computed using the formula:

\[z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\]

Here, the numerator is the difference between the sample means, and the denominator is the standard error of the difference, which accounts for the population variances divided by their respective sample sizes. Squaring the population standard deviations and dividing by the sample sizes yields the components inside the square root. For instance, $\frac{\sigma_1^2}{n_1} = \frac{5.36^2}{50} \approx 0.57$ and $\frac{\sigma_2^2}{n_2} = \frac{4.25^2}{50} \approx 0.36$, so the standard error is $\sqrt{0.57 + 0.36} \approx 0.97$. With a numerator of $17.1 - 15.12 = 1.98$, the formula gives a z-score of approximately 2.05.

To interpret this z-score, convert it to a p-value, which represents the probability of observing such a difference (or more extreme) if the null hypothesis were true. Since the alternative hypothesis is one-sided (\(μ_1 > μ_2\)), the p-value corresponds to the right-tail probability of the standard normal distribution. Using the cumulative distribution function (CDF) for the standard normal distribution, the p-value is calculated as:

\[p = 1 - \Phi(z)\]

where \(\Phi(z)\) is the CDF value at the calculated z-score. For $z = 2.05$, the p-value is approximately 0.02.

Because the p-value is below the significance level ($\alpha = 0.1$), that is, $0.02 < 0.1$, the null hypothesis is rejected. This statistical evidence supports the claim that the animal shelter receives more volunteers on average than the food pantry.
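The right-tailed version of the calculation can be checked the same way. The sketch below is illustrative only: the variable names are assumptions, and Python's standard-library `NormalDist().cdf` is used in place of the Excel CDF lookup.

```python
from math import sqrt
from statistics import NormalDist

# Values quoted in the volunteer example (animal shelter vs. food pantry).
xbar_shelter, xbar_pantry = 17.1, 15.12
sigma_shelter, sigma_pantry = 5.36, 4.25
n_shelter = n_pantry = 50
alpha_volunteers = 0.1

# Standard error of the difference between the two sample means.
se_volunteers = sqrt(sigma_shelter**2 / n_shelter + sigma_pantry**2 / n_pantry)
z_volunteers = (xbar_shelter - xbar_pantry) / se_volunteers

# Right-tailed test (Ha: mu1 > mu2): p = 1 - Phi(z).
p_volunteers = 1 - NormalDist().cdf(z_volunteers)
reject_volunteers = p_volunteers < alpha_volunteers

print(f"z = {z_volunteers:.2f}, p = {p_volunteers:.3f}, reject H0: {reject_volunteers}")
```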

Understanding this process highlights the importance of formulating clear hypotheses, calculating appropriate test statistics using known population parameters, and interpreting p-values in the context of significance levels to make informed decisions based on data.

Two Means - Sigma Known Hypothesis Test - Excel Example 2 Video Summary

When comparing the average amounts of milk dispensed at two different locations, a hypothesis test can determine if there is a significant difference between the two means. Given population standard deviations (σ₁ = 0.46 and σ₂ = 0.55) and sample sizes (n₁ = n₂ = 50), a two-sample z-test is appropriate for this analysis. The null hypothesis (H₀) states that the mean amounts dispensed at both locations are equal, expressed as $ \mu_1 = \mu_2 $. The alternative hypothesis (Hₐ) suggests that the means are not equal, or $ \mu_1 \neq \mu_2 $, indicating a two-tailed test.

To perform the test, first calculate the sample means for each location, denoted as $ \bar{x}_1 $ and $ \bar{x}_2 $. The difference between these sample means forms the numerator of the z-test statistic:

\[\text{Numerator} = \bar{x}_1 - \bar{x}_2\]

The denominator involves the standard error of the difference between means, calculated using the population standard deviations and sample sizes:

\[SE = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\]

The z-test statistic is then computed as:

\[z = \frac{\bar{x}_1 - \bar{x}_2}{SE}\]

Once the z-score is obtained, the p-value is determined based on the two-tailed nature of the test. Since the alternative hypothesis is non-directional ($ \mu_1 \neq \mu_2 $), the p-value is twice the smaller tail probability corresponding to the calculated z-score. This can be found using the standard normal cumulative distribution function (CDF), denoted as $ \Phi(z) $. For a negative z-score, the p-value is:

\[p = 2 \times \Phi(z)\]

Comparing the p-value to the significance level (α = 0.01) guides the decision-making process. If the p-value is greater than α, there is insufficient evidence to reject the null hypothesis, indicating no significant difference in the average amounts dispensed between the two locations. Conversely, a p-value less than α would suggest a statistically significant difference.

In this scenario, the calculated z-score was approximately -0.63, leading to a p-value around 0.53, which exceeds the 0.01 threshold. Therefore, the conclusion is to fail to reject the null hypothesis, meaning the data do not provide strong enough evidence to claim that the two dispensing locations pour different average amounts of milk per bottle.
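The summary does not list the sample means for the milk example, so the following sketch starts from the reported z-score rather than recomputing it. It is a minimal illustration with assumed variable names; using `-abs(z)` picks the smaller tail regardless of the sign of z, which matches $2 \times \Phi(z)$ when z is negative.

```python
from statistics import NormalDist

z_milk = -0.63       # z-score reported in the summary
alpha_milk = 0.01    # significance level for the test

# Two-tailed test (Ha: mu1 != mu2): double the smaller tail probability.
p_milk = 2 * NormalDist().cdf(-abs(z_milk))
reject_milk = p_milk < alpha_milk

print(f"p = {p_milk:.2f}, reject H0: {reject_milk}")
```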

Here’s what students ask on this topic:

When population standard deviations (σ₁ and σ₂) are known, you perform a two-mean hypothesis test using a z-test and the normal distribution. First, state your null hypothesis (H₀: μ₁ = μ₂) and alternative hypothesis (e.g., H₁: μ₁ < μ₂). Calculate the sample means using Excel's =AVERAGE(range) function. Then compute the z-score using the formula: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$. Use Excel to calculate each part to avoid errors. Finally, find the p-value with =NORM.S.DIST(z, TRUE) for a left-tailed test. Compare the p-value to your significance level α to decide whether to reject H₀.

The z-score formula for testing the difference between two population means when the population standard deviations (σ₁ and σ₂) are known is: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$. Here, $\bar{x}_1$ and $\bar{x}_2$ are the sample means, σ₁ and σ₂ are the known population standard deviations, and n₁ and n₂ are the sample sizes. This formula calculates how many standard errors the difference between sample means is from zero under the null hypothesis.

After calculating the z-score for your two means test, you can find the p-value in Excel using the =NORM.S.DIST(z, TRUE) function. This function returns the cumulative probability up to the z-score, which corresponds to the left-tail probability. For a left-tailed test (e.g., H₁: μ₁ < μ₂), this value is the p-value: the probability of observing a test statistic at least as extreme as the one calculated if the null hypothesis is true. For a right-tailed test, subtract the result from 1, as in =1-NORM.S.DIST(z, TRUE); for a two-tailed test, double the smaller tail probability, as in =2*NORM.S.DIST(-ABS(z), TRUE). Comparing the p-value to your significance level α helps you decide whether to reject the null hypothesis.

You use a z-test instead of a t-test when the population standard deviations (σ₁ and σ₂) are known because the z-test relies on the normal distribution, which is appropriate when the variability in the populations is precisely known. The t-test is used when population standard deviations are unknown and must be estimated from the sample data, introducing extra uncertainty. Knowing σ₁ and σ₂ allows for a more exact calculation of the standard error, making the z-test the correct choice for hypothesis testing of two means in this scenario.

To perform a two means hypothesis test with known population standard deviations in Excel, follow these steps: (1) Define your null hypothesis (H₀: μ₁ = μ₂) and alternative hypothesis (e.g., H₁: μ₁ < μ₂). (2) Calculate the sample means using =AVERAGE(range). (3) Input the known population standard deviations (σ₁, σ₂) and sample sizes (n₁, n₂). (4) Compute the z-score using the formula: $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$. (5) Use =NORM.S.DIST(z, TRUE) to find the p-value for a left-tailed test. (6) Compare the p-value to your significance level α to decide whether to reject H₀.
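The six steps above can also be sketched as one reusable Python function. Everything here is an assumption made for illustration: the function name `two_mean_z_test`, the `alternative` labels, and the two short data lists are hypothetical and are not the data from the videos. With samples this small, the z-test additionally assumes the populations themselves are normally distributed.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_mean_z_test(sample1, sample2, sigma1, sigma2, alternative="less"):
    """Return (z, p) for H0: mu1 = mu2 with known population sigmas.

    alternative: "less" (mu1 < mu2), "greater" (mu1 > mu2), or "two-sided".
    """
    n1, n2 = len(sample1), len(sample2)
    # Step 2: sample means (the role of =AVERAGE(range) in Excel).
    xbar1, xbar2 = mean(sample1), mean(sample2)
    # Step 4: z = difference in means divided by its standard error.
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = (xbar1 - xbar2) / standard_error
    # Step 5: p-value from the standard normal CDF, chosen by tail.
    cdf = NormalDist().cdf
    if alternative == "less":
        p = cdf(z)
    elif alternative == "greater":
        p = 1 - cdf(z)
    elif alternative == "two-sided":
        p = 2 * cdf(-abs(z))
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    return z, p


# Hypothetical batch counts, invented purely to exercise the function.
machine_a = [40, 44, 39, 47, 41, 43]
machine_b = [46, 45, 49, 44, 48, 47]

z_demo, p_demo = two_mean_z_test(machine_a, machine_b, 9.73, 5.91, "less")
# Step 6: compare p to alpha to decide whether to reject H0.
print(f"z = {z_demo:.2f}, p = {p_demo:.3f}, reject at alpha=0.05: {p_demo < 0.05}")
```

Passing the tail as a parameter keeps the z calculation in one place, so only the final CDF lookup changes between left-tailed, right-tailed, and two-tailed tests.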

Your Statistics tutors

Patrick Ford

Physics and Math Lead Instructor

Here’s what students ask on this topic:

How do you perform a hypothesis test for two means when the population standard deviations are known using Excel?

What is the formula for the z-score in a two means hypothesis test when population standard deviations are known?

How do you calculate the p-value for a two means z-test in Excel?

Why do you use a z-test instead of a t-test when population standard deviations are known?

What are the steps to perform a two means hypothesis test with known population standard deviations in Excel?
