Welcome back, everyone. So we spent a lot of time learning how to conduct a one-sample hypothesis test for a proportion. But as you get deeper into the course, some of the problems you run into will give you two sets of data, or two samples, and ask you to conduct a hypothesis test on the difference between the two proportions. Now this might seem like it's going to be twice as complicated or twice the amount of work, but, thankfully, all of the basic steps are exactly the same. We're going to write some hypotheses, calculate a test statistic, which is a z-score, find a p-value, and then write a conclusion.
Alright? Now, of course, there are a few things that are a little bit different, but don't worry because I'm going to walk you through all of this. We're going to jump right into an example, and I'll show you how it works. Alright? Let's get started.
So the main difference here is that in a hypothesis test with two samples instead of one, we're going to test claims about the difference in the two proportions. So let's go ahead and take a look at our problem over here. The table that's shown to us on the right summarizes a study done on the success rate, so these are percentages, of a nicotine patch in helping people quit smoking. We're going to do a hypothesis test with a significance level of α = 0.05 to determine if the proportion of subjects who quit is different in the two groups. This is always how these problems will go.
They'll give you two samples and ask you to do a hypothesis test about the difference in the groups, and we're doing a proportion here. Alright? So remember, all we're going to do is go through the steps, but there are a few conditions that we want to check first. Let's go through them really quickly here. First, we're going to check that these samples are random and independent.
Almost always, you can assume that they are. We grab some random subjects for both groups, the groups don't interfere with each other, and they're randomly sampled. Now we just have to check that there are at least five successes and five failures in each sample, so that the sampling distribution is approximately normal. Here we have 11 who successfully quit out of 20 and 17 out of 23, which leaves 9 and 6 who didn't. So there are at least five successes and failures in both.
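The count check just described can be sketched in a few lines of Python. This is only an illustration: the helper name `meets_count_condition` and the "at least 5" threshold parameter are assumptions made for the example, and the counts are the ones from the table.

```python
def meets_count_condition(successes, n, minimum=5):
    """Return True when both successes and failures reach the minimum count."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

# Counts from the problem: 11 of 20 quit in one group, 17 of 23 in the other.
first_ok = meets_count_condition(11, 20)   # 11 successes, 9 failures
second_ok = meets_count_condition(17, 23)  # 17 successes, 6 failures
print(first_ok, second_ok)
```

If either group failed this check, the normal approximation behind the z-score would be questionable for that sample.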
So this is a pretty quick thing to check off at the very beginning of the problem. Alright? So the first step here is we have to write our hypotheses, our H0 and our H1. Now how does this work? Well, when we had one proportion, we always had that p was equal to some number that we pulled out of the problem.
It was 40% or 0.65 or something like that. But now that we're taking a look at two samples and we're testing claims about the difference between the two proportions, what you're always going to do in these problems is write your H0, your initial hypothesis, as p₁ = p₂. Your default assumption is that there's no difference in the proportions and they're exactly the same. Another way of writing this, by the way, is that the difference in proportions is just zero: \( p_1 - p_2 = 0 \).
We're going to see some advantages of writing it this way in just a second here. Alright? So basically, our initial hypothesis, our null hypothesis, is that p₁ is equal to p₂, that the proportions of people who quit using these two things are exactly the same. Alright? Now what about the alternative hypothesis?
Here's where we're going to have to use either a less than, greater than, or not equal to sign. This works exactly the same as before. We're going to have to figure out which one it is based on the wording of the problem. Now in this problem, there's no indication that we're looking for evidence that one proportion is less than or greater than the other. We don't see any of those words; it just asks whether the proportions are different.
So we're just going to use a not equal sign: H1 is p₁ ≠ p₂. Alright? By the way, this means we're going to use a two-tailed test, and that's going to be useful later on when we look at the distribution. Alright? Okay.
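Written out side by side, the two hypotheses for this problem look like this (same symbols as above, with the equivalent "difference" form in parentheses):

\[
H_0:\ p_1 = p_2 \quad (p_1 - p_2 = 0)
\qquad\qquad
H_1:\ p_1 \neq p_2 \quad (p_1 - p_2 \neq 0)
\]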
So now we're finished with step one. We're going to move on to finding our test statistic. Now for one proportion, we calculated a z score using this equation. For two samples, we're going to use something that's a little bit more complicated, but it's actually very similar. We always had a sample proportion minus a parameter.
In this case, it's the same, except this is a difference in sample proportions minus a difference in the parameters. And then we always had a square root on the bottom, which acts as the standard deviation, and we'll talk about that in just a second here. Alright? Now a lot of these problems will have many different numbers flying around, or lots of numbers spread across different tables. It's always a good idea to get really organized with this and figure out what your two groups of data are going to be.
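For reference while getting organized, here is the shape of the two-sample test statistic described a moment ago. The numerator follows the description directly. The denominator is written in the pooled form that is commonly used for this test; that part has not been spelled out yet at this point, so treat it as an assumption. The hats mark sample proportions, and \( \bar{p} \) and \( \bar{q} \) are the assumed pooled quantities.

\[
z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}},
\qquad
\bar{p} = \frac{x_1 + x_2}{n_1 + n_2},
\qquad
\bar{q} = 1 - \bar{p}
\]

Under H0 the term \( p_1 - p_2 \) is zero, which is one advantage of writing the null hypothesis as a difference.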
So in this case, what I've done here is I've labeled my first group as the placebo and the second as the patch, and I'm just going to fill out my n's, x's, and p's using the data that's given to me already. So for the first one, the placebo, we have 20 subjects and 11 who successfully quit. And then for the second one, the patch, we have 23 total subjects and 17 who quit. Those are just the sample sizes and the successes. Alright?
Now, again, if you take a look at the first term in this z score equation, we're going to need the difference in the sample proportions. I can calculate those really quickly with the four numbers that I've just written over here. The first sample proportion, \( \hat{p}_1 \), is just going to be \( \frac{11}{20} \), which is 0.55. You just take x over n.
And then for \( \hat{p}_2 \), you're again just going to do x over n, \( \frac{17}{23} \), which ends up being about 0.74. Alright? So that gives you your \( \hat{p}_1 - \hat{p}_2 \). That's the first term that goes inside of this equation over here. So we have a parenthesis, and we've got 0.55 minus 0.74.
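As a quick check on that arithmetic, here is a small Python sketch that computes the two sample proportions and their difference from the four numbers in the problem. The variable names are just illustrative choices.

```python
# Group 1 (placebo) and group 2 (patch): sample sizes and number who quit.
n1, x1 = 20, 11
n2, x2 = 23, 17

# Each sample proportion is x over n.
p_hat_1 = x1 / n1
p_hat_2 = x2 / n2

# First term of the z-score numerator: difference in sample proportions.
diff = p_hat_1 - p_hat_2

print(round(p_hat_1, 2), round(p_hat_2, 2), round(diff, 2))
```

The difference comes out negative because the second group had the higher quit rate; the sign simply reflects which group was labeled first.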