Welcome back, everyone. So we spent a lot of time learning how to conduct a hypothesis test of one sample when we're doing a proportion. But as you get deeper into the course, some of the problems you may start to run into will give you two sets of data or two samples and ask you to conduct a hypothesis test where you're looking at the difference between these two proportions. Now this might seem like it's gonna be twice as complicated or twice the amount of work, but, thankfully, all of the basic steps are exactly the same. We're gonna write some hypotheses, calculate some test statistics, which are z scores, find a p value, and then write a conclusion.

Alright? Now, of course, there are a few things that are a little bit different, but don't worry because I'm gonna walk you through all of this. We're gonna jump right into an example, and I'll show you how it works. Alright? Let's get started.

So the basic difference here is that in hypothesis tests with two samples instead of one, we're gonna test claims about the difference in the two proportions. So let's go ahead and just take a look at our problem over here. The table that's shown to us on the right summarizes a study on the success rate, so these are percentages, of a nicotine patch in helping people quit smoking. We're gonna do a hypothesis test with a significance level α of 0.05 to determine if the proportion of subjects who quit is different in the two groups. This is always how these problems will go.

They'll give you two samples and ask you to do a hypothesis test about the difference in the groups, and we're doing a proportion here. Alright? So remember, basically, all we're gonna do is go through the steps, but there are a few conditions that we wanna check first. Let's just go through them really quickly here. So we're just gonna check that these samples are random and independent.

Almost always, you can assume that these things are. We're gonna grab some random subjects for both groups, you know, which don't interfere with each other, and they're randomly sampled. Now we just have to check that there are at least five successes and at least five failures in each sample, so that the sampling distribution is approximately normal. Here we have 11 who successfully quit out of 20 and 17 out of 23, which leaves 9 and 6 who didn't. So there's at least five successes and failures in both.
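The count check just described can be sketched in a few lines of Python. This is only an illustration: the helper name `meets_count_condition` and the `groups` dictionary are made up for this example, and the counts are the ones quoted from the study table.

```python
# Counts from the example: (successes, sample size) for each group.
groups = {"placebo": (11, 20), "patch": (17, 23)}

def meets_count_condition(successes, n, minimum=5):
    """Return True if both successes and failures reach the minimum count."""
    failures = n - successes
    return successes >= minimum and failures >= minimum

for name, (x, n) in groups.items():
    # Report the counts alongside whether the condition holds.
    print(name, "successes:", x, "failures:", n - x,
          "ok:", meets_count_condition(x, n))
```

Both groups pass here, since the smallest count is the 6 failures in the patch group.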

This is a pretty quick thing to check off at the very beginning of the problem. Alright? So the first step here is we have to write our initial hypotheses, our H0 and our Ha. Alright? Now how does this work? Well, basically, when we had one proportion over here, we always had that p was equal to some number that we pulled out of the problem.

It was 40% or 0.65 or something like that. But now that we're taking a look at two samples over here and we're testing claims about the difference between the two proportions, what you're always gonna do in these problems is you're gonna write your H0, your initial hypothesis, as p1 = p2. Your default assumption is that there's no difference in the proportions and they're exactly the same. Another way of writing this, by the way, is that the difference in proportions is actually just zero. p1 - p2 = 0.

We're gonna see some advantages of writing it this way in just a second here. Alright? So, basically, our initial hypothesis, our null hypothesis, is that p1 = p2, that the proportions of people who quit using these two things are exactly the same. Alright? Now what about the alternative hypothesis?

Here we're gonna have to use either a less than, greater than, or not equal to sign. This works exactly the same as before. We're gonna have to figure out which one it is based on the wording of the problem. Now inside of this problem here, there's no indication that we're looking for, you know, evidence that one proportion is less than or greater than the other. We don't see any of those words.

So we're basically just gonna use a not equal sign like this. Alright? So by the way, this just means this is where we're gonna use a two-tailed test, and this is gonna be useful later on when we look at the distribution. Alright? Okay.
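Written out in symbols, the two hypotheses for this example would look roughly like this, with the subscripts 1 and 2 standing for the two groups:

```latex
\begin{aligned}
H_0 &: p_1 = p_2 \quad\text{(equivalently } p_1 - p_2 = 0\text{)} \\
H_a &: p_1 \neq p_2 \quad\text{(equivalently } p_1 - p_2 \neq 0\text{, a two-tailed test)}
\end{aligned}
```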

So now we're finished with step one. We're gonna move on to finding out what our test statistic is. Now for one proportion, we calculated a z score using this equation. For two samples, we're gonna use something that's a little bit more complicated, but it's actually very similar. We always had a sample proportion minus a parameter.

In this case, it's the same except this is a difference in sample proportions minus a difference in the parameters. And then we always had a square root thing on the bottom, which had some standard deviations, and we'll talk about that in just a second here. Alright? Now a lot of these problems will have many different numbers flying around or lots of numbers just indicated in different tables. It's always a good idea to get really organized with this, and figure out what your two groups of data are gonna be.
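For reference, the two-sample z score described here is usually written as below. The hat and bar notation is an assumption on my part, since the spoken version doesn't distinguish the symbols: hats mark sample proportions, and the bar marks the pooled proportion that goes under the square root, which comes up a little later.

```latex
z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}
         {\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}},
\qquad
\bar{p} = \frac{x_1 + x_2}{n_1 + n_2},
\qquad
\bar{q} = 1 - \bar{p}
```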

So in this case, what I've done here is I've labeled my first group as the placebo and the second as the patch, and I'm just basically gonna fill out my n's, x's, and p's, just using the data that's given to me already. So for the first one, they have 20 subjects and 11 who successfully quit. And then the second one, for the patch, I had 23 total subjects, and I had 17 who quit. That's just successes and the sample sizes. Alright?

Now, if you take a look at the first term that's in this z score equation, we're gonna need the difference in the sample proportions. I can calculate those really quickly with the four numbers that I've just written over here. The first sample proportion, p̂1, is just gonna be 11/20, which is 0.55. You just take x / n.

And then for p̂2, you're just gonna do x / n again, 17/23, which ends up being about 0.74. Alright? So that's basically just your p̂1 - p̂2, the difference in sample proportions. That's the first term that goes inside of this equation over here. So we have a parenthesis, and we've got 0.55 - 0.74.
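The arithmetic for the two sample proportions can be sketched like this. The variable names are just for illustration, with group 1 as the placebo and group 2 as the patch.

```python
# Sample data from the example (group 1 = placebo, group 2 = patch).
x1, n1 = 11, 20
x2, n2 = 17, 23

p_hat1 = x1 / n1             # sample proportion for group 1
p_hat2 = x2 / n2             # sample proportion for group 2
diff_hat = p_hat1 - p_hat2   # first term in the numerator of the z score

print(round(p_hat1, 2), round(p_hat2, 2), round(diff_hat, 2))
```

Rounded to two decimals, this gives 0.55, 0.74, and a difference of -0.19.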

Then from that we're gonna subtract a second term. Now this is just gonna be the regular p1 - p2, for the populations rather than the samples. So what's that? That's the difference in the population proportions of these two groups. What's that number?

Do I get it out of the table here? Do I get it out of the paragraph? Well, actually, no. Because we just wrote here that p1 = p2, that was your null hypothesis. Basically, what happens here is because your default assumption is that there's no difference, p1 - p2 is always gonna be 0.

This is always going to be true. So you're basically gonna subtract the sample proportions, and there's always gonna be a 0 in the second term. Alright? So then what goes on the bottom is this big square root, and, basically, what we're gonna use here is a multiplication of p and q, except these p's and q's are a little bit different. They're not p-hat and q-hat.

They're actually a different term, which is called p-bar, written p̄. Now this p̄ is basically one of the main differences in two sample proportions versus one. So the way we do this is we find a z score using a pooled, or sometimes called a weighted, sample proportion, which is basically just the total number of successes divided by the total number of trials of both of the groups. So there's a quick little equation over here to calculate p̄ and q̄, but, basically, you're just gonna add all the x's and then divide by the sum of all the n's. So really all that happens here is to calculate p̄, you're just gonna use the total of the x's, which is just gonna be 11 + 17, divided by the total of the n's, which is gonna be 20 + 23.

If you go ahead and work this out, by the way, what you're gonna see here is that p̄ is equal to 28/43, which if you go ahead and just calculate this as a decimal, ends up being about 0.65. Alright? Now q̄ is just the complement of this. You just take p̄ and subtract it from one, so this is just gonna be 0.35.
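The pooled proportion is simple enough to sketch as a small helper. The function name `pooled_proportion` is made up for this example.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Total successes divided by total trials across both groups."""
    return (x1 + x2) / (n1 + n2)

p_bar = pooled_proportion(11, 20, 17, 23)   # 28 / 43
q_bar = 1 - p_bar                           # complement of the pooled proportion

print(round(p_bar, 2), round(q_bar, 2))
```

Rounded to two decimals, these come out to 0.65 and 0.35, matching the numbers worked out by hand.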

Alright? So now we need these numbers because we're gonna stick them inside of the square root that's in the bottom of our z score equation. So under the square root, this is just gonna be 0.65 × 0.35 / n1, which is gonna be 20, plus the same two numbers, 0.65 × 0.35, / n2, which is 23. Alright? Just be really careful when you calculate these things because there's a big square root on the bottom with a lot of terms.

Just make sure you're doing these things correctly in your calculator. But what you should see, by the way, when you do all this in your calculator is you should find a z score of -1.3. So -1.3 is your z score. Alright? So remember, that's the second step calculating your test statistic.
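If you'd rather check the calculator work in code, here is a minimal sketch of the z score calculation. The function name `two_proportion_z` is hypothetical, and it keeps full precision instead of the rounded 0.65 and 0.35.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z score for H0: p1 = p2, using the pooled proportion in the denominator."""
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)          # pooled proportion
    q_bar = 1 - p_bar
    std_error = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    hypothesized_diff = 0                  # p1 - p2 under the null hypothesis
    return ((p_hat1 - p_hat2) - hypothesized_diff) / std_error

z = two_proportion_z(11, 20, 17, 23)
print(round(z, 2))
```

With the example data this lands at roughly -1.30, in line with the -1.3 found on the calculator.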

It's always gonna be a z score for these types of proportion problems. After that, everything else is exactly the same as how it was in a normal hypothesis test with one sample. After you find a z score, you're just gonna relate it to a probability, find out how unusual that sample is, and then write a conclusion. Alright? So the third step here is we're just gonna find a p value for the z score that we just found over here.

This works exactly the same way. Basically, because we're using a two-tailed test, we're trying to figure out what is the area inside of the two tails of this distribution over here. So what we're gonna do here is this is gonna be 2 × the probability of finding a z score that is less than a z score of -1.3. And, again, you can go ahead and figure this out using a table or a calculator. I'm gonna assume that you already know how to do this because we've done this a bunch before.

What you're really gonna find here is a probability or a p value of 0.193. Alright? So this is gonna be your p value. This is your z score. And now finally, we're just gonna go ahead and write our conclusion.
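As an alternative to a table or calculator, the two-tailed area can be sketched with Python's standard library. `NormalDist` comes from the built-in `statistics` module, while the helper name `two_tailed_p_value` is made up for this example.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """Area in both tails beyond |z| under the standard normal curve."""
    return 2 * NormalDist().cdf(-abs(z))

p_value = two_tailed_p_value(-1.3)
print(round(p_value, 3))
```

This comes out to about 0.19, which agrees with the value quoted above up to rounding in the last digit.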

Our conclusion is exactly the same for one or two proportion tests. You're gonna compare p to α and then either reject or fail to reject, and then have either enough or not enough evidence. Right? So we're gonna go ahead here and just compare our p value to our α, which is 0.05. Our p value of 0.193 is greater than α.

And so our conclusion is, because our p value is greater than α, we fail to reject the null hypothesis that there's no difference in those two proportions.

Basically, what that means here is that there is not enough evidence that there is a difference in the proportion of people who quit smoking between these two groups, the placebo and the patch. Alright? So that's start to finish how you do a hypothesis test with two proportions. Let me know if you have any questions here. Let's get some practice.
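To tie the steps together, here is one way the whole test could be sketched from start to finish. It's a minimal sketch that assumes a two-tailed test and the pooled standard error described earlier; the function name `two_proportion_test` and its return format are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_test(x1, n1, x2, n2, alpha=0.05):
    """Two-tailed z test of H0: p1 = p2 using the pooled proportion."""
    p_bar = (x1 + x2) / (n1 + n2)                        # pooled proportion
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / std_error                  # hypothesized difference is 0
    p_value = 2 * NormalDist().cdf(-abs(z))              # area in both tails
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    return z, p_value, decision

z, p_value, decision = two_proportion_test(11, 20, 17, 23)
print(f"z = {z:.2f}, p-value = {p_value:.3f}, decision: {decision}")
```

For the nicotine patch data, the decision is to fail to reject H0, the same conclusion reached by hand.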

Thanks for watching.