4 & 5. Statistics, Quality Assurance and Calibration Methods

# The Gaussian Distribution

Repeating an experiment many times, with no systematic error, gives results that follow a smooth bell-shaped curve called the *Gaussian distribution*.

The Gaussian Distribution & Z-Table

1

concept

## Gaussian Distribution

So here we're gonna talk about the Gaussian distribution. Performing an experiment numerous times with no systematic error (and that part is important) results in a smooth curve called the Gaussian distribution. Here we have an image of a typical Gaussian distribution curve, and it has some important variables that we're gonna talk about. We have f(x), which is our function, and this is its formula. What's important about this formula is that it contains some important variables, which I've highlighted.

In terms of the Gaussian distribution curve, increasing the number of measurements in the experiment changes my mean from $\bar{x}$ (x with a line above it) to $\mu$ (mu). Mu represents the population average, or population mean. Now this is important: the population mean always sits at the exact center of my Gaussian distribution. In the same way, the more measurements we take, the more our standard deviation, s, transitions into $\sigma$ (sigma), which is called the population standard deviation. So realize that the more measurements we have, the more these terms take on the label "population" to reflect the large amount of data involved.

The standard deviation looks at how far from the mean, the center of the Gaussian distribution curve, our measurements fall. That spread can be to the left or to the right; we're just looking at how far away a measurement is from the center. And x here represents the members of my population, so all of the measurements you're working with are represented by x.

Now, the shape of the Gaussian distribution curve is affected by these variables. If we change our population mean, $\mu$, it shifts the distribution curve to the left or to the right. Remember, the population mean is the exact center of my curve. So if I place my mean around here, the exact center of the curve has to line up with that new mean, and the whole curve moves with it. If my curve sits over here instead, that's because its center, my population mean, is over here. So wherever the mean is determines whether the Gaussian distribution sits more to the left or more to the right.

Changing my population standard deviation, $\sigma$, increases or decreases the broadness of the distribution curve. A very broad curve means the population standard deviation is very large, and a very narrow curve means the population standard deviation is very small. So a low $\sigma$ makes a narrow curve, and a large $\sigma$ gives you a broad curve. Remember, standard deviation measures how close the measurements are to one another; it's testing their precision. So the less precise your numbers are, the larger the population standard deviation will be and the broader the curve will be.

Now that we've gotten down some of the fundamentals of the Gaussian distribution curve, come back and take a look at our next video, where we go into a little more detail on how we break down the numbers associated with a typical Gaussian distribution curve.
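The effect of the mean and the standard deviation on the curve can be checked numerically. The transcript never spells out the formula shown on screen, so the sketch below assumes the standard textbook form of the Gaussian function, $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}$; the function name `gaussian_pdf` is illustrative and not taken from the lesson.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Height of the Gaussian curve at x, for population mean mu and
    population standard deviation sigma (assumed textbook form)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Changing mu moves the center of the curve: the peak always sits at x = mu,
# and its height does not depend on where the center is.
peak_centered = gaussian_pdf(0.0, mu=0.0, sigma=1.0)
peak_shifted = gaussian_pdf(3.0, mu=3.0, sigma=1.0)

# Changing sigma changes the width: a small sigma gives a tall, narrow curve,
# a large sigma gives a low, broad curve.
narrow_peak = gaussian_pdf(0.0, mu=0.0, sigma=0.5)
broad_peak = gaussian_pdf(0.0, mu=0.0, sigma=2.0)

print(f"peak height at mu=0, sigma=1.0: {peak_centered:.4f}")
print(f"peak height at mu=3, sigma=1.0: {peak_shifted:.4f}")
print(f"peak height at mu=0, sigma=0.5: {narrow_peak:.4f}")
print(f"peak height at mu=0, sigma=2.0: {broad_peak:.4f}")
```

Because the total area under the curve stays fixed, a broader curve has to be lower at its center, which is why the peak height drops as sigma grows.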

2

example

## Gaussian Distribution

Alright, so taking a closer look at a Gaussian distribution curve: normally each distributed variable has its own mean and standard deviation. The standard normal distribution, which we're gonna use, simplifies this by setting the mean (remember, that's the exact center of my Gaussian distribution curve) to zero and measuring distance in units of one standard deviation. So this right here represents my population mean, $\mu$, and the distance away from it is counted in population standard deviations. You can be minus one, minus two, or minus three from it, or plus one, plus two, or plus three away from the center of the curve. Each of these standard deviations also has a Z score connected to it, so our Z scores line up with each one of these population standard deviations.

Now we have some formulas associated with this simplified Gaussian distribution curve. We have our standard normal distribution formula, $y = \dfrac{e^{-z^2/2}}{\sqrt{2\pi}}$, where z is just your Z score. How would we figure out the Z score? Our Z score is the observed value, the value you're looking at, minus the mean, divided by the standard deviation: $z = \dfrac{x - \mu}{\sigma}$. Later on we'll see how to use this formula to relate it to any type of Gaussian distribution curve.

Now, when it comes to this Gaussian distribution curve, a certain percentage of a population falls within a given range of the distribution. The whole curve represents 100% of a given population. Start with the minus one to plus one area, so just this portion of the curve. The percentage of the population that falls within that segment is about 68%. Each section, from 0 to plus one and from 0 to minus one, represents 34.13%, which is why together we say about 68%. Then if we take everything from minus two to plus two, so we're including more and more of the population, that covers approximately 95%. The segment from plus one to plus two is around 13.59%, and the segment from minus two to minus one is also 13.59%. Then if we take the minus three to plus three area, that represents about 99.7% of the total population. The small segments here, between two and three standard deviations on each side, are about 2.14% each. So basically minus three to plus three covers almost the entire population.

Now I'm throwing these numbers out, so let's think of an example. Let's say we're talking about IQ, and the average IQ is 100 points on the IQ test. That 100 represents my mean. And let's say it has a population standard deviation of 15. What would that mean? The standard deviation tells me how far to step to the left and to the right. So 100 is my mean, and each one of these points is one standard deviation of 15. Plus one would be 115, because I added 15. Add another 15 and that's 130 for plus two, and add another 15 for plus three, so that's 145. In the opposite direction I subtract: 100 − 15 is 85 for minus one, for minus two we subtract another 15, so that's 70, and subtracting another 15 gives me 55 for minus three.

So what are these numbers telling me? Using the convention we just talked about, 68% of the data falls in the minus one to plus one area, which means 68% of the population has an IQ between 85 and 115. 95% falls within the minus two to plus two area, so 95% of the population has an IQ between 70 and 130. Finally, 99.7% falls between minus three and plus three, so 99.7% of the population has an IQ between 55 and 145. That's how we use the Gaussian distribution to give us probabilities: a person picked at random has a 68% probability of having an IQ between 85 and 115.

Now notice that this is not 100%. That's because some people lie outside these ranges. They're extremely rare individuals in terms of this example, with IQs much lower than 55 or above 145. Good example: Einstein is rumored to have had an IQ of over 160, meaning he's off the scale. He's part of that 0.3% that falls more than three standard deviations away from the mean.

As we move more and more into probability and statistics, we'll take a closer look at further Gaussian distribution examples and see how to use them in answering any type of question we come across. So hopefully you guys are able to follow along as we continue to look at Gaussian distributions, statistics, and some of the applications you can use to solve these questions.
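The percentages quoted for each band can be reproduced from the standard normal distribution. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `fraction_within` and `z_score` are illustrative, and the IQ figures (mean 100, standard deviation 15) are the ones used in the video.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard = NormalDist(mu=0.0, sigma=1.0)

def fraction_within(k):
    """Fraction of the population within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

def z_score(x, mu, sigma):
    """Distance of x from the mean, in units of standard deviations."""
    return (x - mu) / sigma

# IQ example from the video: mean 100, population standard deviation 15.
iq_mean, iq_sigma = 100, 15

for k in (1, 2, 3):
    low = iq_mean - k * iq_sigma
    high = iq_mean + k * iq_sigma
    print(f"-{k} to +{k}: IQ {low} to {high}, "
          f"{100 * fraction_within(k):.1f}% of the population")

# An IQ of 160 sits 4 standard deviations above the mean,
# outside the -3 to +3 range that holds about 99.7% of the population.
print(f"z for IQ 160: {z_score(160, iq_mean, iq_sigma)}")
```

The computed fractions carry more decimals than the rounded 68%, 95%, and 99.7% used in the lesson, so small differences in the last digit are expected.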

3

example

## The Gaussian Distribution & Z-Table

So the Gaussian distribution curve gives us the probability of a percentage of a population falling within certain parameters when we're given a Z score. To find those probabilities we use Z tables. Remember from the previous page that Z equals the value we're looking at minus our population mean, divided by our standard deviation. In this image we have a Z score here; let's say that Z score equals negative 1.65. Remember, all the numbers to the left of zero are negative and all the numbers to the right are positive.

Using the Z score table, we go from negative 3.4, which is around here, all the way to zero, so this Z table covers the negative values of Z. We have a Z score of negative 1.65. The row labels give the first two digits of the Z score, and the column headings give the third digit. So negative 1.6 is here, then we find .05, which is right here, and we see where they meet. That gives us our p value, which is the probability, the percentage of the population that fits within those parameters. Here p equals .0495. That's the decimal form, so we multiply it by 100 to get 4.95%. What that's telling us is that if we shade in this portion here, everything to the left of negative 1.65, then 4.95% of the population falls within it.

It's not drawn to scale, because the shaded portion looks bigger than 4.95%. But again, I just chose an arbitrary number, negative 1.65, to show how we'd use the Z table to figure out our probability. Remember, this first table looks at the negative side of the Gaussian distribution curve. Click on the next video to see what we do when we're looking at Z scores to the right of my population mean of zero. We employ the same method, but let's see what happens in those cases.
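A printed Z table is a lookup of the area under the standard normal curve to the left of a given Z score. The sketch below imitates that lookup with the standard-library `statistics.NormalDist`; `z_table_lookup` and `table_row_and_column` are hypothetical helper names, and rounding to two decimals for Z and four for p is assumed to match the table layout described in the video.

```python
from statistics import NormalDist

def z_table_lookup(z):
    """Mimic a printed Z table: round z to two decimals, then return the
    area to the left of z under the standard normal curve, to four decimals."""
    z_rounded = round(z, 2)
    return round(NormalDist().cdf(z_rounded), 4)

def table_row_and_column(z):
    """Split a Z score into the row label (first two digits) and the
    column heading (third digit) used on a printed table."""
    hundredths = round(abs(z) * 100)
    row = (hundredths // 10) / 10
    column = (hundredths % 10) / 100
    return (-row if z < 0 else row), column

z = -1.65
row, column = table_row_and_column(z)
p = z_table_lookup(z)
print(f"row {row}, column {column:.2f}, p = {p:.4f} ({100 * p:.2f}%)")
```

For Z = −1.65 this reads the −1.6 row and the .05 column, the same cell picked out in the video.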

4

example

## The Gaussian Distribution & Z-Table

So here we have a Z score that now falls to the right of my population mean of zero. Let's come up with an arbitrary number again: we'll say Z equals 1.82. Remember, the first two digits of my Z score are found in this column, so we look for 1.8, and for the third digit, the 2, we look here at .02 and see where they meet. They meet up here, which means my probability p equals .9656. Times 100, that's 96.56%. So if we take into consideration the entire negative portion of my Gaussian distribution curve plus this portion up to Z, that represents 96.56% of my entire population.

Besides that, what if they asked me for the probability between Z equals zero and Z equals 1.82? In that case we'd only be looking at this portion of my Gaussian distribution curve. Remember, the mean is the exact center of the curve, so this half represents 50% and the other half represents 50%. We're not looking at the 50% to the left of my population mean, so all we have to do to figure out the percentage between these two Z values is 96.56% − 50%. That gives me 46.56% of the population falling between a Z score of 0 and a Z score of 1.82.

So again, just remember: in order to determine the probability of a population falling within certain parameters, we have to find our Z score and then compare it to our Z table in order to find our value p, which represents the probability, the percentage that we're looking for.
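The subtraction used for the area between two Z scores generalizes to any pair of values: take the area to the left of the upper Z score and subtract the area to the left of the lower one. A minimal sketch, again assuming the standard-library `statistics.NormalDist` stands in for the printed table (the function names are illustrative):

```python
from statistics import NormalDist

standard = NormalDist()  # mean 0, standard deviation 1

def area_left_of(z):
    """Area under the standard normal curve to the left of z."""
    return standard.cdf(z)

def area_between(z_low, z_high):
    """Area under the standard normal curve between two Z scores."""
    return standard.cdf(z_high) - standard.cdf(z_low)

left = area_left_of(1.82)          # everything up to Z = 1.82
middle = area_between(0.0, 1.82)   # only the part between the mean and Z

print(f"left of 1.82:       {100 * left:.2f}%")
print(f"between 0 and 1.82: {100 * middle:.2f}%")
```

Setting the lower bound to 0 reproduces the "subtract 50%" step, since exactly half of the curve lies to the left of the mean.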

The Gaussian Distribution & Z-Tables Calculations

5

example

## Gaussian Distribution & Z-Table

So here it states: suppose there are 100 students in your analytical lecture, and at the end of the semester the class average is an 80 with a standard deviation of 5.3. Determine the distribution and probability of grades based on your understanding of the Gaussian distribution curve.

Alright, so they're telling me that my class average, which represents my population mean, is 80, so that goes here at the center: 80%. The standard deviation means we move in both directions in units of 5.3. So plus one means we add 5.3, which gives 85.3. Then we add another 5.3 to get 90.6, and finally we add another 5.3 to get 95.9. Going in the opposite direction, we subtract 5.3 each time. Subtracting 5.3 the first time gives me 74.7, subtracting another 5.3 gives me 69.4, and subtracting 5.3 one final time gives me 64.1. That's how we fill out our Gaussian distribution curve.

Remember, with this Gaussian distribution curve we can look at the minus one to plus one area, the minus two to plus two area, and finally the minus three to plus three area. The first represents 68% of our total population, so 68% of the class. The second represents 95% of the class, and the third represents 99.7%. With 100 students, that's about 68 people, 95 people, and basically everybody, all 100 people.

So from minus one to plus one, we'd say 68% of the class is expected to get a grade between 74.7% and 85.3%. From minus two to plus two, 95% of the class is expected to get a grade between 69.4% and 90.6%. And finally, we expect 99.7%, which is basically everybody, to get a grade between 64.1% and 95.9%.

So that's what we can say: the distribution of grades is the grade ranges and the number of people in each, and the probability is the actual percentages that we get here. That's how we approach this first one. Now that you've seen this example, click on the next video to see example two and how we tackle this when we're looking for the percentage given an exact grade.
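The grade boundaries and head counts in this example follow directly from the mean, the standard deviation, and the class size. The sketch below uses the rounded 68 / 95 / 99.7 percentages quoted in the video rather than more precise values; `sigma_bands` is a hypothetical helper name.

```python
def sigma_bands(mean, sigma, population):
    """Return (k, low, high, percent, expected_count) for k = 1, 2, 3,
    using the rounded 68 / 95 / 99.7 rule quoted in the video."""
    rule = {1: 68.0, 2: 95.0, 3: 99.7}
    bands = []
    for k, percent in rule.items():
        low = round(mean - k * sigma, 1)    # lower grade boundary
        high = round(mean + k * sigma, 1)   # upper grade boundary
        expected = round(population * percent / 100)  # whole students
        bands.append((k, low, high, percent, expected))
    return bands

# Class of 100 students, average 80, standard deviation 5.3.
bands = sigma_bands(mean=80, sigma=5.3, population=100)
for k, low, high, percent, expected in bands:
    print(f"-{k} to +{k}: grades {low} to {high}, "
          f"{percent}% of the class, about {expected} students")
```

Rounding 99.7 students to a whole number gives 100, which matches the "basically everybody" reading in the transcript.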

6

example

## Gaussian Distribution & Z-Table

So continuing with our talk of the Gaussian distribution from the example above: from example one, determine the percentage of final grades that would lie below 71.

Alright, so remember our population mean is 80 and our standard deviation is 5.3, which means we add 5.3 for each point to the right of my population mean and subtract 5.3 for each point to the left. That gives us a range from 64.1 to 95.9, which represents nearly the entire student population. Here we're looking for grades below 71, so we're really looking at this portion on the left of my Gaussian distribution curve.

To figure out my percentage, or probability, we first figure out our Z score. Remember, your Z score equals your value, in this case 71, minus your population mean of 80, divided by your population standard deviation. So that's 71 − 80, divided by 5.3, which initially gives me negative 1.6981. But remember, when using the Z table from the previous pages we only have three digits to work with for finding our p, or probability, value. So this rounds to negative 1.70 for our Z score. Using that Z score, we find the negative 1.7 row and line it up with the third digit of zero. If you do that correctly, it gives you a p equal to 0.0446. Multiplying that by 100 tells me that 4.46% of the final grades would lie below 71, which isn't too bad.

Now that you've seen that example, see if you can answer the practice question below. Don't worry, as usual: just come back and take a look at how I approach that practice problem to get the final answer.
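The same calculation can be laid out step by step: compute the Z score, round it to the two decimals a printed table supports, then read off the area to its left. This sketch assumes the standard-library `statistics.NormalDist` in place of the table; the variable names are illustrative, and `p_exact` is included only to show how little the rounding step changes the result.

```python
from statistics import NormalDist

mean, sigma = 80.0, 5.3   # class average and standard deviation
grade = 71.0              # we want the fraction of grades below this value

# Step 1: Z score = (value - mean) / standard deviation.
z_exact = (grade - mean) / sigma

# Step 2: a printed Z table only resolves Z to two decimal places.
z_table = round(z_exact, 2)

# Step 3: area to the left of Z, to four decimals as on a printed table.
p_table = round(NormalDist().cdf(z_table), 4)

# For comparison: the same area computed without rounding Z first.
p_exact = NormalDist(mu=mean, sigma=sigma).cdf(grade)

print(f"z (unrounded) = {z_exact:.4f}")
print(f"z (table)     = {z_table:.2f}")
print(f"p (table)     = {p_table:.4f} -> {100 * p_table:.2f}% below {grade:g}")
print(f"p (unrounded) = {p_exact:.4f}")
```

With 100 students in the class, a table value of 4.46% corresponds to roughly four or five students scoring below 71.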

7

Problem

From EXAMPLE 1, determine the percentage of final grades that would lie between 88 and 92.

A

5.36 %

B

5.33 %

C

5.39 %

D

5.49 %