2 Sample t-Test for Means
Research Question: Is there a relationship between median household income and how 15 year old students do on the math section of the PISA exam?
It can be tricky to decide which type of statistical inference test to use. The flow chart below shows when to use each kind of test:
Two-Sample t-Test for Means:
To continue our investigation more thoroughly, we will use a two-sample t-test for means. Z-tests and t-tests are similar, but there is one key distinction: a z-test relies on the standard deviation of the population, while a t-test relies on the standard deviations of the samples. How do we know if we have the standard deviation of our population? First, we need to determine what our population is.
We know our samples are the countries who have taken the PISA exam. How can we generalize our research question broadly?
We could say that we want to know if median household income in any country contributes to the 15 year olds' skills in math on the PISA exam. Then, our population would be every country on Earth. However, we do not have the statistics from every country. Thus, we do not know the population standard deviation. This means we can use the sample standard deviation to help us with our inference!
This test compares the means of group A and group B to see whether the difference between them is significantly different from zero.
First, we need to find the mean and standard deviation for each group. The steps for finding the standard deviation are:
1. Find the mean test score for group A and group B
2. Take the difference between each country's math score and the mean for its respective group. Then square each of these values
3. Sum the squared residual values for each group, so we should have a sum from group A and a sum from group B
4. Divide this value by one less than the number of data points (n - 1), since we are working with a sample standard deviation
5. Take the square root
After doing this, we find:
Group A:
Average: 500.32
Standard Deviation: 34.741
Group B:
Average: 446.28
Standard Deviation: 37.932
Already, we can see that the average math score for group A is higher than group B's, and that group B's test scores are more spread out, since its standard deviation is larger. But is this enough to support our claim that median income is related to the PISA math score?
Note: If we don't want to compute the standard deviation by hand, we can also do this on our calculator.
After entering our data into lists as described in the bivariate section of this website, press Stat, scroll over to "Calc", and select 1-Var Stats. This gives us an overview of our data: the mean of our sample, the standard deviation, the 5-number summary, and more!
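The same by-hand steps can also be sketched in Python. This is only an illustration: the function name `sample_sd` and the scores in `demo_scores` are made up for the example (they are not the real PISA data), and the sketch divides by n - 1, which is the sample standard deviation the calculator reports as Sx.

```python
import math
import statistics

def sample_sd(scores):
    """Sample standard deviation, following the by-hand steps (divides by n - 1)."""
    mean = sum(scores) / len(scores)                     # step 1: mean of the group
    squared_diffs = [(x - mean) ** 2 for x in scores]    # step 2: squared differences
    total = sum(squared_diffs)                           # step 3: sum them
    variance = total / (len(scores) - 1)                 # step 4: divide by n - 1
    return math.sqrt(variance)                           # step 5: square root

# Hypothetical scores for illustration only, NOT the real PISA data
demo_scores = [512, 487, 530, 468, 501, 495]

print("mean:", round(statistics.mean(demo_scores), 2))
print("sample SD:", round(sample_sd(demo_scores), 3))
```

Python's built-in `statistics.stdev` performs the same calculation, so it can be used to double check the by-hand version.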
Now, let's begin our test!
We start with our null and alternative hypotheses. The null hypothesis is the claim we are trying to find evidence against. If we find enough evidence, we reject the null; if we do not, we "fail to reject" it. It is a faux pas to say that we "accept" a hypothesis. Let's look at our null and alternative hypotheses below.
Ho: Ua - Ub = 0
Ha: Ua - Ub ≠ 0
Ua = average math test score for the PISA exam in 2012 for 25 countries having higher median household income (group A)
Ub = average math test score for the PISA exam in 2012 for 25 countries having lower median household income (group B)
The whole point of this test is to see whether the average math scores of the two income groups are similar or not. Thus, our null hypothesis claims that the averages are the same. In other words, it claims that the difference between the two means is 0.
Our alternative hypothesis claims that the difference in the averages is not 0. This would imply that the two averages are significantly different from each other.
As stated above, we cannot accept our alternative hypothesis if we find evidence against the null. Instead, we would state that we have evidence to suggest that there is a difference between the two averages.
So how do we know if we can reject our null hypothesis or not? We need a reference value to compare against. (Stay with me, there's a lot of new vocab!)
We will set our significance level (also known as alpha) to .05. This value can also be set to .01 or .1, depending on what the statistician prefers.
So, let alpha = .05. Alpha is what we will compare our p-value to. This allows us to see how significant our t-statistic is.
State: We want to perform a 2-tailed t-test for difference of means
Certain conditions must be met in order to continue our test. If they are not met, then our conclusion may not be valid.
Conditions:
1) Normality
Both groups are normal
OR
CLT:
n1 ≥ 30
n2 ≥ 30
Both of our sample sizes are 25, which is below 30, so we cannot rely on the CLT. Instead, we check normality by looking at the Normal Probability Plot (NPP). It's a specific plot on our calculator that lets us see visually whether our data points can be deemed normal: the closer the points are to a straight line, the more normal the data.
To check this, click on "2nd" > "stat plot" and click on plot 1. We want to choose the plot that's all the way on the right side.
Under data list, type in whichever list contains your data for group A. Data axis can be left as X. Then, click on "zoom" and select "ZoomStat."
Here are the NPP's for group A (left) and B (right):
Neither of these is completely straight. However, we will continue with our test and note at the end that normality is not completely satisfied, so we proceed with caution.
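For readers without a calculator, here is a rough Python sketch of what a normal probability plot does: it pairs each sorted data value with the z-score we would expect for its rank. The scores in `demo_scores`, the function names, and the choice of plotting positions (the Blom formula) are all assumptions made for this illustration; the calculator may use a slightly different convention.

```python
import math
from statistics import NormalDist

def npp_points(data):
    """Pair each sorted data value with the z-score expected for its rank."""
    n = len(data)
    ordered = sorted(data)
    standard_normal = NormalDist(mu=0, sigma=1)
    # Blom plotting positions: one common convention, assumed here
    z_scores = [standard_normal.inv_cdf((i - 0.375) / (n + 0.25))
                for i in range(1, n + 1)]
    return list(zip(ordered, z_scores))

def straightness(pairs):
    """Correlation between data values and expected z-scores (1 = perfectly straight)."""
    xs = [x for x, _ in pairs]
    zs = [z for _, z in pairs]
    mean_x = sum(xs) / len(xs)
    mean_z = sum(zs) / len(zs)
    cov = sum((x - mean_x) * (z - mean_z) for x, z in pairs)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_z = sum((z - mean_z) ** 2 for z in zs)
    return cov / math.sqrt(var_x * var_z)

# Hypothetical scores for illustration only, NOT the real PISA data
demo_scores = [452, 468, 479, 487, 495, 501, 508, 517, 530, 546]

points = npp_points(demo_scores)
for score, z in points:
    print(score, round(z, 2))
print("straightness:", round(straightness(points), 3))
```

Plotting the pairs and looking for a straight line is the visual check described above; the correlation is just a numeric summary of how straight that line is.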
2) Independent samples: no country is in both groups
3) 10% rule (Independent selections) N ≥ 10n
We need to find the degrees of freedom: n1+n2-2
df = 25+25-2 = 50-2 = 48
Earlier, I mentioned a test statistic. This allows us to standardize the information we're given. We find our t-statistic from the following formula:
Using the values that we know, we can compute our test statistic. From there, we can place this value on a bell curve and find the likelihood of getting a test statistic this extreme. Go ahead and find the test statistic. Once you do this, it should be t = 5.253.
Let's look at what this means on our curve. Below is a picture of a normal bell curve. Strictly speaking, a t-statistic follows a t-distribution (here with 48 degrees of freedom), but with this many degrees of freedom its shape is very close to the standard normal curve, so we use the normal curve as an approximation. Since we have standardized our value, the curve is centered at 0 with a standard deviation of 1 (symbol is sigma). Since our test statistic is 5.253, we would plot it at the appropriate location. For the picture below, our t-stat would be off the picture, to the right, somewhere between 5σ and 6σ.
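The test statistic quoted above can be checked from the summary statistics alone. The sketch below assumes the usual form of the statistic, the difference in sample means divided by the standard error of that difference; the function name `two_sample_t` and the variable names are just for illustration.

```python
import math

# Summary statistics reported earlier for each group
mean_a, sd_a, n_a = 500.32, 34.741, 25
mean_b, sd_b, n_b = 446.28, 37.932, 25

def two_sample_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """t = (difference in sample means) / (standard error of that difference)."""
    standard_error = math.sqrt(sd_1 ** 2 / n_1 + sd_2 ** 2 / n_2)
    return (mean_1 - mean_2) / standard_error

t_stat = two_sample_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b)
df = n_a + n_b - 2  # degrees of freedom, as computed in the text

print("t =", round(t_stat, 3))  # 5.253
print("df =", df)               # 48
```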
The tricky thing about a 2-tailed test is that we must consider both the positive and negative versions of our test statistic. This means that we also need to consider t = -5.253 and shade in both tails of our bell curve.
Next, we want to find the probability of getting a test statistic at least this extreme if the null hypothesis were true. This probability is known as our p-value. We compare it to alpha to determine whether we can reject the null hypothesis. If p-value < alpha, we reject the null hypothesis. If p-value ≥ alpha, we fail to reject the null hypothesis.
On your TI-84 calculator, press "2nd" > "Vars" and choose 2) normalcdf.
This option allows us to find the cumulative area between a lower and an upper bound.
*cdf stands for "cumulative distribution function," and we are working under a normal distribution.
This is the input we want to type in. The lower bound is some large negative number, and our upper bound is 5.253. Since we are using the standard normal bell curve, enter mu = 0 and sigma = 1.
The output from our calculator is as follows. It is the area to the left of 5.253, which is about .9999. To find the area in the upper tail, we subtract this from 1. Then, since we are doing a two-tailed test, we multiply by 2 to include both tails of our bell curve. So our p-value is about 2 × (1 - .9999). This value is very small, and we know it will be smaller than our significance level of alpha = .05. Thus, we can reject our null hypothesis!
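The calculator steps above can be mirrored in a few lines of Python. This sketch uses the standard normal curve, just as normalcdf does, so it is the same approximation; the variable names are only for illustration.

```python
from statistics import NormalDist

t_stat = 5.253
alpha = 0.05

standard_normal = NormalDist(mu=0, sigma=1)

# Area to the left of the test statistic (what normalcdf reports
# when the lower bound is a very large negative number)
area_below = standard_normal.cdf(t_stat)

upper_tail = 1 - area_below   # area to the right of 5.253
p_value = 2 * upper_tail      # two-tailed test: count both tails

print("area below exceeds .9999:", area_below > 0.9999)  # True
print("reject the null:", p_value < alpha)               # True
```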
We can also use our TI-Nspire calculator to run this test for us instead of doing it by hand! Instead of shading in particular regions on a bell curve, we can rely on technology to perform the test. In the beginning, though, it's good to do these tests by hand so we know what our calculator is telling us.
Any TI-83/84 calculator will also work.
On your calculator, go to the document toolbox and click on 6) Statistics > 7) Stat tests > 4) 2-Sample t Test. We want our data input method to be Stat.
You'll see this box and can fill out the appropriate values (left). Next, we want our test to read Ua ≠ Ub.
At the bottom, there is the option of having a pooled variance or not. What does this mean?
Pooling combines the two sample variances into a single estimate, which assumes that both groups come from populations with the same variance. That assumption is reasonable here because the two sample standard deviations (34.741 and 37.932) are similar. Pooling also gives the degrees of freedom we found earlier, n1+n2-2 = 48. We will let pooled be "yes."
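To see what the pooled option changes, here is a small Python sketch comparing the pooled and unpooled calculations from our summary statistics. The variable names are illustrative, and the unpooled degrees of freedom use the Welch approximation, which is an assumption about what an unpooled test would report.

```python
import math

sd_a, n_a = 34.741, 25
sd_b, n_b = 37.932, 25

# Pooled: one shared variance estimate, weighted by each group's degrees of freedom
pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
pooled_se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
pooled_df = n_a + n_b - 2

# Unpooled: each group keeps its own variance (Welch approximation for df)
term_a = sd_a ** 2 / n_a
term_b = sd_b ** 2 / n_b
unpooled_se = math.sqrt(term_a + term_b)
welch_df = (term_a + term_b) ** 2 / (
    term_a ** 2 / (n_a - 1) + term_b ** 2 / (n_b - 1)
)

print("pooled SE:", round(pooled_se, 4), "df:", pooled_df)
print("unpooled SE:", round(unpooled_se, 4), "df:", round(welch_df, 2))
```

Because both groups have 25 countries, the two standard errors come out the same, so the t-statistic does not depend on this choice here; only the degrees of freedom differ slightly.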
Looking at our results, we have:
Note that our p-value is .000003. This is very close to 0%. What is a p-value? A p-value is the probability of getting a t-statistic as extreme as ours, or more extreme, if our null hypothesis were true. That means that our t-statistic would be very rare if the null were true! There would be almost a 0% chance of getting a t-statistic like this.
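This p-value is based on the t-distribution with 48 degrees of freedom rather than on the normal curve. As a rough check, the tail area can be approximated in Python by numerically integrating the t-distribution's density. This is only a sketch: the function names, the integration cutoff of 100, and the use of Simpson's rule are choices made for illustration, not a description of how the calculator works.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log(1 + x * x / df))

def upper_tail_area(t, df, upper=100.0, steps=20000):
    """Approximate P(T > t) with Simpson's rule on [t, upper]; steps must be even."""
    h = (upper - t) / steps
    total = t_pdf(t, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(t + i * h, df)
    return total * h / 3

t_stat, df, alpha = 5.253, 48, 0.05
p_value = 2 * upper_tail_area(t_stat, df)  # two-tailed: double the upper tail

print(f"p-value is roughly {p_value:.1e}")
print("reject the null:", p_value < alpha)  # True
```

The result should land on the order of a few millionths, in line with the calculator's .000003.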
Since our p-value < alpha = .05, we reject Ho. We conclude that there is significant evidence that the means of group A and group B are not equal.
What does this mean?
The two groups are based on high median income and low median income. The average test scores are significantly different between the groups, so we conclude that median income may play a role in a country's average math score on the PISA exam. Our correlation test from earlier also supports this conclusion.
Remember, though, that a test like this cannot show that one factor causes another. Thus, we cannot conclude that high median income increases the math scores of 15 year olds. We can simply state that the two averages are significantly different from each other. This suggests that there is some relationship between median income and math test scores. Further, we note that the normality condition was not completely satisfied when we conducted our test, so we treat this conclusion with caution.