Paired t-test

 

This tutorial will walk you through the use of the paired t-test. The paired t-test should be used when the data are in pairs. One of the most common uses of this test is a before-and-after (pre and post) analysis. For example, we might be interested in how much the plants in randomly selected plots grow after fertilizer is added. Another example is a pre-test and a post-test given before and after an educational program to measure the learning from the program. The ideal use of the paired test is when everything remains constant between the first and second measurement except for the effect of the factor you would like to measure (e.g., the effect of an educational seminar on learning).

 

[Figure: paired t-test formula]
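For reference, the paired t statistic is conventionally written as shown below. The symbols are notation chosen here rather than taken from the original figure: \bar{d} is the sample mean of the differences, s_d is the sample standard deviation of the differences, and n is the number of pairs.

```latex
t = \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad \text{df} = n - 1
```

The denominator s_d / \sqrt{n} is the standard error of the mean difference, so the statistic is simply the mean difference measured in standard-error units.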

 

Paired t-test Example

 

Imagine that you are the director of a summer school environmental science program and you want to know the effect of the program on students’ environmental science knowledge. One way to measure and test this would be to give the students a pre-test before the program starts and a post-test after the program is complete. Before running the paired t-test, we need to set up the null and alternative hypotheses. Because we expect student knowledge to INCREASE from the pre-test to the post-test, we will set up one-sided hypotheses.

 

Ho: the population mean of differences (post-pre) is less than or equal to zero

Ha: the population mean of differences (post-pre) is greater than zero
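In symbols, and writing \mu_d for the population mean of the differences (post minus pre), the two hypotheses can be stated compactly as follows. The symbol \mu_d is notation introduced here for illustration.

```latex
H_0 : \mu_d \le 0 \qquad \text{versus} \qquad H_a : \mu_d > 0
```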

In our STATA dataset, one column holds the pre-test scores and another column holds the post-test scores. We can now run the paired t-test in STATA with the command: ttest posttest == pretest
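To make the calculation behind the test concrete, here is a minimal Python sketch of the same arithmetic, using only the standard library. It is not what STATA runs internally. The function name paired_t and the four made-up student scores are assumptions for illustration and are not the tutorial's data.

```python
import math
import statistics


def paired_t(pre, post):
    """Summarize the differences (post - pre) and compute the paired t statistic."""
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same number of observations")
    # One difference per pair, in the order post minus pre
    diffs = [after - before for before, after in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample standard deviation (n - 1 denominator)
    se_diff = sd_diff / math.sqrt(n)    # standard error of the mean difference
    return {
        "n": n,
        "mean": mean_diff,
        "sd": sd_diff,
        "se": se_diff,
        "t": mean_diff / se_diff,
        "df": n - 1,
    }


# Made-up scores for four students (illustration only)
pretest = [1, 2, 3, 4]
posttest = [2, 4, 5, 8]
result = paired_t(pretest, posttest)
print(result)
```

The sketch stops at the t statistic and degrees of freedom. Turning these into a p-value requires the t distribution, which the Python standard library does not provide.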

After we run the test, STATA produces the following table:

[Figure: STATA output table for the paired t-test]

We are most interested in the ‘diff’ (differences) line of the table. In that line, you can see that there are 20 differences (one for each student) and that the mean of these differences is 12.8. Below the table you can see that STATA is taking the post-test minus the pre-test, which is what we want (you should always check the order of subtraction in STATA). Next to the mean is the standard error of the mean difference (approximately 2.746). STATA calculates the t-statistic using the equation above, dividing the mean difference by its standard error, and gets 4.66 with 19 degrees of freedom (# of pairs minus 1). The p-value for the appropriate one-sided test is 0.0001 (listed under the alternative hypothesis on the far right). With this p-value, we can reject our null hypothesis and conclude that the population mean of differences (post-test minus pre-test) is greater than zero. STATA also calculates the (two-sided) 95% confidence interval around the mean difference (7.05, 18.5).
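The reported numbers can be checked by hand. The sketch below is a rough Python check that treats 2.746 as the standard error of the mean difference, since 12.8 / 2.746 reproduces the reported t of 4.66. The critical value 2.093 (two-sided 95%, 19 degrees of freedom) is taken from a standard t table and hard-coded as an assumption, and the variable names are chosen here for illustration.

```python
import math

# Values read from the 'diff' line described in the text
n = 20             # number of pairs (students)
mean_diff = 12.8   # mean of the differences (post - pre)
se_diff = 2.746    # standard error of the mean difference

# Assumption: two-sided 95% critical value of t with 19 df, from a standard t table
t_crit = 2.093

df = n - 1
t_stat = mean_diff / se_diff
sd_diff = se_diff * math.sqrt(n)        # standard deviation implied by the standard error
ci_low = mean_diff - t_crit * se_diff
ci_high = mean_diff + t_crit * se_diff

print(f"t = {t_stat:.2f} with {df} degrees of freedom")
print(f"implied SD of differences = {sd_diff:.2f}")
print(f"95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

Under these assumptions the check gives t of about 4.66 and an interval of roughly (7.05, 18.55), in line with the values in the table.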

Assumptions of the Paired t-test

 

1. The population of differences is normally distributed.

2. The pairs are independent.

Like all statistical tests, the paired t-test is only valid if its assumptions hold, so we need to check them. To check the normality assumption, we could create a box plot or histogram of the differences and look for strong skew or outliers. If we selected the students using a simple random sample, it is reasonable to treat the pairs as independent.
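As a rough numeric companion to the box plot, the sketch below computes the quartiles of a set of differences and flags points beyond 1.5 times the interquartile range, which is the usual box plot convention for outliers. The function name boxplot_summary and the example values are assumptions for illustration. Quartile conventions also differ slightly between software packages, so the numbers may not match STATA's exactly.

```python
import statistics


def boxplot_summary(diffs):
    """Return the quartiles and 1.5 * IQR outliers that a box plot would display."""
    q1, median, q3 = statistics.quantiles(diffs, n=4)  # three quartile cut points
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [d for d in diffs if d < low_fence or d > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Made-up differences with one extreme value (illustration only)
example_diffs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
summary = boxplot_summary(example_diffs)
print(summary)
```

A median sitting far from the middle of the box, or several flagged outliers on one side, suggests skew in the differences. This is an informal check, not a formal test of normality.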

 

 
