Comparison of Means: One Sample, Unknown Population SD

 

Often we do not know the standard deviation of the population.  In that case, we use the standard deviation of the sample (s) to estimate the standard deviation of the population (σ).  Because of this estimation, we use the t-distribution instead of the z-distribution to make our comparison.  Following the same steps outlined in the one-sample comparison of means test described earlier, we will now conduct a one-sample t-test.

 

Do residents of Durham walk more than 10 miles a week on average?

 

As you can see from the question, this will be a one-sided, one-sample test.  We want to compare the mean weekly walking distance of the population of Durham residents to the value 10 miles (μo).  To do so, we set up the following hypotheses:

 

Ho:  µwalking ≤ 10 miles

Ha: µwalking > 10 miles

 

We randomly select 30 residents of Durham from the population of Durham City taxpayers and give each a pedometer.  We choose a week in the fall to distribute the pedometers and ask the participants to wear them for that week.

Next, we should develop a histogram and summary statistics of the data.  One of the assumptions of the one-sample t-test is that the population of values (miles walked by Durham residents) is normally distributed.  Looking at the histogram, we can say that the sample distribution looks roughly normal.  t-tests are generally robust to departures from normality unless the sample size is very small, the data are skewed, or extreme outliers are present.  We also need to assume independence among observations.  Because we conducted a simple random sample, we will assume we met the independence assumption.  We will go ahead with the t-test assuming that the population of miles walked is normally distributed.

 

 

[Figure: summary statistics for miles walked]

 

[Figure: histogram of miles walked]
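The summary statistics and histogram described above can be produced with a few lines of Python. This is only a minimal sketch: the `miles` values are hypothetical numbers invented for illustration (they are not the Durham sample), and the helper names `summarize` and `text_histogram` are our own.

```python
import math
import statistics
from collections import Counter

# Hypothetical weekly mileage values, invented for illustration only.
# These are NOT the data collected from the 30 Durham residents.
miles = [6.5, 8.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 13.5, 15.0]


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (divides by n - 1)
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


def text_histogram(values, bin_width=2.0):
    """Return one text line per bin, with one '*' per observation."""
    counts = Counter(math.floor(v / bin_width) for v in values)
    lines = []
    for b in range(min(counts), max(counts) + 1):
        low = b * bin_width
        lines.append(f"{low:5.1f}-{low + bin_width:5.1f} | {'*' * counts[b]}")
    return lines


stats = summarize(miles)
for name, value in stats.items():
    print(f"{name:>6}: {value:.2f}")
print("\n".join(text_histogram(miles)))
```

A roughly symmetric, single-peaked histogram with no extreme outliers is the kind of picture that makes the normality assumption reasonable for a sample of this size.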

 

Now we can calculate our t-statistic.  As you will recall from the previous one-sample mean comparison, the test statistic equals the estimate minus the hypothesized value, all divided by the standard error of the mean (s/√n).

[Figure: formula for the one-sample t-statistic]
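Written out in symbols, the verbal description of the test statistic takes the following form. The notation is assumed here to match the text: x̄ is the sample mean, s the sample standard deviation, n the sample size, and μ0 (written μo above) the hypothesized value of 10 miles.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
```

If the null hypothesis is true and the assumptions hold, this statistic follows a t-distribution with n - 1 degrees of freedom.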

We can now plug in the numbers and calculate our t-statistic of 1.75, with 29 degrees of freedom (30-1).

 

[Figure: calculation of the t-statistic]
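The same arithmetic can be sketched in Python. The function names `t_from_summary` and `one_sample_t` are illustrative, and the `example` values are a small hypothetical sample rather than the Durham data, so the result will not match the 1.75 reported here.

```python
import math
import statistics


def t_from_summary(mean, sd, n, mu0):
    """Return (t, df) where t = (mean - mu0) / (sd / sqrt(n)) and df = n - 1."""
    if n < 2:
        raise ValueError("need at least two observations")
    if sd <= 0:
        raise ValueError("sample standard deviation must be positive")
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1


def one_sample_t(values, mu0):
    """Compute the one-sample t-statistic directly from raw observations."""
    return t_from_summary(
        statistics.mean(values),
        statistics.stdev(values),  # sample SD (divides by n - 1)
        len(values),
        mu0,
    )


# Hypothetical miles-per-week values, for illustration only.
example = [8, 12, 11, 9, 15]
t_stat, df = one_sample_t(example, mu0=10)
print(f"t = {t_stat:.3f}, df = {df}")
```

Calling `t_from_summary` with the sample mean, sample SD, and n = 30 from the summary statistics, and `mu0=10`, would reproduce the calculation shown for the Durham walkers.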

With a t-statistic of 1.75 and 29 degrees of freedom (n-1), we can calculate the p-value.  The p-value is P(t > 1.75) = 0.0903.  This is the area under the curve to the right of t = 1.75 (29 df).  So, given that our null hypothesis is true (that Durham residents walk 10 miles or less per week on average), the probability of getting the results we got, or results more extreme, is 0.09.  This provides weak, inconclusive evidence against the null hypothesis.  I would state that we have mildly suggestive, but inconclusive, evidence that Durham residents, on average, walk more than 10 miles a week.  We also need to be careful in how we generalize our results.  Our walkers were all measured during the same week in the fall, so these results probably do not generalize to all weeks of the year.

[Figure: t-distribution with 29 degrees of freedom]
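As a cross-check on the tail area, the p-value can be approximated by numerically integrating the t-distribution's density. The sketch below uses only the Python standard library; the names `t_pdf` and `t_upper_tail` are our own and are not part of the original analysis. With the rounded inputs t = 1.75 and 29 degrees of freedom, it gives an upper-tail area of roughly 0.045 and a two-tailed area of roughly 0.09. The 0.0903 quoted above therefore appears to match the two-tailed area, so it is worth confirming which of the two the software output reported before interpreting the one-sided test.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric about 0, so the area to the right of 0 is 0.5.
    return 0.5 - area_zero_to_t


p_upper = t_upper_tail(1.75, 29)
p_two_tailed = 2 * p_upper
print(f"upper-tail area: {p_upper:.4f}")
print(f"two-tailed area: {p_two_tailed:.4f}")
```

In practice a statistics library would normally be used instead of hand-rolled integration; for example, SciPy's `scipy.stats.t.sf(1.75, 29)` returns the same upper-tail area.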