This module provides an overview of Comparing Two Independent Population Means with Unknown Population Standard Deviations as part of the Collaborative Statistics collection (col10522) by Barbara Illowsky and Susan Dean.

Assumptions and conditions: two sample means test

When constructing a two-sample mean hypothesis test, the following assumptions and conditions must be met in order to use the t-distribution model.

  • Randomization Condition: The data must be sampled randomly. Is one of the good sampling methodologies discussed in the Sampling and Data chapter being used?
  • Independence Assumption: The sample values must be independent of each other. This means that the occurrence of one event has no influence on the next event. Usually, if we know that people or items were selected randomly, we can assume that the independence assumption is met.
  • 10% Condition: When the sample is drawn without replacement (usually the case), the sample size, n, should be no more than 10% of the population. This must be true for both groups.
  • Nearly Normal Condition: This must be met for both groups. To check the nearly normal condition, start by making a histogram or stemplot of the data for each group; it is a good idea to make outlier boxplots, too. If a sample is small (fewer than 15 values), the data must be normally distributed. If the sample size is moderate (between 15 and 40), a little skew in the data can be tolerated. With large samples (more than 40), we are concerned about multiple peaks (modes) and outliers in the data. When discussing the nearly normal condition, report the sample size, shape, and outliers for each group.
  • Independent Groups: The two groups you are working with must be independent of one another. Is there reason to believe that one group would influence the other? A hypothesis test comparing a group of respondents' pre-test and post-test scores would not involve independent groups. Likewise, groups made by splitting married couples into husband and wife groups would not be independent.
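
The size-based guidelines in the list above can be sketched as a small helper. This is only an illustration, not part of the original module: the function names `meets_ten_percent_condition` and `normality_guidance` are hypothetical, and the sample and population sizes in the example are made up. The code cannot check randomization or independence, which depend on how the data were collected.

```python
def meets_ten_percent_condition(n, population_size):
    """Return True if the sample is no more than 10% of the population."""
    # Integer comparison avoids floating-point rounding at the boundary.
    return 10 * n <= population_size


def normality_guidance(n):
    """Map a sample size to the nearly normal guideline described in the text."""
    if n < 15:
        return "small: data should look normally distributed"
    if n <= 40:
        return "moderate: a little skew can be tolerated"
    return "large: watch for multiple modes and outliers"


# Hypothetical example: two groups drawn from populations of 1,000 and 500.
for name, n, pop in [("group 1", 12, 1000), ("group 2", 45, 500)]:
    print(name, meets_ten_percent_condition(n, pop), normality_guidance(n))
```

Both checks have to pass for each group separately, and the histogram or boxplot inspection still has to be done by eye.
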
The test comparing two independent population means with unknown and possibly unequal population standard deviations is called the Aspin-Welch t-test. The degrees of freedom formula was developed by Aspin and Welch.
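
The degrees of freedom formula is not written out in this section. As a rough sketch, the widely used Welch-Satterthwaite form can be computed as follows. The function name `welch_degrees_of_freedom` and the example values are assumptions made for illustration, not taken from the module.

```python
def welch_degrees_of_freedom(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t-test with
    unequal variances (standard Welch-Satterthwaite form)."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical example values: sample standard deviations and sample sizes.
df = welch_degrees_of_freedom(s1=4.0, n1=10, s2=4.0, n2=10)
print(df)
```

When the two samples have the same size and the same standard deviation, the result reduces to n1 + n2 - 2. In other cases it is smaller and usually not a whole number.
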

The comparison of two population means is very common. A difference between the two samples depends on both the means and the standard deviations. Very different means can occur by chance if there is great variation among the individual samples. In order to account for the variation, we take the difference of the sample means, $\bar{X}_1 - \bar{X}_2$, and divide by the standard error (shown below) in order to standardize the difference. The result is a t-score test statistic.

Because we do not know the population standard deviations, we estimate them using the two sample standard deviations from our independent samples. For the hypothesis test, we calculate the estimated standard deviation, or standard error, of the difference in sample means, $\bar{X}_1 - \bar{X}_2$.

The standard error is:

$$\sqrt{\frac{(S_1)^2}{n_1} + \frac{(S_2)^2}{n_2}}$$
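
The standard error above, and the t-score obtained by dividing the difference in sample means by it, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names, the `hypothesized_diff` parameter (set to 0 by default, as when the null hypothesis states that the two means are equal), and the example numbers are all hypothetical.

```python
import math


def standard_error(s1, n1, s2, n2):
    """Estimated standard deviation of the difference in sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def t_score(xbar1, xbar2, s1, n1, s2, n2, hypothesized_diff=0.0):
    """Standardize the observed difference in sample means.

    hypothesized_diff is an assumed parameter for the difference in
    population means under the null hypothesis.
    """
    se = standard_error(s1, n1, s2, n2)
    return ((xbar1 - xbar2) - hypothesized_diff) / se


# Hypothetical summary statistics for two independent samples.
se = standard_error(s1=3.0, n1=9, s2=4.0, n2=16)
t = t_score(xbar1=12.0, xbar2=10.0, s1=3.0, n1=9, s2=4.0, n2=16)
print(se, t)
```

A large absolute t-score means the observed difference is large relative to the variation in the samples.
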

Source:  OpenStax, Collaborative statistics using spreadsheets. OpenStax CNX. Jan 05, 2016 Download for free at http://legacy.cnx.org/content/col11521/1.23