Tests of independence involve using a contingency table of observed (data) values. You first saw a contingency table when you studied bivariate descriptive statistics in the Bivariate Descriptive Statistics chapter.
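The test statistic for a test of independence is:
$\chi^{2}=\sum_{(i\cdot j)}\frac{(O-E)^{2}}{E}$
where:
$O$ = observed values
$E$ = expected values
$i$ = the number of rows in the table
$j$ = the number of columns in the table
There are $i\cdot j$ terms of the form $\frac{(O-E)^{2}}{E}$.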
The chi-square test of independence determines whether there is a relationship between two categorical variables. Remember that in the chapter on bivariate data we examined two-way tables (pivot tables) for a relationship by examining either the row or column percentages. We will now test this relationship between categorical variables by calculating a test statistic and determining a p-value.
The null hypothesis for a chi-square test of independence is that there is no relationship between the two categorical variables or that they are independent.
The alternative hypothesis is that there is some kind of relationship between the two categorical variables or that they are dependent.
Before we test the hypothesis we need to check the assumptions and conditions for the chi-square test of independence.
The new assumptions for this test are the counted data condition and the expected cell frequency condition. The counted data condition is checking to see if we have counts of respondents categorized on two categorical variables. The numbers in each cell of the two-way table should be whole numbers showing how many people gave that combination of responses to the two categorical questions.
The other new condition, the expected cell frequency condition, asks us to find how many people we would expect to be in each cell if the null hypothesis is true and there is no relationship between the two variables. To do this we will need to calculate the expected value for each cell in the two-way table. All of the expected values must be larger than 5.
To find the expected values we will need the row and column totals and the overall sample size. The expected value for each cell is:
$E=\frac{(\text{row total})(\text{column total})}{\text{overall sample size}}$
Once all the expected values are calculated check to make sure they are all larger than 5.
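As a rough illustration of the expected-value step, the sketch below multiplies each cell's row total by its column total and divides by the overall sample size, then checks the expected cell frequency condition. The observed counts and the function name `expected_counts` are made up for this example and do not come from the text.

```python
# Hypothetical observed counts (invented for illustration only).
# Rows are the categories of one variable, columns the categories of the other.
observed = [
    [30, 10],
    [20, 40],
]

def expected_counts(table):
    """Return the expected count for each cell if the variables are independent."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)  # overall sample size
    # Expected value = (row total * column total) / overall sample size
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)

# Expected cell frequency condition: every expected value larger than 5.
condition_met = all(e > 5 for row in expected for e in row)

print(expected)
print(condition_met)
```

For these counts the row totals are 40 and 60, the column totals are 50 and 50, and the sample size is 100, so the expected values are 20 in each cell of the first row and 30 in each cell of the second row, and the condition is met.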
Your next step before calculating the test statistic is to calculate the row or column percentages, whichever is most appropriate. We discussed this in the bivariate data chapter. Remember to determine the dominant variable (look at the research question) and then base the percentages on whether the dominant variable forms the rows or the columns.
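The percentage step can be sketched in the same way. The counts below are hypothetical, and the helper names `row_percentages` and `column_percentages` are chosen only for this example; which of the two you report depends on whether the dominant variable forms the rows or the columns.

```python
# Hypothetical observed counts (invented for illustration only).
observed = [
    [30, 10],
    [20, 40],
]

def row_percentages(table):
    """Each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

def column_percentages(table):
    """Each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100 * cell / col_totals[j] for j, cell in enumerate(row)]
        for row in table
    ]

print(row_percentages(observed))
print(column_percentages(observed))
```

Here the first row splits 75% to 25%, while the first column splits 60% to 40%, which shows why the choice of rows or columns changes the story the percentages tell.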
Like the other hypothesis tests we have looked at in this text, we will calculate a test statistic. The formula for the chi-square test for independence is:
$\chi^{2}=\sum\frac{(O-E)^{2}}{E}$
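A minimal sketch of the calculation, assuming a hypothetical 2-by-2 table: it sums (observed minus expected) squared divided by expected over every cell. The degrees of freedom, (rows - 1) times (columns - 1), and the p-value shortcut are additions not stated in the text; the shortcut using `math.erfc` is valid only when there is 1 degree of freedom, and larger tables would need a chi-square distribution function from a statistics package.

```python
import math

# Hypothetical observed counts (invented for illustration only).
observed = [
    [30, 10],
    [20, 40],
]

def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count
            total += (o - e) ** 2 / e
    return total

chi_sq = chi_square_statistic(observed)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: this p-value formula holds only for df == 1.
p_value = math.erfc(math.sqrt(chi_sq / 2)) if df == 1 else None

print(chi_sq, df, p_value)
```

With these counts each of the four cells is 10 away from its expected value, so the statistic is 100/20 + 100/20 + 100/30 + 100/30, about 16.67. A p-value this small would lead us to reject the null hypothesis of independence at the usual significance levels.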