BUS308 Week 3 Lecture 2
Examining Differences – ANOVA and Chi Square
Expected Outcomes
After reading this lecture, the student should be familiar with:
1. Conducting hypothesis tests with the ANOVA and Chi Square tests
2. How to interpret the Analysis of Variance (ANOVA) test output
3. How to determine significant differences between group means
4. The basics of the Chi Square distribution.
Overview
This week we introduced the ANOVA test for multiple mean equality and the Chi Square tests for distributions. This lecture will focus on interpreting the outcomes of both tests. The process of setting them up will be covered in Lecture 3 for this week.
ANOVA
Hypothesis Test
Week 3, question 1 asks whether the average salary per grade is equal. While this might seem like a no-brainer (we expect each grade to have a higher average salary than the one below it), we need to test all assumed relationships. This is much like our detectives saying, “We need to exclude you from the suspect pool; where were you last night?” This example will, of course, use the compa-ratio instead of the salary values you will use in the homework.
The ANOVA test is found in the Data | Analysis tab.
Step 5 in the hypothesis testing process asks us to “Perform the test.” Here is a screen shot of the ANOVA output for a test of the null hypothesis: “All grade compa-ratio means are equal.” For this question we will be using the ANOVA-Single Factor option as we are testing mean equality for a single factor, Grades. We will briefly cover the other ANOVA options in Lecture 3 for this week.
Note that the ANOVA Single Factor output includes the test name, a summary table, and an ANOVA table. The summary table gives us the count, sum, average, and variance of the compa-ratios for each analysis group (in this case, our grades). Note that we are assuming equal variances within the grades in the population for this example and for your assignment. This may not actually be true here (note the values in the Variance column), but we will ignore this for now. ANOVA is somewhat robust to violations of the variance equality assumption, meaning it may still produce acceptable results with unequal variances. There is a non-parametric alternative if the variances are too different, but we do not cover it in this course. Please note that the column and row labels are visible in this screenshot; these will be needed as references in question 2.
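If you want to reproduce the summary table outside Excel, a minimal sketch in Python might look like the following. The grade labels and compa-ratio values below are made up for illustration only; the course itself works from the Excel data file and Excel's own output.

import pandas as pd

# Hypothetical compa-ratio values by grade (illustrative only; the assignment
# uses the Excel data file and Excel's ANOVA Single Factor output).
df = pd.DataFrame({
    "grade":       ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "compa_ratio": [1.02, 0.98, 1.01, 1.05, 1.10, 1.07, 0.95, 1.00, 0.97],
})

# Reproduce the Summary table: count, sum, average, and variance per group.
# pandas' var() uses the sample variance (ddof=1), the same as Excel's VAR.S.
summary = df.groupby("grade")["compa_ratio"].agg(["count", "sum", "mean", "var"])
print(summary)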
The next table is the meat of the test. While for all practical purposes we are only interested in the highlighted p-value, knowing what the other values are is helpful. When we introduced ANOVA in Lecture 1, we discussed the between groups and within groups variation. Recall that the between groups variation looks at how the group means differ from one another rather than at the variation inside any single group. For the Between Groups row, we have a Sum of Squares (SS) value, which is a raw estimate of the variation that exists. The degrees of freedom (df) for Between Groups equals the number of groups (k) minus 1 (k - 1), which equals 5 for our 6 groups. The Mean Square variation estimate equals the SS divided by the df.
The Within Groups row focuses on the average variation inside each of our groups. Its SS gives us a raw estimate of that variation, just as the SS did for the Between Groups row. The df for Within Groups is the total count (N) minus the number of groups (N - k), or 44 for our 50 employees in the 6 groups. MSwg equals SS/df.
The F statistic is calculated by dividing MSbg by MSwg. The next column gives us our p-value, followed by the critical value of F (the F value at which the p-value would be exactly 0.05). The Total line is the sum of the SS values and the overall df, which equals the total count minus 1 (N - 1).
(As with the t and F tests, we could make our decision by comparing the calculated F value (in cell O20) with the critical value of F (in cell Q20). We reject the null when the calculated F is greater than the critical F. The critical value of F, or of any statistic in an Excel output table, is the value that produces a p-value exactly equal to our selected alpha. However, we will continue to use the p-value in our decisions.)
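To make the arithmetic behind the ANOVA table concrete, here is a minimal Python sketch that computes the SS, df, MS, F, p-value, and critical F described above. The three groups of compa-ratio values are made up purely for illustration; the course itself relies on Excel's ANOVA Single Factor output.

from scipy import stats
import numpy as np

# Illustrative compa-ratio samples for three groups (not the assignment data).
groups = [
    np.array([1.02, 0.98, 1.05, 1.01]),
    np.array([1.10, 1.07, 1.12, 1.09]),
    np.array([0.95, 0.99, 0.97, 1.00]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)        # number of groups
N = all_values.size    # total count

# Between Groups: variation of the group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
df_between = k - 1
ms_between = ss_between / df_between

# Within Groups: variation of each value around its own group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
df_within = N - k
ms_within = ss_within / df_within

f_stat = ms_between / ms_within
p_value = stats.f.sf(f_stat, df_between, df_within)   # right-tail p-value
f_crit = stats.f.ppf(0.95, df_between, df_within)     # critical F at alpha = 0.05

print(f_stat, p_value, f_crit)

# scipy's built-in one-way ANOVA gives the same F statistic and p-value:
print(stats.f_oneway(*groups))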
Now that we have our test results, we can complete step 6 of the hypothesis testing procedure.
Step 6: Conclusions and Interpretation
What is the p-value? Found in the table, it is 0.0186 (rounded).
(Side note: at times Excel will produce a p-value that looks something like 3.8E-14. This is called scientific or exponential format. It is the same as writing 3.8 × 10^-14 and equals 0.000000000000038. A simple way of knowing how many 0s go between the decimal point and the first non-zero digit is to subtract 1 from the E value; with E-14, we have 13 zeros. For example, 2.5E-3 equals 0.0025, with 3 - 1 = 2 zeros after the decimal point. At any rate, any Excel p-value shown in E-xx format will always be less than 0.05.)
Decision: Reject the null hypothesis.
Why? P-value is less than 0.05.
Conclusion: at least one mean differs across the grades.
Question 2: Group Comparisons
Now that we know at least one grade compa-ratio mean is not equal to the rest, we need to determine which mean(s) differ. We do this by creating ranges for the possible differences between the population mean values. Remember that our sample results are only a close approximation of the actual population means. We can estimate the range of values within which a population mean actually lies (recall the discussion of the sampling distribution of the mean from last week). So, using the variation that exists in our groups, we estimate the range of differences between means (the possible outcomes of subtracting one mean from another).
The following screen shot shows a completed comparison table for the grade related compa-ratio means.
Let’s look at what this table tells us before focusing on how to develop the values (covered in Lecture 3 for this week). Looking at the Groups Compared column, we see the comparison groups listed: A-B for grades A and B, A-C for grades A and C, and so on. The next column is the difference between the average compa-ratio values for each pair of grades. The T value column holds the value for a 95% two-tailed test with the degrees of freedom we have (Lecture 3 discusses how to identify the correct value). Note that it is the same value for all of our comparison groups; the explanation comes in Lecture 3.
The next column, labeled the +/- term, is the margin of error that exists for the mean difference being examined. It reflects the sampling error that exists within each sample mean. These are all of the values we need to create a range that represents, with 95% confidence, what the actual population mean difference is likely to be. We subtract this value from the mean difference (in column B) to get our low-end estimate (Low column values), and we add it to the mean difference to get our high-end estimate (High column values).
Now we need to decide which of these ranges indicates a significantly different pair of means (within the population) and which indicate likely equal population means (non-significant differences). This is fairly simple: if the range contains 0 (that is, one endpoint is negative and the other is positive), then the difference is not significant (since a mean difference of 0 would never be significant). Notice in the table that the A-B, A-C, and A-D ranges all contain 0, so those results are not significantly different. The A-E and A-F comparisons, however, have positive values at both ends and do not contain 0; these means differ in the population.
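To illustrate how one of these ranges can be built, here is a minimal Python sketch. It assumes the margin of error is formed from the pooled within-groups variance (MSwg) and its df from the ANOVA table; the exact setup used in the assignment is covered in Lecture 3 and may differ in detail, and every number below is hypothetical.

from scipy import stats
import numpy as np

# Hypothetical values (not the assignment data): two group means, group sizes,
# and the within-groups mean square (MSwg) and its df from the ANOVA table.
mean_a, mean_b = 1.05, 0.98
n_a, n_b = 8, 9
ms_within, df_within = 0.004, 44

diff = mean_a - mean_b
t_crit = stats.t.ppf(0.975, df_within)                  # 95% two-tailed t value
margin = t_crit * np.sqrt(ms_within * (1/n_a + 1/n_b))  # the "+/- term"

low, high = diff - margin, diff + margin
significant = not (low <= 0 <= high)   # the pair differs only if the range excludes 0
print(low, high, significant)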
We now know how to interpret an ANOVA table and an accompanying table of differences for significant mean differences between and among groups.
Chi Square Tests
With the Chi Square tests, we move from looking at population parameters, such as means and standard deviations, to looking at patterns or distributions. The shape or distribution of a variable is often an important way of identifying differences that could matter. For example, we already suspect that males and females are not distributed across the grades in a similar manner. We will confirm or refute this idea in the weekly assignment.
Generally, when looking at distributions and patterns, we can create groups within our variable of interest. For example, the Grades variable is already divided into 6 groups, making it easy to count how many employees fall in each group. But what about a continuous variable such as compa-ratio, where no such clear division into separate groups exists? This is not a problem, as we can always divide any range of values into groups such as quartiles (4 groups) or any other number of distinct ranges. Most variables can be subdivided this way.
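As an illustration of dividing a continuous variable into groups, here is a minimal Python sketch that splits a handful of made-up compa-ratio values into quartiles; the assignment itself does this kind of grouping in Excel.

import pandas as pd

# Hypothetical compa-ratio values (illustrative only).
compa = pd.Series([0.92, 0.98, 1.00, 1.03, 1.05, 1.08, 1.12, 1.20])

# Divide the continuous variable into quartiles (4 equal-count groups),
# so counts per group can be compared just like the grade counts.
quartile = pd.qcut(compa, q=4, labels=["Q1", "Q2", "Q3", "Q4"])
print(quartile.value_counts().sort_index())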
The Chi Square test is actually a group of comparisons that depend upon the size of the table the data is displayed in. We will examine the different tables and tests in Lecture 3; in this lecture we want to focus on how to interpret the outcome of a Chi Square test, as the interpretation is the same regardless of the table size. The details of setting up the data will be covered in Lecture 3.
Example – Question 3
The third question for this week asks about employee grade distribution. We are concerned here about the possible impact of an uneven distribution of males and females across the grades and how this might affect average salaries. While our concern is about an uneven distribution, our null hypothesis is always about equality, so the null responds to a question such as: are males and females distributed across the grades in a similar pattern? That is, are either males or females more likely to be in some grades rather than others?
A similar question can be asked about degrees: are graduate and undergraduate degrees distributed across the grades in a similar pattern? If not, this might be part of the cause of unequal salary averages.
The step 5 output for a Chi Square test is very simple: it is the p-value, the probability of getting a chi square value as large as or larger than the one we see if the null hypothesis is true. That’s it – the data is set up, the Chi Square test function is selected from the Fx statistical list, and we have the p-value. There is no output table to examine.
So, for an examination of whether degrees are distributed across the grades in a similar manner, we would have an actual distribution table (counts of what exists) that looks like this:
          A    B    C    D    E    F    Total
UnderG    7    5    3    2    5    3     25
Grad      8    2    2    3    7    3     25
Total    15    7    5    5   12    6     50
This table would be compared to an expected table where we show what we expect if the null hypothesis was correct. (Setting up this table is discussed in Lecture 3.) Then we just get our answer.
So, steps 5 and 6 would look like:
Step 5: Conduct the test. 0.85 (the Chi Square p-value from the CHISQ.TEST function).
Step 6: Conclusion and Interpretation
What is the p-value? 0.85
Decision on rejecting the null: Do Not Reject the null hypothesis.
Why? P-value is > 0.05.
Conclusion on impact of degrees? Degrees appear to be distributed across the grades in a similar pattern for graduate and undergraduate employees and do not seem to be related to grade. This suggests they are not an important factor in explaining differing salary averages among the grades.
Of course, getting the Chi Square result depends a bit more on the data setup than the other tests do, but the overall interpretation is quite similar: does the p-value indicate we should reject or not reject the null hypothesis claim as a description of the population?
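For the curious, the degree-by-grade result can also be reproduced outside Excel. The sketch below uses Python's scipy, whose chi2_contingency function builds the expected table from the row and column totals of the observed counts shown earlier; this mirrors the general approach, though the exact Excel setup appears in Lecture 3.

import numpy as np
from scipy.stats import chi2_contingency

# Observed counts of degrees by grade, taken from the table above.
observed = np.array([
    [7, 5, 3, 2, 5, 3],   # UnderG
    [8, 2, 2, 3, 7, 3],   # Grad
])

# chi2_contingency builds the expected table from the row and column totals and
# returns the chi-square statistic, the p-value, the df, and the expected counts.
chi2, p_value, df, expected = chi2_contingency(observed)
print(round(p_value, 2))   # comes out close to the 0.85 reported in Step 5 above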
Summary
Both the ANOVA and Chi Square tests follow the same basic logic developed last week with the F and t tests. The analysis starts with the first four (4) hypothesis testing steps, which set up the purpose and decision-making rules for the analysis.
Running the tests (step 5) will be covered in the third lecture for this week.
Step 6 (Interpretation) is also done in the same fashion as last week. Look for the p-value for each test and compare it to the alpha criteria. If the p-value is less than alpha, we reject the null hypothesis.
When the null is rejected in the ANOVA test, we then create difference intervals to determine which pair of means differs. If any of these intervals contains the value 0 (meaning one end is a negative value and the other is a positive value), we can say that those means are not significantly different within the population.
Two versions of the Chi Square test were presented. One compares a single group of counts to an expected distribution that we provide. The other compares two or more groups to an expected distribution generated from the existing distributions. How these “expected” tables are generated will be discussed in Lecture 3 for this week.
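As a concrete contrast between the two versions, here is a minimal Python sketch. The first test uses the grade totals from the table earlier compared against a uniform expectation chosen purely for illustration; the second builds its expected table from the observed row and column totals, as described above.

import numpy as np
from scipy.stats import chisquare, chi2_contingency

# Version 1: a single group of counts compared to an expected distribution we
# supply (grade totals from the earlier table; the uniform expectation is only
# an illustration, not the expected table used in the assignment).
observed_counts = np.array([15, 7, 5, 5, 12, 6])
expected_counts = np.full(6, 50 / 6)
print(chisquare(observed_counts, f_exp=expected_counts))

# Version 2: two (or more) groups compared to an expected table generated
# from the row and column totals of the observed data.
observed_table = np.array([[7, 5, 3, 2, 5, 3],
                           [8, 2, 2, 3, 7, 3]])
print(chi2_contingency(observed_table))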
Please ask your instructor if you have any questions about this material.
When you have finished with this lecture, please respond to Discussion thread 2 for this week with your initial response and responses to others over a couple of days.