Goodness of Fit (GOF)#
The \(\chi^2\) goodness of fit (GOF) test lets us compare observed counts to a given probability model. Eye color makes a convenient example.
Example: Eye Color#
A recent release from the American Academy of Ophthalmology gives the proportion of each eye color in the population of the United States.
Eye Color | Blue | Green | Hazel | Light Brown | Dark Brown |
---|---|---|---|---|---|
Proportion | 32% | 15% | 12% | 16% | 25% |
At UNG, recent class surveys produced the following sample of student eye colors, stored in the obs vector below in the same order as the table:
obs <- c(68,41,30,51,60)
prob <- c(0.32, 0.15, 0.12, 0.16, 0.25)
chisq.test(obs, p = prob)
Chi-squared test for given probabilities
data: obs
X-squared = 5.2517, df = 4, p-value = 0.2624
Reporting Out#
Given that \(p = 0.2624 > 0.05 = \alpha\), we fail to reject the null hypothesis that eye color among UNG students follows the nationwide proportions. We do not have sufficient evidence that the observed data on eye color from UNG students departs from the nationwide distribution.
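The decision above can also be read off the test object rather than the printed output. The following is a minimal sketch, assuming the same obs and prob vectors as above; the names res and alpha are introduced here for illustration only.

```r
# Observed counts and model probabilities from the example above
obs  <- c(68, 41, 30, 51, 60)
prob <- c(0.32, 0.15, 0.12, 0.16, 0.25)
alpha <- 0.05                      # significance level (illustrative name)

# Store the test result so its components can be inspected
res <- chisq.test(obs, p = prob)

res$statistic   # the chi-squared test statistic
res$parameter   # degrees of freedom
res$p.value     # p-value
res$expected    # expected counts under the model

# TRUE would mean "reject the null"; here the comparison is FALSE
res$p.value < alpha
```

Storing the result this way keeps the comparison with \(\alpha\) explicit instead of relying on reading the p-value from the printout.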
Example: Using Tables and Formulas#
This example repeats the test by hand. We already have the observed data vector above, so we need to calculate the expected vector, which is based on the model probabilities.
Observed Data Vector and Expected Vector#
Starting with the observed data vector (shown above), we need the sample size, which we find by summing the vector obs:
sum(obs)
We compute the number of students expected to have each eye color by multiplying each probability from the model by the total sample size, \(n = 250\).
Blue: \(32\%\) of \(250 = 80\)
Green: \(15\%\) of \(250 = 37.5\)
Hazel: \(12\%\) of \(250 = 30\)
Light Brown: \(16\%\) of \(250 = 40\)
Dark Brown: \(25\%\) of \(250 = 62.5\)
We can calculate these values in R as shown below:
exp <- prob * 250
exp
- 80
- 37.5
- 30
- 40
- 62.5
Gathering it all together, we have the following matrix:
tab <- matrix(c(obs, exp), nrow = 2, byrow = TRUE)
rownames(tab) <- c('Observed', 'Expected')
tab
Observed | 68 | 41.0 | 30 | 51 | 60.0 |
---|---|---|---|---|---|
Expected | 80 | 37.5 | 30 | 40 | 62.5 |
Calculating the Test Statistic \(\chi^2\)#
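The formula sheet gives the test statistic, where \(O_i\) is the observed count and \(E_i\) is the expected count for category \(i\):
\[\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}\]
We enter the data into the formula:
\[\chi^2 = \frac{(68-80)^2}{80} + \frac{(41-37.5)^2}{37.5} + \frac{(30-30)^2}{30} + \frac{(51-40)^2}{40} + \frac{(60-62.5)^2}{62.5} = 1.8 + 0.327 + 0 + 3.025 + 0.1 \approx 5.25\]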
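The same hand calculation can be sketched in R as a check on the arithmetic. This assumes the obs and prob vectors from the first example; the names expected, contrib, and chi_sq are illustrative, and expected is used instead of exp only to avoid masking R's built-in exp() function.

```r
# Observed counts and model probabilities from the example above
obs  <- c(68, 41, 30, 51, 60)
prob <- c(0.32, 0.15, 0.12, 0.16, 0.25)

# Expected counts: model probability times the sample size
expected <- prob * sum(obs)

# Contribution of each eye color to the statistic: (O - E)^2 / E
contrib <- (obs - expected)^2 / expected
round(contrib, 4)   # 1.8, 0.3267, 0, 3.025, 0.1

# The test statistic is the sum of the contributions
chi_sq <- sum(contrib)
chi_sq              # about 5.2517, matching chisq.test above
```

Looking at contrib also shows which categories drive the statistic: here Light Brown and Blue contribute the most, while Hazel contributes nothing because its observed and expected counts agree.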
Finding \({\chi^2}^*\) in the Table#
From the class \(\chi^2\) table using \(df = \text{number of probabilities} - 1 = 4\) and \(\alpha = 0.05\), we find that \({\chi^2}^* = 9.488\).
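As a cross-check on the printed table, the critical value can also be computed in R with qchisq, and the p-value for the hand-computed statistic with pchisq. This is a sketch under the same assumptions as above (\(\alpha = 0.05\), five categories); the variable names are illustrative.

```r
# Model probabilities from the example above
prob  <- c(0.32, 0.15, 0.12, 0.16, 0.25)
alpha <- 0.05

# Degrees of freedom: number of categories minus one
df <- length(prob) - 1

# Critical value: the (1 - alpha) quantile of the chi-squared distribution
crit <- qchisq(1 - alpha, df = df)
round(crit, 3)      # 9.488, the table value

# Upper-tail p-value for the statistic reported by chisq.test above
p_val <- pchisq(5.2517, df = df, lower.tail = FALSE)
round(p_val, 4)
```

Both routes lead to the same decision: the statistic falls below the critical value exactly when the p-value exceeds \(\alpha\).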
Reporting Out#
Given that \(\chi^2 = 5.25 < 9.488 = {\chi^2}^*\), we fail to reject the null. We have no evidence that the observed data on eye color from 250 UNG students departs from the nationwide probability.