Goodness of Fit (GOF)

The \(\chi^2\) test can also be used to test a probability model.

The \(\chi^2\) goodness-of-fit (GOF) test compares observed counts with the counts expected under a given probability model. Eye color makes a good example.

Example: Eye Color#

A recent release from the American Academy of Ophthalmology gives the proportion of each eye color in the population of the United States.

	Blue	Green	Hazel	Light Brown	Dark Brown
Proportion	32%	15%	12%	16%	25%

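At UNG, recent class surveys produced the following sample distribution of eye colors among students at the university. The counts are stored in the obs vector below, and the model proportions are stored in prob, both in the same order as the table above: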

obs <- c(68,41,30,51,60)
prob <- c(0.32, 0.15, 0.12, 0.16, 0.25)

chisq.test(obs, p = prob)

	Chi-squared test for given probabilities

data:  obs
X-squared = 5.2517, df = 4, p-value = 0.2624

Reporting Out#

Given that \(p = 0.2624 > 0.05 = \alpha\), we fail to reject the null hypothesis that eye colors among UNG students follow the nationwide proportions. We do not have sufficient evidence that the observed eye color data from UNG students departs from the nationwide distribution.
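The decision above can also be read directly from the object that chisq.test returns. The following is a minimal sketch; the names res, stat, df, p_value, alpha, and reject_null are chosen here for illustration and do not appear in the original example.

```r
# Observed counts and model proportions from the eye color example
obs  <- c(68, 41, 30, 51, 60)
prob <- c(0.32, 0.15, 0.12, 0.16, 0.25)

# Store the test result so its components can be inspected
res <- chisq.test(obs, p = prob)

stat    <- unname(res$statistic)  # test statistic (X-squared)
df      <- unname(res$parameter)  # degrees of freedom
p_value <- res$p.value            # p-value of the test

# Compare the p-value to the significance level (alpha = 0.05 assumed, as in the text)
alpha <- 0.05
reject_null <- p_value < alpha

res$expected  # expected counts under the model
reject_null   # FALSE here, so we fail to reject the null
```

Storing the result in a variable makes it easy to reuse the statistic and p-value in a report without retyping them.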

Example: Using Tables and Formulas#

We have the observed data vector above. We need to calculate the expected vector which is based on probabilities.

Observed Data Vector and Expected Vector#

Starting with the observed data vector (shown above), we need to know the sample size which we can find with a summation of the vector obs:

sum(obs)

250

We compute the predicted value for the number of students expected to have each eye color by multiplying the probabilities from the model by the total sample size.

Blue: \(32\%\) of \(250 = 80\)
Green: \(15\%\) of \(250 = 37.5\)
Hazel: \(12\%\) of \(250 = 30\)
Light Brown: \(16\%\) of \(250 = 40\)
Dark Brown: \(25\%\) of \(250 = 62.5\)

We can calculate these values in R as shown below:

exp <- prob * 250
exp

80
37.5
30
40
62.5

Gathering it all together, we have the following matrix:

tab <- matrix(c(obs, exp), nrow = 2, byrow = TRUE)
rownames(tab) <- c('Observed', 'Expected')
tab

Observed	68	41.0	30	51	60.0
Expected	80	37.5	30	40	62.5

Calculating the Test Statistic \(\chi^2\)#

Referring to the formula sheet provides the following:

\[\chi^2 = \sum \frac{(O-E)^2}{E}\]

We enter the data into the formula:

\[\begin{split}\begin{align}\chi^2 &= \frac{(68-80)^2}{80} + \frac{(41-37.5)^2}{37.5} + \frac{(30-30)^2}{30} + \frac{(51-40)^2}{40} + \frac{(60-62.5)^2}{62.5}\\&= \frac{144}{80} + \frac{12.25}{37.5} + 0 + \frac{121}{40} + \frac{6.25}{62.5}\\&\approx 1.80 + 0.33 + 0.00 + 3.03 + 0.10\\&\approx 5.25\end{align}\end{split}\]
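The same hand calculation can be checked in R. This is a sketch that rebuilds the vectors so it runs on its own; the names expected, contrib, and chi_sq are chosen for illustration (expected is used instead of exp so that R's built-in exp() function is not masked).

```r
# Observed counts and model proportions from the eye color example
obs  <- c(68, 41, 30, 51, 60)
prob <- c(0.32, 0.15, 0.12, 0.16, 0.25)

# Expected counts: model proportion times sample size
expected <- prob * sum(obs)

# Contribution of each category: (O - E)^2 / E
contrib <- (obs - expected)^2 / expected

# The test statistic is the sum of the contributions
chi_sq <- sum(contrib)
round(chi_sq, 2)  # 5.25
```

Inspecting contrib shows which categories drive the statistic; here Light Brown contributes the most.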

Finding \({\chi^2}^*\) in the Table#

From the class \(\chi^2\) table using \(df = \text{number of probabilities} - 1 = 4\) and \(\alpha = 0.05\), we find that:

\[{\chi^2}^* = 9.488\]
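If a printed table is not at hand, the critical value can be obtained from R's qchisq quantile function. A minimal sketch, with alpha, df, and crit as illustrative names:

```r
# Significance level and degrees of freedom (5 eye color categories - 1)
alpha <- 0.05
df    <- 5 - 1

# Critical value: the point with area alpha to its right under the chi-squared curve
crit <- qchisq(1 - alpha, df = df)
round(crit, 3)  # 9.488
```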

Reporting Out#

Given that \(\chi^2 = 5.25 < 9.488 = {\chi^2}^*\), we fail to reject the null hypothesis that eye colors among UNG students follow the nationwide proportions. We do not have sufficient evidence that the observed eye color data from the 250 UNG students departs from the nationwide distribution.
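The critical-value comparison and the p-value comparison always lead to the same decision. As a rough check, the sketch below converts the test statistic into a p-value with pchisq and compares the two decisions; the variable names are illustrative.

```r
# Test statistic and degrees of freedom from the example
chi_sq <- 5.2517
df     <- 4
alpha  <- 0.05

# p-value: area to the right of the statistic under the chi-squared curve
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Critical value at the same significance level
crit <- qchisq(1 - alpha, df = df)

# Both comparisons give the same decision (neither rejects the null here)
reject_by_crit <- chi_sq > crit
reject_by_p    <- p_value < alpha
identical(reject_by_crit, reject_by_p)
```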
