29. Mixed Hypothesis Testing Practice¶
from datascience import *
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
from scipy import stats
An undergraduate statistics research project several years ago studied humor styles and personality. The data set from that study is called personality.csv
. We will pull different subsets of that data frame for the work below. Here’s a description of the different variables you will see.
- Sex
M/F response to question about biological sex
- G21
Y/N response to “are you 21 years old or older?”
- Greek
Y/N response to “are you involved in a social Greek fraternity or sorority?”
- AccDate
Y/N response to question: “At a time in your life when you are not involved with anyone, someone asks you out. This person has a great personality, but you do not find them physically attractive. Do you accept the date?”
- SitClass
Front/middle/back response to “where do you prefer to sit in class?”
- Friends
Same/opposite/either response to “which sex do you find it easiest to make friends with.”
- Stress1, Stress2
Pre-post measure of stress in the 2nd week (Stress1) and 7th week (Stress2) of the semester.
- TxRel
Toxic relationships beliefs, higher scores indicate more toxicity.
- Opt
Optimism, higher scores indicate more optimism.
- SE
Self-esteem, higher score indicate higher levels of self-esteem.
- Neuro
Neuroticism, higher scores indicate higher levels of neuroticism
- Perf
Perfectionism, higher scores indicate higher levels of perfectionism.
- Narc
Narcissism, higher scores indicate higher levels of narcissism.
You will likely recognize several data sets that we used in class examples and labs, too.
Data for examples¶
The large table personality.csv
is loaded below. We can use where
and select
to find appropriate sub-tables for analysis.
personality = Table.read_table('http://faculty.ung.edu/rsinn/personality.csv')
We should be ready and able to perform the following hypothesis tests and related statistical investigations:
Exploratory data analysis of single numeric and/or category variables
A/B Tests
Bootstrap confiedence interval estimates (typically of the mean)
Exploratory data analysis of two numeric variables (correlation)
Test for significant correlation