Statistical Formulas#
In 9th grade algebra, we learn about equations such as \(y = mx + b\).
We find that:
\(x\) is the independent variable, and
\(y\) is the dependent variable.
In statistics, we have similar equations that we use to indicate what operations are being done, such as \(\textbf{y} \sim \textbf{x}\), where \(y\) is the dependent variable. The meanings are quite similar. However, the operators we use in statistical formulas are important and somewhat different from those used in algebra.
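As a small illustration (the variables y and x here are hypothetical placeholders for any data we might have), a formula such as \(\textbf{y} \sim \textbf{x}\) is itself an object in R that the modeling functions in the sections below accept:

```r
# A formula is a first-class object in R; y and x are just symbolic names here
f <- y ~ x
class(f)      # "formula"
all.vars(f)   # "y" "x"
```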
ANOVA and \(t\)-Tests#
Suppose that we have a numeric variable \(\textbf{y}\) and a grouping variable \(\textbf{A}\). The statistical formula indicating that we should run an ANOVA or a \(t\)-test, as appropriate, is the following: \(\textbf{y} \sim \textbf{A}\). If the grouping variable \(A\) has exactly two groups, R will conduct a \(t\)-test. If \(A\) has three or more groups, an ANOVA is launched, which, admittedly, takes a couple more steps to complete.
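As a sketch (the data, group labels, and object names below are made up for illustration), the same \(\textbf{y} \sim \textbf{A}\) formula can be handed to t.test() when \(A\) has two levels and to aov() when it has three or more:

```r
set.seed(42)

# Two groups: the y ~ A formula works directly with t.test()
d2 <- data.frame(y = rnorm(40), A = factor(rep(c("ctrl", "trt"), each = 20)))
t.test(y ~ A, data = d2)

# Three or more groups: the same formula goes to aov(), summarized as an ANOVA table
d3 <- data.frame(y = rnorm(60), A = factor(rep(c("a", "b", "c"), each = 20)))
fit <- aov(y ~ A, data = d3)
summary(fit)
```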
Linear Regression#
If we have two numeric variables \(x\) and \(y\), the formula \(y \sim x\) indicates simple bivariate linear regression.
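A brief sketch with simulated data (the numbers here are arbitrary):

```r
set.seed(1)
x <- rnorm(50)
y <- 3 + 2 * x + rnorm(50)   # true intercept 3, true slope 2, plus noise

fit <- lm(y ~ x)   # the y ~ x formula requests simple linear regression
summary(fit)
```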
Forcing an Intercept#
We can specify a \(y\)-intercept of 2 in a linear model as follows:
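One common approach, sketched here with simulated data (the offset trick is one of several ways to fix an intercept), is to drop the estimated intercept and supply the known value as an offset:

```r
set.seed(1)
x <- rnorm(50)
y <- 2 + 1.5 * x + rnorm(50)

# Force the intercept to equal 2: remove the estimated intercept (+ 0) and
# add the known value back in through the offset argument
fit <- lm(y ~ x + 0, offset = rep(2, length(y)))
coef(fit)   # only a slope is estimated; the intercept is held at 2
```

Equivalently, one could fit lm(I(y - 2) ~ x + 0) and interpret the model as having its intercept fixed at 2.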
Polynomial Regression#
The following code will produce a quadratic regression:
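A sketch of such a chunk, using simulated data and R's poly() helper (which, as noted below, yields orthogonal polynomials by default):

```r
set.seed(1)
x <- runif(50, 0, 10)
y <- 1 + 0.5 * x - 0.3 * x^2 + rnorm(50)

fit_quad <- lm(y ~ poly(x, 2))   # degree-2 (quadratic) fit
summary(fit_quad)
```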
or cubic regression:
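Again as a sketch, reusing the simulated \(x\) and \(y\) from above:

```r
fit_cube <- lm(y ~ poly(x, 3))   # degree-3 (cubic) fit
summary(fit_cube)
```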
Note that these formulas will be interpreted by R as a request for a model using orthogonal polynomials. We can specify a model that uses the specific (and traditional) powers as follows:
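Two equivalent ways to request the raw (traditional) powers of \(x\), again sketched with the simulated data from above:

```r
fit_raw1 <- lm(y ~ x + I(x^2))              # protect the power with I()
fit_raw2 <- lm(y ~ poly(x, 2, raw = TRUE))  # or ask poly() for raw powers
coef(fit_raw1)
coef(fit_raw2)   # same fitted model, different coefficient names
```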