Helpful Code

Several R tasks come up again and again in our work without a detailed explanation each time they appear. This page therefore serves as a reference for the most common ones.

Extracting Columns from Data Frames as Vectors

Generally, a column in a data frame contains the values for a specific variable. Thus, we often wish to extract a column from a data frame as a vector of values so that we can work with it.

pers <- read.csv('https://faculty.ung.edu/rsinn/data/personality.csv')

Option 1: $

To extract the perfectionism scores column of data using the dollar sign method, we proceed as follows:

perfect <- pers$Perf
head(perfect, 5)
  1. 105
  2. 105
  3. 73
  4. 90
  5. 95

Option 2: [Row, Column] Format

To extract the perfectionism scores column of data using the rows and columns of the data frame, we leave the row indicator empty and specify a column as shown:

perfect2 <- pers[ , 'Perf']
head(perfect2, 5)
  1. 105
  2. 105
  3. 73
  4. 90
  5. 95

The column may be described either by number (shown below) or by name (as shown above). The perfectionism scores are stored in column #27.

perfect3 <- pers[ , 27]
head(perfect3, 5)
  1. 105
  2. 105
  3. 73
  4. 90
  5. 95

All methods shown work properly and, as one can see, display identical results.
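
As a quick check, the three approaches can be compared directly with identical() rather than by eye. The sketch below uses a small made-up data frame (toy, with hypothetical values) instead of the personality data so that it runs without a network connection. It also uses which() with names() as one possible way to look up a column's position instead of counting columns by hand.

```r
# Hypothetical toy data standing in for the personality data frame
toy <- data.frame(
  Narc = c(11, 10, 9, 9, 9),
  Perf = c(105, 105, 73, 90, 95)
)

by_dollar <- toy$Perf          # Option 1: dollar sign
by_name   <- toy[ , 'Perf']    # Option 2: [row, column] with a column name

# Look up the column's position instead of counting by hand
perf_col  <- which(names(toy) == 'Perf')
by_number <- toy[ , perf_col]  # Option 2: [row, column] with a column number

# TRUE only when all three vectors match exactly
all_same <- identical(by_dollar, by_name) && identical(by_name, by_number)
print(all_same)
```

With the real data, replacing toy with pers should work the same way, assuming the column is named Perf as above.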

Subsetting a Data Frame

Suppose we wish to compare the biological sexes on the narcissism variable. We then need a subset of the data for each sex, male and female. Working with the females first, we proceed as follows:

females <- subset(pers, Sex == 'F')
head(females, 5)
   Age  Yr  Sex  G21  Corps  Res  Greek  VarsAth  Honor   GPA  ...  Perf  OCD  Play  Extro  Narc  HSAF  HSSE  HSAG  HSSD  PHS
2   20   3    F    N      N    2      Y        N      Y  3.95  ...   105    3   172     16    11    46    52    26    33   SE
4   27   3    F    Y      N    3      N        N      N  2.84  ...    90    9   160     16    10    51    51    23    19   SE
6   22   3    F    Y      N    2      Y        N      N  2.63  ...   114   20   133     10     9    40    27    31    28   AG
8   20   3    F    N      N    1      Y        N      N  3.30  ...   142   17   168     16     9    55    45    24    29   AF
9   22   2    F    Y      N    1      N        N      N  3.02  ...   119   16   141     10     9    52    47    32    26   SE
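
To finish the comparison described above, the same pattern can be repeated for the other group, and the narcissism column can then be extracted from each subset. The following is a minimal sketch on a made-up data frame. The column names Sex and Narc match the personality data, but the values and the 'M' code for males are assumptions made for illustration.

```r
# Hypothetical toy data; the 'M' code for males is an assumption
toy <- data.frame(
  Sex  = c('F', 'M', 'F', 'M', 'F', 'M'),
  Narc = c(11, 4, 10, 6, 9, 8)
)

# One subset per group
females <- subset(toy, Sex == 'F')
males   <- subset(toy, Sex == 'M')

# Extract the narcissism scores from each subset as vectors
narc_f <- females$Narc
narc_m <- males$Narc

# Compare the group means
mean_f <- mean(narc_f)   # (11 + 10 + 9) / 3
mean_m <- mean(narc_m)   # (4 + 6 + 8) / 3
print(c(female = mean_f, male = mean_m))
```

Note that subset() keeps the original row labels, which is why the row numbers in a subset are generally not consecutive.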

Grid of Graphics

We often wish to show 2 or more graphical displays for a specific data set while minimizing the space required to do so. We will use two functions to assist us:

  1. layout()

  2. matrix()

Warning

We use the function lcm() to specify the height of the graphical display in centimeters. Values between 5 and 12 generally work well, and some trial and error is typically required.

For an example, let’s display a histogram and a QQ plot for the narcissism variable of the personality data frame.

The QQ plot is made up of 2 different graphical pieces: the qqnorm plot and the qqline superimposed on top. Since we wish to display these 2 elements together, we surround them with { } and assign the result to plt. Base R draws both pieces on the current plot as the calls run, so plt holds only the value returned by the last call, not the plot itself.

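# Extract the narcissism scores as a vector
data <- pers[ , 'Narc']

# One row, two columns of plots; heights = lcm(8) makes the row 8 cm tall
# (the second positional argument of layout() is widths, so heights is named)
layout(matrix(c(1,2), ncol = 2), heights = lcm(8))
hist(data)
plt <- { qqnorm(data, main = 'QQ Plot: Narcissism') ; qqline(data) }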
[Figure: histogram of the narcissism scores (left panel) and the plot titled 'QQ Plot: Narcissism' with its reference line (right panel)]
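
The same idea extends to larger grids. The sketch below is an illustration under stated assumptions rather than part of the original example: it uses simulated values (scores) in place of the narcissism scores and arranges three plots in a 2-by-2 layout where the first plot spans the top row. The row heights are passed through the heights argument of layout(), and layout(1) restores a single panel afterwards.

```r
# Simulated scores standing in for a real variable (an assumption for illustration)
set.seed(1)
scores <- rnorm(100, mean = 10, sd = 3)

# Plot 1 spans the top row; plots 2 and 3 share the bottom row
grid <- matrix(c(1, 1,
                 2, 3), nrow = 2, byrow = TRUE)

# Each row is 6 cm tall; layout() returns the number of figures
n_panels <- layout(grid, heights = lcm(c(6, 6)))

hist(scores, main = 'Histogram')
boxplot(scores, horizontal = TRUE, main = 'Boxplot')
{ qqnorm(scores, main = 'QQ Plot') ; qqline(scores) }

# Restore the default single-panel layout
layout(1)
```

Repeating a number in the matrix is what lets one plot occupy several cells. If the plots look cramped or R reports that the figure margins are too large, adjusting the lcm() values is usually the first thing to try.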