Helpful Code#

The very basics of R code are demonstrated below:

  1. Importing data from a URL

  2. Calculating the Standard Descriptives

  3. Calculating the 5-Number Summary

  4. Using the cat() function to print text and code output together with some formatting options.

Importing Data from a URL#

Many data sets are available on the internet in CSV format. The read.csv() function can import them directly:

  1. A URL is its input.

  2. From the URL, R downloads the CSV file.

  3. From the CSV, R imports the file as a data frame.

A typical example is shown below:

# Download the CSV from the URL and import it as a data frame
pers <- read.csv('https://faculty.ung.edu/rsinn/data/personality.csv')
# Display the first 3 rows
head(pers, 3)
Age Yr Sex G21 Corps Res Greek VarsAth Honor  GPA ... Perf OCD Play Extro Narc HSAF HSSE HSAG HSSD PHS
 21  2   M   Y     Y   1     N       N     N 3.23 ...  105  10  142     8   11   41   40   26   27  SE
 20  3   F   N     N   2     Y       N     Y 3.95 ...  105   3  172    16   11   46   52   26   33  SE
 22  3   M   Y     N   2     N       N     N 3.06 ...   73   1  134    15   11   48   42   44   29  AG
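
Besides a URL, read.csv() also accepts a local file path or inline text, which makes it easy to see what the import step produces without a network connection. The sketch below uses a tiny made-up CSV (the values are illustrative and not taken from personality.csv) to show that the result is a data frame with one column per CSV field.

```r
# Hypothetical three-row CSV supplied as inline text (not the real data set)
csv_text <- "Age,Yr,Sex
21,2,M
20,3,F
22,3,M"

# The text argument lets read.csv() parse a string instead of a file or URL
demo <- read.csv(text = csv_text)

class(demo)   # "data.frame"
dim(demo)     # number of rows and columns: 3 3
str(demo)     # column names and types
```

The same checks (class(), dim(), str()) can be run on pers after importing it from the URL.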

To work with the examples below, let’s extract the Age column as a single numeric vector.

age <- pers$Age
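
The $ operator pulls one column out of a data frame by name and returns it as a plain vector. A minimal sketch, using a small made-up data frame (demo is a hypothetical stand-in for pers):

```r
# Hypothetical data frame standing in for pers
demo <- data.frame(Age = c(21, 20, 22), Sex = c("M", "F", "M"))

age_demo <- demo$Age        # extract the column by name with $
same     <- demo[["Age"]]   # equivalent extraction with [[ ]]

is.numeric(age_demo)        # TRUE: a numeric vector, not a data frame
identical(age_demo, same)   # TRUE
```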

Standard Descriptives#

The three most valuable statistics for nearly any data set are the mean, standard deviation and sample size. The functions we need are intuitively named:

  1. mean()

  2. sd()

  3. length()

m <- mean(age)    # sample mean
s <- sd(age)      # sample standard deviation
n <- length(age)  # sample size
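
As a quick check on what these functions compute, the sketch below uses a small made-up vector (not the survey data). It relies on the fact that sd() returns the sample standard deviation, which divides by n - 1, and it shows the na.rm argument that mean() and sd() accept when values are missing.

```r
# Hypothetical ages for illustration only
x <- c(18, 19, 20, 21, 22)

m <- mean(x)     # 20
s <- sd(x)       # sample standard deviation (divides by n - 1)
n <- length(x)   # 5

# sd() agrees with the sample formula computed by hand
s_manual <- sqrt(sum((x - m)^2) / (n - 1))
all.equal(s, s_manual)    # TRUE

# With a missing value, mean() returns NA unless na.rm = TRUE
y <- c(x, NA)
mean(y)                   # NA
mean(y, na.rm = TRUE)     # 20
length(y)                 # 6: length() counts the NA as well
```

If a column contains missing values, length() alone overstates the sample size; sum(!is.na(y)) counts only the observed values.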

5-Number Summary#

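The 5-Number Summary of a numeric vector consists of the minimum, first quartile (Q1), median, third quartile (Q3), and maximum, where Q1 and Q3 are the 25th and 75th percentiles, respectively. The summary() function reports these five values along with the mean.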

summary(age)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  17.00   19.00   20.00   20.81   21.00   50.00 
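
The quartiles reported by summary() can also be requested directly with quantile(). A minimal sketch on a made-up vector (illustrative values only, with one large value similar to the maximum of 50 above):

```r
# Hypothetical ages for illustration only
x <- c(17, 18, 19, 20, 21, 22, 50)

# The five values: min, Q1, median, Q3, max
five <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
five

# summary() reports the same five values plus the mean (its 4th entry)
s <- summary(x)
all.equal(as.numeric(s)[-4], as.numeric(five))   # TRUE

# The single large value pulls the mean above the median
mean(x) > median(x)                              # TRUE
```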

The cat() Function#

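We use the cat() function to combine printed text with code output. It prints its arguments one after another, and the escape sequence \n inside a string starts a new line. We can format some text and the values found above: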

cat('Standard descriptives for Age variable \nMean = ', m, '\nStd Dev = ', s, '\nSample Size = ',n)
Standard descriptives for Age variable 
Mean =  20.81395 
Std Dev =  3.639556 
Sample Size =  129
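
The double space after each equals sign in the output comes from cat() itself: by default it places one space between consecutive arguments. A small sketch of the sep argument, with made-up values (m_demo and n_demo are hypothetical stand-ins for m and n):

```r
# Hypothetical values for illustration
m_demo <- 20.5
n_demo <- 10

# Default: cat() inserts one space between arguments, so 'Mean = '
# followed by the value prints with two spaces after the equals sign
cat('Mean = ', m_demo, '\nSample Size = ', n_demo, '\n')

# sep = '' removes the automatic space; spacing comes only from the strings
cat('Mean = ', m_demo, '\nSample Size = ', n_demo, '\n', sep = '')
```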

Formatting with the cat() Function#

We can round the mean and standard deviation for readability, and we can place the calculations directly inside the cat() call.

Tip

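Long Coding Lines

Long lines of code can be split across several lines with hard returns. R keeps reading while an expression is incomplete, for example inside open parentheses, and it ignores extra spaces between arguments. A hard return inside a quoted string, however, becomes part of the string and is printed. Indent each continuation line by the same number of spaces.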

# Close each string before the line break so the break is not printed
cat('Standard descriptives for Age variable',
    '\nMean = ', round(mean(age), 2),
    '\nStd Dev = ', round(sd(age), 3),
    '\nSample Size = ', n)
Standard descriptives for Age variable 
Mean =  20.81 
Std Dev =  3.64 
Sample Size =  129
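
Note that round(sd(age), 3) prints as 3.64 rather than 3.640, because R drops trailing zeros when it prints a number. If a fixed number of decimals is wanted, one option is sprintf(). The sketch below types in the values shown in the earlier output by hand so that it runs on its own:

```r
# Values copied from the output shown earlier
m <- 20.81395
s <- 3.639556
n <- 129L

round(s, 3)           # prints 3.64 (the trailing zero is dropped)
sprintf('%.3f', s)    # "3.640": a character string with exactly 3 decimals

# sprintf() builds the whole message and cat() prints it
cat(sprintf('Mean = %.2f\nStd Dev = %.3f\nSample Size = %d\n', m, s, n))
```

Because sprintf() returns text, it is best used at the printing step; keep the unrounded numbers for any further calculations.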

Finally, let’s also print the 5-Number Summary below the standard descriptives. Note that summary() does not work well inside cat(): cat() drops the labels and prints only the bare numbers. Instead, we end the cat() text with a heading for the summary and call summary() on the next line. The code is shown below:

# Close each string before the line break so the break is not printed
cat('Standard descriptives for Age variable',
    '\nMean = ', round(mean(age), 2),
    '\nStd Dev = ', round(sd(age), 3),
    '\nSample Size = ', n,
    '\n\nThe 5-Number Summary for Age variable\n\n')
summary(age)
Standard descriptives for Age variable 
Mean =  20.81 
Std Dev =  3.64 
Sample Size =  129 

The 5-Number Summary for Age variable
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  17.00   19.00   20.00   20.81   21.00   50.00
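
If everything should go through cat(), one possible approach is capture.output(), which returns the printed form of summary() as character strings that cat() can then print line by line. This is a sketch on a made-up vector, not the approach used above:

```r
# Hypothetical ages for illustration only
x <- c(17, 18, 19, 20, 21, 22, 50)

# Passing summary() straight to cat() loses the labels: only numbers print
cat(summary(x), '\n')

# capture.output() keeps the labelled layout as text,
# one string per printed line
summary_lines <- capture.output(summary(x))
cat('The 5-Number Summary for x', summary_lines, sep = '\n')
```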