# Plotting examples – Boxplots in R

Probably my favorite way to display data is the boxplot. Boxplots are used if you want to display one numeric vector or when you have a categorical and a numeric variable, e.g. when you are looking at reaction times across different groups or frequencies across sex and age. The advantage over other displays lies in the fact that boxplots show aspects of the underlying distribution (median, quartiles, spread, and outliers) and also allow rough statistical inferences directly from the display. Quick R offers a very nice introduction to boxplots, and I highly recommend you have a look at it.
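To make the point about distribution and inference a bit more concrete, here is a minimal sketch using `boxplot.stats()` from base R, which returns the numbers a boxplot is drawn from. The vectors `a` and `b` and the seed are made up for illustration. The `conf` element holds the notch limits; according to the R documentation, notches that do not overlap are "strong evidence" that two medians differ, which is a rough guide rather than a formal test.

```r
set.seed(1)                          # arbitrary seed, only to make the sketch reproducible
a <- rnorm(100, mean = 50, sd = 10)  # hypothetical group A
b <- rnorm(100, mean = 35, sd = 5)   # hypothetical group B

sa <- boxplot.stats(a)
sb <- boxplot.stats(b)

sa$stats  # lower whisker, lower hinge, median, upper hinge, upper whisker
sa$out    # data points lying beyond the whiskers
sa$conf   # lower and upper limit of the notch around the median

# two intervals overlap if each one starts before the other one ends
notches_overlap <- sa$conf[1] <= sb$conf[2] && sb$conf[1] <= sa$conf[2]
notches_overlap
```

If `notches_overlap` is `FALSE`, the notches of the two boxes would not overlap in the plot either.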

The example I chose is fairly complex, but you can easily adapt it to your needs and delete code which produces things you don’t want or need. In fact, as always with R, there are a lot of options that you can specify; simply modify the code to match your needs.
But let’s start and set up the boxplots: In a first step, we are going to generate some data and set up a data frame called “df”:

```r
#####################################################
### R Script "Visualizations with R: Boxplot"
#####################################################
### START
#####################################################
# Remove all objects from the current workspace
rm(list = ls(all.names = TRUE))
# set up fictitious data
ES <- rnorm(100, 50, 10)
HS <- rnorm(100, 50, 15)
SS <- rnorm(100, 35, 5)
duration <- c(ES, HS, SS)
speakers <- c(rep("ES", 100), rep("HS", 100), rep("SS", 100))
df <- data.frame(speakers, duration)
df[, 2] <- as.numeric(df[, 2])
# inspect data
head(df)

# and this is what the first rows of the data frame look like
# (your values will differ because the data are random and no seed is set):

#>   speakers duration
#> 1       ES 58.58587
#> 2       ES 45.10878
#> 3       ES 70.49455
#> 4       ES 51.82427
#> 5       ES 51.55624
#> 6       ES 57.09725

#####################################################
```

In a next step, we are going to create the simplest boxplot possible (it doesn’t look very fancy yet, but we are going to customize it later on…).
The function we use to set up a boxplot is simply called “boxplot”. In the formula interface used here, it takes a formula naming the variables to be plotted (numeric variable ~ grouping variable) and the data frame that contains them.
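Before drawing anything, it can help to see what the formula does. The sketch below is only an illustration: the data frame `demo` and the seed are made up, and `plot = FALSE` is used so that `boxplot()` returns its summary without plotting. It suggests that the formula interface is a convenient way of splitting the numeric variable by group.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
demo <- data.frame(  # hypothetical data mirroring the structure of "df"
  speakers = rep(c("ES", "HS", "SS"), each = 100),
  duration = c(rnorm(100, 50, 10), rnorm(100, 50, 15), rnorm(100, 35, 5))
)

# formula interface: numeric variable ~ grouping variable
b_formula <- boxplot(duration ~ speakers, data = demo, plot = FALSE)

# the same groups, split by hand and passed as a list
b_list <- boxplot(split(demo$duration, demo$speakers), plot = FALSE)

b_formula$names  # group labels, in plotting order
b_formula$n      # number of observations per group
b_formula$stats  # one column per group: whiskers, hinges, and median
```

Both calls should yield the same `stats` matrix, so which form you use is mostly a matter of taste.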

```r
#####################################################
# set up a first simple box plot
boxplot(duration ~ speakers, data = df)

#####################################################
```

Here is our first (very, hmm, let’s say basic) boxplot:

After having created a first very simple boxplot, we are going to customize it and make it look much nicer.
To do so, we are going to make use of the built-in arguments that can be used to specify features of our boxplot. Something that is not really necessary, but which allows you to specify and customize the axes, is to not draw them at first and to draw them separately from the plot instead. This is exactly what we are going to do now:

```r
#####################################################
# set up a nicer box plot
boxplot(duration ~ speakers, data = df, # the data we want to display
        main = "",               # you could specify a title here
        ylab = "Duration (ms)",  # label of the y-axis
        ylim = c(0, 100),        # range of the y-axis
        axes = FALSE,            # do not draw axes yet
        notch = TRUE,            # include notches
        col = c("lightgreen", "lightgrey", "lightblue")) # create boxplots with different colors

# now, we create the x-axis
axis(1,                          # set up the x-axis (1 = bottom, 2 = left)
     at = 1:3,                   # we specify the locations where we want the tickmarks
     labels = c("", "", ""),     # you could specify the text here
     lty = 1,                    # we define the linetype (1 = solid line)
     col = "black",              # the axis line and tickmarks should be black
     cex.axis = .8)              # the font size should be 80% of the normal size
                                 # (las sets the label orientation, not the size)

# we now set up the y-axis
axis(2,                          # set up y-axis
     at = c(0, 20, 40, 60, 80, 100), # create tick marks at the specified locations
     labels = c("0", "20", "40", "60", "80", "100"), # create text at the specified locations
     lty = 1,                    # we define the linetype (1 = solid line)
     col = "black",              # the axis line and tickmarks should be black
     cex.axis = .8)              # the font size should be 80% of the normal size

#####################################################
```

Here is our customized boxplot:

Now, we are going to finish off our customized boxplot by including +-symbols at the location of the means and by adding text which provides the value of the mean for each group.

```r
#####################################################
mtext(c("Group 1", "Group 2", "Group 3"), # create specified text
      side = 1,   # put text along the x-axis
      line = 3,   # place text at the 3rd line of the x-axis margin
      at = 1:3)   # put text at location 1 to 3

# compute the group means once and reuse them below
means <- as.vector(by(df$duration, df$speakers, mean))

# mark the mean of each group with a +-symbol
text(1:3, means, "+")

# print the value of each mean just below y = 0
text(1:3, rep(-1.0, 3), cex = 0.85,
     labels = paste("mean\n", round(means, 2), sep = ""))
rug(jitter(df$duration), side = 4) # show the individual data points along the right-hand side
grid()                             # add grid lines
box()                              # draw a box around the plot

###############################################################
```

Below is what the code produces. 
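As a side note, the group means used for the +-symbols and the labels can also be computed with `tapply()`, which returns a named vector, or with `aggregate()`, which returns a data frame. This is just a sketch of an alternative, not a necessary change; the data frame `demo` and the seed are made up for illustration.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
demo <- data.frame(  # hypothetical data mirroring the structure of "df"
  speakers = rep(c("ES", "HS", "SS"), each = 100),
  duration = c(rnorm(100, 50, 10), rnorm(100, 50, 15), rnorm(100, 35, 5))
)

# one mean per group, named after the group
group_means <- tapply(demo$duration, demo$speakers, mean)

# labels as they could be passed to text(): "mean", a line break, the rounded value
mean_labels <- paste0("mean\n", round(group_means, 2))

# the same means as a data frame with one row per group
mean_table <- aggregate(duration ~ speakers, data = demo, FUN = mean)

group_means
mean_table
```

Because `group_means` carries the group names, it is easy to check that each label ends up under the right box.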