# Plotting Histograms with Density Plots in R (ggplot2)

Last week I needed to plot some distributions of mean proportions of correct answers from an experiment. As we all hate bar charts, we should favour plots that show the variability in the data. I decided to make a histogram with a density plot and the mean.

First I simulate a dataset in long format (each row is an observation), which is the format I regularly use, and then I make the plot.
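
To make "long format" concrete, here is a tiny sketch in base R. The data frame `toy`, its participant labels and its values are made up for illustration; they are not the simulated data used in the script. Each row is one trial, and the per-participant means are computed from those rows.

```r
# Hypothetical toy data: 2 participants x 3 trials, one row per trial.
toy <- data.frame(
  participant = factor(rep(c("p1", "p2"), each = 3)),
  condition   = factor(rep("Condition 1", 6)),
  response    = c(1, 0, 1, 0, 0, 1)
)

# One mean per participant within each condition (base R, no extra packages).
toy_means <- aggregate(response ~ participant + condition, data = toy, FUN = mean)
toy_means
```

The script does the same aggregation with `summaryBy` and with dplyr; `aggregate` is used here only so the sketch runs without extra packages.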

Update: I added a new version with jittered points and a boxplot. It is suitable when you have few participants (<30).

View it on GitHub as a Gist

```r

# Cleaning the session
rm(list = ls()); gc()
# Useful function to install and load all the needed packages.
# We are reading it from my repository on GitHub.
# (Plain library() calls for dplyr, ggplot2 and doBy work as well.)
source("https://raw.githubusercontent.com/guidocor/R_utils/master/install.R")
install_and_detach(c("dplyr", "ggplot2", "doBy"), load = TRUE, clean = TRUE)

theme_set(theme_bw()) # My favourite ggplot2 theme : )
parts = 30 # Our participants (per condition)
trials = 30 # Our trials
conds = 3 # Our conditions
# Simulate some data. There are better ways, but this one is easy:
# 30 participants in each of three conditions, with binary responses
response <- c(rbinom(parts*trials, 1, .3),
              rbinom(parts*trials, 1, .65),
              rbinom(parts*trials, 1, .59))
condition <- c(rep(1, parts*trials), rep(2, parts*trials), rep(3, parts*trials))
participant <- sort(rep(1:(conds*parts), trials))
# our data frame, ready
df <- data.frame(participant, condition, response)

# This is a trick to convert several columns to factor (or numeric, etc.) at once
df <- df[, c("participant", "condition", "response")]
to_factor <- c("participant", "condition")
df[, to_factor] <- lapply(df[, to_factor], factor)

# The condition levels are coded as 1, 2 and 3. We want them to be meaningful,
# so we change them as follows:

levels(df$condition) <- c("Condition 1", "Condition 2", "Condition 3")

# Two ways of doing the same thing: one with summaryBy (doBy), the other with dplyr.
# Both are useful and appropriate. In this particular
# case I think the doBy function summaryBy is better
means.v <- summaryBy(response ~ participant + condition, data = df)
# With dplyr, first we have to group the data and then make a summary.
# (Don't be afraid of using the pipe, %>%; in RStudio press Ctrl+Shift+M)
means <- df %>%
  group_by(condition, participant) %>%
  summarise(m = mean(response))

# We need to store the mean of each condition for the plot.
# `means` is still grouped by condition, so summarise() returns one row per condition.
m.data <- means %>% summarise(global.mean = mean(m))

# And finally the plot!
means.plot <- ggplot(means, aes(m)) + # set up the ggplot object
  geom_density(alpha = .5, fill = "#FF6666") + # you can play with the alpha and the fill
  # add a histogram, adapt the binwidth to your data!
  geom_histogram(colour = "black", fill = "white", alpha = 0.4, binwidth = 0.05) +
  # we want separate graphs for each condition, you can
  # play with the number of columns with ncol!
  facet_wrap(~condition, ncol = 1) +
  ggtitle("Mean by each group") + # title
  labs(x = "Mean of each participant",
       y = "") + # axis labels
  # Remember that we made a data frame with the mean of each condition?
  # We are using it to plot the mean in each density plot
  # (in ggplot2 >= 3.4.0, use linewidth = 1 instead of size = 1)
  geom_vline(data = m.data, aes(xintercept = global.mean),
             linetype = "dashed", size = 1)

means.plot

# finally we can save the plot
ggsave(filename = "./distributions.png", plot = means.plot, height = 8, width = 4, dpi = 300)
# adjust the width, the height and the resolution (dots per inch)

# A version of the same data can be displayed as points on a boxplot.
# Suitable when you have few participants.
points <- ggplot(means, aes(m)) +
  geom_point(data = means, aes(y = m, x = condition),
             size = 3, alpha = 0.5, colour = "#FF6666",
             position = position_jitter(width = 0.6, height = 0.1)) +
  geom_boxplot(data = means, aes(y = m, x = condition),
               alpha = 0.2, fill = "#545454") # colour and alpha

# You can flip the axes to get a more comfortable display of the results
points <- points + ggtitle("Mean by each group") +
  labs(x = "", y = "Mean of each participant") + coord_flip()
points
# finally we can save the plot
ggsave(filename = "./points.png", plot = points, height = 4, width = 5, dpi = 300)
# adjust the width, the height and the resolution (dots per inch)
```

And the result!
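
A note on the first plot: by default `geom_histogram()` draws counts, while `geom_density()` draws a curve that integrates to 1, so the two layers share an axis but not a scale. If you want both on the density scale, one option is to map the histogram's `y` to `after_stat(density)`. The sketch below is an alternative, not what the script above does; it assumes ggplot2 >= 3.3.0 (for `after_stat()`), and `toy_means`, the seed and the two conditions are made up for illustration.

```r
library(ggplot2)

set.seed(1) # hypothetical seed, only so the sketch is reproducible
# Hypothetical stand-in for the per-participant means: 30 participants
# per condition, each with the proportion correct over 30 trials.
toy_means <- data.frame(
  condition = factor(rep(c("Condition 1", "Condition 2"), each = 30)),
  m = c(rbinom(30, 30, 0.30), rbinom(30, 30, 0.65)) / 30
)

# after_stat(density) rescales the bars so that the histogram in each facet
# has a total area of 1, the same scale geom_density() uses.
scaled.plot <- ggplot(toy_means, aes(m)) +
  geom_histogram(aes(y = after_stat(density)),
                 colour = "black", fill = "white", binwidth = 0.05) +
  geom_density(alpha = 0.5, fill = "#FF6666") +
  facet_wrap(~condition, ncol = 1)

scaled.plot
```

With counts on the y axis the bars are easier to read as "number of participants"; with densities the bars and the curve are directly comparable. Which one fits better depends on what you want the reader to take from the plot.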