Section 19 Single Continuous Variable: Density & Rug plot
19.1 Density plot
- A density plot visualises the distribution of data over a continuous interval.
- It is a smoothed variant of a histogram: kernel smoothing replaces the bins with a continuous curve, which suppresses the noise introduced by binning and makes the shape of the distribution easier to read.
- Adding a rug plot to a density plot marks each sample observation along the axis, showing where the individual data points fall.
- A rug plot is most useful when the sample size is small; with many observations the marks overlap.
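To make "kernel smoothing" concrete, the following minimal sketch computes a Gaussian kernel density estimate for Sepal Length directly from its definition and compares it with the result of `density()`. The variable names (`x`, `h`, `grid`, `kde_manual`, `kde_builtin`) are illustrative. The sketch assumes the defaults of `density()` (Gaussian kernel, `bw.nrd0` bandwidth, 512 grid points, range extended by three bandwidths), so the two curves should agree closely but not exactly, because `density()` uses a binned approximation.

```r
# Kernel density estimate computed by hand (a sketch, not how density() is implemented).
data(iris)
x <- iris$Sepal.Length

# Bandwidth: the rule of thumb that density() uses by default.
h <- bw.nrd0(x)

# Evaluation grid: same range and resolution as the density() defaults.
grid <- seq(min(x) - 3 * h, max(x) + 3 * h, length.out = 512)

# At each grid point, average the Gaussian kernels centred on the observations.
kde_manual <- sapply(grid, function(g) mean(dnorm(g, mean = x, sd = h)))

# Built-in estimate for comparison.
kde_builtin <- density(x)

# Largest absolute difference between the two curves.
max(abs(kde_manual - kde_builtin$y))
```

Each observation contributes one small bell curve of width `h`; the density plot is their average. A larger `h` gives a smoother curve, a smaller `h` follows the data more closely.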
19.2 package base
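data(iris)
# Histogram on the density scale (freq = FALSE) so the density curve can be overlaid
hist(x = iris$Sepal.Length, breaks = 15,
     xlim = c(4, 8), freq = FALSE,
     main = 'Histogram of Sepal Length',
     xlab = 'Sepal Length (cm)',
     ylab = 'Density',
     axes = TRUE,
     col = 'orange',
     lty = 1, border = 'purple')
# Overlay the kernel density estimate
lines(density(iris$Sepal.Length), col = 'red', lwd = 2, lty = 1)
# Rug of the observations along the x-axis; a small jitter separates tied values
rug(jitter(x = iris$Sepal.Length, amount = 0.01), side = 1, col = 'red')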
19.3 package ggplot2
library(ggplot2)
g <- ggplot(data = iris, mapping = aes(Sepal.Length))
# after_stat(density) replaces the deprecated ..density.. notation (ggplot2 >= 3.3.0)
g <- g + geom_histogram(mapping = aes(y = after_stat(density)),
                        binwidth = 0.10, fill = 'white', colour = 'blue')
g <- g + geom_density(alpha = 0.2, fill = 'orange', colour = 'purple')
g <- g + labs(title = 'Histogram & Density plot of Sepal Length',
              subtitle = 'Based on Iris data',
              x = 'Sepal Length (cm)',
              y = 'Density')
# Mark each observation along the x-axis
g <- g + geom_rug()
g + theme_bw()