Section 16 Single Variable: Summary Statistics
16.1 Descriptive statistics
Function | Explanation |
---|---|
length | Number of elements in a vector |
sum | Sum of the values in a vector |
min | Minimum of a vector |
max | Maximum of a vector |
range | Range (min, max) of a vector |
mean | Mean of the values in a vector |
median | Median of the values in a vector |
var | Sample variance of a vector (divisor n - 1) |
sd | Sample standard deviation of a vector (divisor n - 1) |
cov | Covariance of two vectors |
cor | Pearson correlation between two vectors |
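
The functions in the table can be tried on a pair of small vectors. The sketch below uses two made-up vectors, `a` and `b` (hypothetical values, not taken from any data set in this section), mainly to show `cov` and `cor`, which need two inputs.

```r
# Hypothetical paired measurements, invented for this illustration
a <- c(2, 4, 6, 8, 10)
b <- c(1, 3, 2, 5, 4)

length(a)   # number of elements
sum(a)      # total of the values
range(a)    # minimum and maximum, returned as a two-element vector
var(a)      # sample variance (divisor n - 1)
sd(a)       # sample standard deviation, the square root of var(a)
cov(a, b)   # sample covariance of a and b
cor(a, b)   # Pearson correlation, equal to cov(a, b) / (sd(a) * sd(b))
```

Note that `range` returns both extremes rather than their difference; `diff(range(a))` gives the width of the range.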
16.2 Measures of average
- Mean
- Median
- Mode
- Quartiles
- Median Absolute Deviation (MAD), a robust measure of spread around the median
x <- c(0, 2, 4, 6, NA, 8, 4, 5, 15, 11, 4, 7)
mean(x, na.rm=TRUE)
median(x, na.rm=TRUE)
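# Return the most frequent value(s); if several values tie, all of them are returned
Mode <- function(x, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]   # drop missing values so NA is never reported as the mode
  ux <- unique(x)
  tab <- tabulate(match(x, ux))  # frequency of each unique value
  # ux[which.max(tab)] would return only the first mode
  ux[tab == max(tab)]
}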
Mode(x)
quantile(x=x, probs=c(0.25,0.50,0.75), na.rm=TRUE)
mad(x, na.rm=TRUE)
fivenum(x, na.rm = TRUE)
summary(x)
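
By default, `mad()` does not return the raw median absolute deviation: it multiplies it by the constant 1.4826 so that the result is comparable to the standard deviation for normally distributed data. The sketch below, which reuses the same `x` and introduces the illustrative names `x_obs` and `raw_mad`, computes the raw value by hand and compares it with `mad()`.

```r
x <- c(0, 2, 4, 6, NA, 8, 4, 5, 15, 11, 4, 7)
x_obs <- x[!is.na(x)]   # observed (non-missing) values only

# Raw median absolute deviation: median distance from the median
raw_mad <- median(abs(x_obs - median(x_obs)))
raw_mad

# Setting constant = 1 makes mad() return the raw value
mad(x, constant = 1, na.rm = TRUE)

# The default result is the raw value multiplied by 1.4826
mad(x, na.rm = TRUE)
1.4826 * raw_mad
```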
16.3 Measures of dispersion
- Range
- Inter-quartile range
- Standard Deviation
- Coefficient of variation
x <- c(0, 2, 4, 6, NA, 8, 4, 5, 15, 11, 4, 7)
range(x, na.rm=TRUE)
diff(range(x, na.rm=TRUE))
var(x, na.rm=TRUE)
sd(x, na.rm=TRUE)
IQR(x=x, na.rm=TRUE)
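# Inter-quartile range by hand: third quartile minus first quartile
qx <- quantile(x=x, probs=c(0.25,0.50,0.75), na.rm=TRUE)
str(qx)                # qx is a named numeric vector
unname(qx[3] - qx[1])  # drop the "75%" name carried over from qx[3]
# Coefficient of variation: standard deviation relative to the mean
cvx <- sd(x, na.rm=TRUE) / mean(x, na.rm=TRUE)
cvx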
16.4 Measures of shape
- Skewness
- Kurtosis
x <- c(0, 2, 4, 6, NA, 8, 4, 5, 15, 11, 4, 7)
x <- x[!is.na(x)]
n <- length(x)
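# Skewness: third central moment divided by the second central moment to the power 3/2
# (moments use divisor n; 0 for a symmetric distribution)
skx <- (sum((x - mean(x))^3)/n)/((sum((x - mean(x))^2)/n)^(3/2))
skx
# Kurtosis: fourth central moment divided by the squared second central moment
# (not excess kurtosis; a normal distribution has kurtosis 3)
kurtx <- (sum((x - mean(x))^4)/n)/((sum((x - mean(x))^2)/n)^2)
kurtx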
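
The two R expressions for `skx` and `kurtx` correspond to the moment-based definitions of skewness and kurtosis, written here with central moments that use the divisor n. The symbols m_k, g_1 and b_2 are notation introduced for this note only; other texts and packages may apply small-sample corrections and so give slightly different values.

```latex
m_k = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^k,
\qquad
g_1 = \frac{m_3}{m_2^{3/2}} \quad \text{(skewness)},
\qquad
b_2 = \frac{m_4}{m_2^{2}} \quad \text{(kurtosis)}
```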
16.5 Exercise
Calculate the following summary statistics for the temperature and radiation variables in the weather data:
- Mean, Median, Quartiles
- Range, IQR, Variance, SD, CV
- Skewness, Kurtosis
Write a function that returns all of the above summary statistics for a vector.
Calculate the standard deviation from its formula and compare the result with the output of the R function sd().
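
For the last task, the formula that matches R's `sd()` is the sample standard deviation, which divides by n - 1 rather than n. It is given here as a starting point, with s denoting the sample standard deviation and x̄ the sample mean:

```latex
s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
```

A hand calculation that divides by n instead will come out slightly smaller than the `sd()` output.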