Section 4 Function: R Codes
4.2 Function: Mean (AM, GM, HM)
- Calculate AM, GM and HM of a numeric vector using the following formula.
Arithmetic Mean (AM)
\[ \large AM = (x_1 + x_2 + ... + x_n)/n = \frac{1}{n}\sum\limits_{i=1}^{n} x_{i}\]
Geometric Mean (GM)
\[ \large GM = \sqrt[n]{(x_1 x_2 ... x_n)} = \left( \prod \limits_{i=1}^{n} x_{i} \right) ^{\frac{1}{n}} \]
Harmonic Mean (HM)
\[ \large HM = \frac{n}{(\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n})} = \frac{1}{\frac{1}{n}\sum\limits_{i=1}^{n} \frac{1}{x_i}}\]
# Function
fnMean <- function(x, ...){
n1 <- length(x)
n2 <- sum(!is.na(x))
x.am <- sum(x, ...) / n2
# x.gm <- prod(x)^(1/n)
x.gm <- exp(sum(log(x[x > 0]), ...) / n2)
x.hm <- 1 / (sum(1/x, ...)/n2)
Mean <- list(N=n1, N_excl_NA=n2,
AM=x.am, GM=x.gm, HM=x.hm)
return(Mean)
}
# Test
X <- c(1:5,NA,6:10)
out <- fnMean(x=X, na.rm=TRUE)
out
str(out)4.3 Function: Variance & Standard Deviation
- Use the above data to calculate sample variance and standard deviation.
Sample Variance
\[ \large Var(x) = s_x^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2 \]
Sample Standard Deviation
\[ \large s_x = \sqrt{s_x^2} = \sqrt{Var(x)} \]
fnDisp <- function(x, ...){
n1 <- length(x)
n2 <- sum(!is.na(x))
x.am <- sum(x, ...) / n2
x.var <- sum((x - x.am)^2, ...) / (n2-1)
x.sd <- sqrt(x.var)
Disp <- list(N=n1, N_excl_NA=n2,
Var=x.var, SD=x.sd)
return(Disp)
}
# Test
X <- c(1:5,NA,6:10)
out <- fnDisp(x=X, na.rm=TRUE)
out
str(out)4.4 Function: Summary Statistics
- Use base R functions or your own custom functions, write a function that will return a vector summary statistics of the following location and dispersion estimates of a numeric :
- Number of observations
- Number of non-missing observations
- Minimum value (Min)
- Maximum value (Max)
- Arithmetic mean (AM)
- Geometric mean (GM)
- Harmonic mean (HM)
- First quartile (Q1)
- Second quartile or Median (Q2)
- Third quartile (Q3)
- Range
- Interquartile range (IQR)
- Variance (Var)
- Standard deviation (SD)
- Coefficient of variation (CV)
fnSummary <- function(x, ...){
n1 <- length(x)
n2 <- sum(!is.na(x))
x.am <- sum(x, ...) / n2
x.gm <- exp(sum(log(x[x > 0]), ...) / n2)
x.hm <- 1 / (sum(1/x, ...)/n2)
x.var <- sum((x - x.am)^2, ...) / (n2-1)
x.sd <- sqrt(x.var)
x.min <- min(x, ...)
x.max <- max(x, ...)
x.rng <- x.max - x.min
x.q <- quantile(x, probs=c(0.25,0.50,0.75), ...)
x.iqr <- unname(x.q[3] - x.q[1])
x.cv <- x.sd / x.am
Summary <- list(N=n1, N_excl_NA=n2,
Min=x.min, Max=x.max,
AM=x.am, GM=x.gm, HM=x.hm,
Q1=unname(x.q[1]), Q2=unname(x.q[2]), Q3=unname(x.q[3]),
Range=x.rng, IQR=x.iqr,
Var=x.var, SD=x.sd, CV=x.cv)
return(Summary)
}
# Test
X <- c(1:5,NA,6:10)
out <- fnSummary(x=X, na.rm=TRUE)
out
str(out)
unlist(out)