Section 15 Function: Examples

15.1 Function: Odd number

  • Create a function is.odd that returns TRUE for an odd number and FALSE for an even number
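One possible solution is sketched below. It assumes the input holds whole numbers; the modulo operator %% is vectorised, so the same function handles a single number or a whole vector.

```r
# A minimal sketch of is.odd(); assumes x contains whole numbers.
# x %% 2 is the remainder after division by 2 (1 for odd, 0 for even).
is.odd <- function(x) {
  x %% 2 == 1
}

is.odd(7)               # TRUE
is.odd(c(1, 2, 3, 10))  # TRUE FALSE TRUE FALSE
```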

15.2 Function: Power

  • Create a function that raises each element of a numeric vector to a given power
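A minimal sketch follows. The function name power and the default exponent p = 2 are assumptions made for illustration; the exercise does not fix either.

```r
# Hypothetical helper: raise every element of x to the power p.
# The default p = 2 (squaring) is an assumption, not part of the exercise.
power <- function(x, p = 2) {
  stopifnot(is.numeric(x), is.numeric(p))
  x^p  # ^ is vectorised, so no loop is needed
}

power(1:4)         # 1 4 9 16
power(c(2, 3), 3)  # 8 27
```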

15.3 Function: Mean (AM, GM, HM)

  • Calculate the AM, GM and HM of a numeric vector using the following formulas.

Arithmetic Mean (AM)

\[ \large AM = (x_1 + x_2 + ... + x_n)/n = \frac{1}{n}\sum\limits_{i=1}^{n} x_{i}\]

Geometric Mean (GM)

\[ \large GM = \sqrt[n]{(x_1 x_2 ... x_n)} = \left( \prod \limits_{i=1}^{n} x_{i} \right) ^{\frac{1}{n}} \]

Harmonic Mean (HM)

\[ \large HM = \frac{n}{(\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n})} = \frac{1}{\frac{1}{n}\sum\limits_{i=1}^{n} \frac{1}{x_i}}\]
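The three formulas above translate almost directly into R. The sketch below uses the hypothetical names am, gm and hm, and assumes a numeric vector with no missing values and strictly positive entries (needed for GM and HM to be meaningful).

```r
# Hypothetical function names; x is assumed positive with no NA values.
am <- function(x) sum(x) / length(x)

# Direct translation of the product formula. For long vectors,
# exp(mean(log(x))) is an equivalent form that avoids overflow.
gm <- function(x) prod(x)^(1 / length(x))

hm <- function(x) length(x) / sum(1 / x)

x <- c(1, 2, 4)
am(x)  # (1 + 2 + 4) / 3
gm(x)  # cube root of 8
hm(x)  # 3 / (1 + 1/2 + 1/4)
```

For positive data the three values satisfy HM <= GM <= AM, which is a quick sanity check on the output.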


15.4 Function: Variance & Standard Deviation

  • Use the numeric vector from the previous exercise to calculate the sample variance and standard deviation.

Sample Variance

\[ \large Var(x) = s_x^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2 \]

Sample Standard Deviation

\[ \large s_x = \sqrt{s_x^2} = \sqrt{Var(x)} \]
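A minimal sketch of both formulas, using the hypothetical names my_var and my_sd so that the base R functions var and sd remain available for comparison. The example vector is made up for illustration.

```r
# Hypothetical names; x is assumed numeric with no NA values.
my_var <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)  # divisor n - 1 gives the sample variance
}

my_sd <- function(x) sqrt(my_var(x))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # illustrative data
all.equal(my_var(x), var(x))     # TRUE
all.equal(my_sd(x), sd(x))       # TRUE
```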


15.5 Function: Summary Statistics

  • Using base R functions or your own custom functions, write a function that returns a vector of summary statistics containing the following location and dispersion estimates of a numeric vector:

  • Number of observations

  • Number of non-missing observations

  • Minimum value (Min)

  • Maximum value (Max)

  • Arithmetic mean (AM)

  • Geometric mean (GM)

  • Harmonic mean (HM)

  • First quartile (Q1)

  • Second quartile or Median (Q2)

  • Third quartile (Q3)

  • Range

  • Interquartile range (IQR)

  • Variance (Var)

  • Standard deviation (SD)

  • Coefficient of variation (CV)

  • You can use standard R functions in your own custom function.
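The sketch below shows one way to collect all of the listed statistics in a named numeric vector. The function name summary_stats and the element names are assumptions. Two further choices are assumptions as well: missing values are dropped before computing anything except the total count, and CV is taken as SD divided by AM (some texts multiply by 100). GM and HM assume strictly positive data.

```r
# Hypothetical function; returns a named numeric vector of 15 statistics.
summary_stats <- function(x) {
  n_total <- length(x)          # number of observations, including NA
  x <- x[!is.na(x)]             # keep non-missing values only
  # quantile() uses its default method (type = 7)
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c(
    N            = n_total,
    N_nonmissing = length(x),
    Min          = min(x),
    Max          = max(x),
    AM           = mean(x),
    GM           = exp(mean(log(x))),  # equals prod(x)^(1/n) for positive x
    HM           = 1 / mean(1 / x),
    Q1           = q[1],
    Q2           = q[2],
    Q3           = q[3],
    Range        = max(x) - min(x),
    IQR          = q[3] - q[1],
    Var          = var(x),
    SD           = sd(x),
    CV           = sd(x) / mean(x)     # assumption: CV as a ratio, not a percent
  )
}

round(summary_stats(c(1, 2, 4, NA)), 3)
```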


15.6 Function: Correlation

  • Create your own function to estimate the correlation between two numeric vectors

  • Check the output of your function against that of the cor function

\[ \large r_{xy} = \frac{Cov(x,y)}{\sqrt{Var(x)Var(y)}} = \frac{Cov(x,y)}{s_xs_y} \]

\[ \large Cov(x,y) = s_{xy} = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y}) \]

\[ \large Var(x) = s_x^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2 \]

\[ \large Var(y) = s_y^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (y_i-\bar{y})^2 \]
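The formulas above can be sketched as a single function. The name my_cor and the example data are assumptions; both vectors are assumed to have the same length and no missing values.

```r
# Hypothetical function: Pearson correlation from covariance and variances.
my_cor <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  cov_xy <- sum((x - mean(x)) * (y - mean(y))) / (n - 1)
  s_x <- sqrt(sum((x - mean(x))^2) / (n - 1))
  s_y <- sqrt(sum((y - mean(y))^2) / (n - 1))
  cov_xy / (s_x * s_y)
}

x <- c(1, 2, 3, 4, 5)   # illustrative data
y <- c(2, 4, 5, 4, 5)
all.equal(my_cor(x, y), cor(x, y))  # TRUE
```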


15.7 Function: Skewness and Kurtosis

  • Create a function to calculate the skewness and kurtosis of a numeric vector

Skewness

\[ \large Skewness = \frac{m_3}{s^3} = \frac{\frac{1}{n}\sum\limits_{i=1}^{n}(x_i-\bar{x})^3} {[\frac{1}{n}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2]^{3/2}} \]

Kurtosis

\[ \large Kurtosis = \frac{m_4}{s^4} = \frac{\frac{1}{n}\sum\limits_{i=1}^{n} (x_i-\bar{x})^4} {[\frac{1}{n}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2]^{2}} \]

where \(\large \bar{x}\) is the sample mean, \(\large s\) is the sample standard deviation (computed here with divisor \(\large n\), as in the denominators above), \(\large m_3\) is the sample third central moment, and \(\large m_4\) is the sample fourth central moment.
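Both measures are ratios of central moments, so a small helper keeps the code short. The names central_moment, skewness and kurtosis are assumptions; following the right-hand sides of the formulas above, every moment uses the divisor n.

```r
# Hypothetical helper: k-th sample central moment with divisor n.
central_moment <- function(x, k) mean((x - mean(x))^k)

skewness <- function(x) central_moment(x, 3) / central_moment(x, 2)^(3 / 2)
kurtosis <- function(x) central_moment(x, 4) / central_moment(x, 2)^2

skewness(c(1, 2, 3, 4, 5))   # 0 for symmetric data
kurtosis(c(1, 2, 3, 4, 5))   # 6.8 / 2^2 = 1.7
skewness(c(1, 2, 3, 4, 10))  # positive: long right tail
```

Note that this kurtosis is not the excess kurtosis; a normal distribution has a value near 3 on this scale.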


15.8 Vectorised operation: t.test

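  • Create your own function to conduct a two-sample t-test that mimics the t.test function for a simple case (two independent samples with equal variances, matching the pooled standard deviation used below).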

  • Return the output as a list

  • Compare the output with the t.test function

Steps:

Sample Mean: \[ \large \bar{x_1} = \frac{1}{n_1}\sum\limits_{i=1}^{n_1} x_{1i} \] \[ \large \bar{x_2} = \frac{1}{n_2}\sum\limits_{i=1}^{n_2} x_{2i} \]

Pooled Sample Standard Deviation: \[ \large s = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\] Where: \(\large s_1^2\) and \(\large s_2^2\) are variances from Sample 1 and Sample 2, respectively.

Test statistic \(\large t_{Cal}\) under \(\large H_0: \mu_1 = \mu_2\):

\[ \large t_{Cal} = \frac{(\bar{x_1}-\bar{x_2})} {s\sqrt{1/n_1+1/n_2}} \]

Two-tailed probability under the distribution of the test statistic: \[\large 2 \times pt(q = |t_{Cal}|,\ df = n_1+n_2-2,\ lower.tail = FALSE)\]
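The steps above can be sketched as follows. The function name my_t_test, the names of the list elements and the example data are assumptions. The comparison uses t.test with var.equal = TRUE, because the pooled standard deviation corresponds to the equal-variance test rather than the Welch test that t.test runs by default.

```r
# Hypothetical function: two-sample, two-tailed t-test with pooled variance.
my_t_test <- function(x1, x2) {
  n1 <- length(x1)
  n2 <- length(x2)
  m1 <- mean(x1)
  m2 <- mean(x2)
  df <- n1 + n2 - 2
  # Pooled sample standard deviation
  s_pooled <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / df)
  t_cal <- (m1 - m2) / (s_pooled * sqrt(1 / n1 + 1 / n2))
  p_value <- 2 * pt(q = abs(t_cal), df = df, lower.tail = FALSE)
  list(statistic = t_cal, df = df, p.value = p_value,
       means = c(m1, m2), pooled.sd = s_pooled)
}

x1 <- c(5.1, 4.9, 5.6, 5.8, 6.0)   # illustrative data
x2 <- c(4.2, 4.8, 5.0, 4.4, 4.6)
res <- my_t_test(x1, x2)
ref <- t.test(x1, x2, var.equal = TRUE)

all.equal(res$statistic, unname(ref$statistic))  # TRUE
all.equal(res$p.value, ref$p.value)              # TRUE
```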