Section 15 Function: Examples
15.1 Function: Odd number
- Create a function `is.odd` that returns TRUE for odd numbers and FALSE for even numbers.
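One possible sketch of this exercise is shown below. It assumes the input holds whole numbers (a fractional value such as 2.5 would simply return FALSE), and it relies on the modulo operator `%%` so that the function works on a whole vector at once.

```r
# Sketch: TRUE for odd whole numbers, FALSE for even ones (vectorised over x).
is.odd <- function(x) {
  # An odd whole number leaves remainder 1 when divided by 2.
  # In R, %% takes the sign of the divisor, so -3 %% 2 is also 1.
  x %% 2 == 1
}

is.odd(c(1, 2, 7, 10))  # TRUE FALSE TRUE FALSE
```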
15.3 Function: Mean (AM, GM, HM)
- Calculate the AM, GM and HM of a numeric vector using the following formulas.
Arithmetic Mean (AM)
\[ \large AM = (x_1 + x_2 + ... + x_n)/n = \frac{1}{n}\sum\limits_{i=1}^{n} x_{i}\]
Geometric Mean (GM)
\[ \large GM = \sqrt[n]{(x_1 x_2 ... x_n)} = \left( \prod \limits_{i=1}^{n} x_{i} \right) ^{\frac{1}{n}} \]
Harmonic Mean (HM)
\[ \large HM = \frac{n}{(\frac{1}{x_1} + \frac{1}{x_2} + ... + \frac{1}{x_n})} = \frac{1}{\frac{1}{n}\sum\limits_{i=1}^{n} \frac{1}{x_i}}\]
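The three formulas above translate almost literally into R. The following is a minimal sketch; the function names `am`, `gm` and `hm` and the example vector are illustrative choices, and the code assumes a numeric vector of positive values with no missing entries.

```r
# Arithmetic mean: sum of the values divided by their count.
am <- function(x) sum(x) / length(x)

# Geometric mean: n-th root of the product (assumes all values are positive).
gm <- function(x) prod(x)^(1 / length(x))

# Harmonic mean: count divided by the sum of reciprocals (assumes no zeros).
hm <- function(x) length(x) / sum(1 / x)

x <- c(1, 2, 4)  # made-up example data
am(x)  # 7/3, about 2.333
gm(x)  # 2
hm(x)  # 3/1.75, about 1.714
```

For long vectors the product in `gm` can overflow; `exp(mean(log(x)))` is an equivalent form that avoids this.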
15.4 Function: Variance & Standard Deviation
- Use the above data to calculate sample variance and standard deviation.
Sample Variance
\[ \large Var(x) = s_x^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2 \]
Sample Standard Deviation
\[ \large s_x = \sqrt{s_x^2} = \sqrt{Var(x)} \]
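As a sketch of how these two formulas might be coded, the functions below compute the sample variance with the n-1 divisor and take its square root. The names `sample_var` and `sample_sd` and the example vector are assumptions made for illustration; the results are checked against the base R functions `var` and `sd`.

```r
# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_var <- function(x) {
  n <- length(x)
  sum((x - mean(x))^2) / (n - 1)
}

# Sample standard deviation: square root of the sample variance.
sample_sd <- function(x) sqrt(sample_var(x))

x <- c(2, 4, 4, 4, 5, 5, 7, 9)  # made-up example data
all.equal(sample_var(x), var(x))  # TRUE
all.equal(sample_sd(x), sd(x))    # TRUE
```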
15.5 Function: Summary Statistics
Using base R functions or your own custom functions, write a function that returns a vector of summary statistics containing the following location and dispersion estimates of a numeric vector:
Number of observations
Number of non-missing observations
Minimum value (Min)
Maximum value (Max)
Arithmetic mean (AM)
Geometric mean (GM)
Harmonic mean (HM)
First quartile (Q1)
Second quartile or Median (Q2)
Third quartile (Q3)
Range
Interquartile range (IQR)
Variance (Var)
Standard deviation (SD)
Coefficient of variation (CV)
You can use standard R functions in your own custom function.
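The list above could be assembled into a single named vector roughly as follows. This is only a sketch under several assumptions: the function name `summary_stats` and the element names are hypothetical, missing values are dropped before every statistic except the total count, quartiles use the default method of `quantile`, the CV is reported as a ratio rather than a percentage, and the geometric and harmonic means assume positive values.

```r
summary_stats <- function(x) {
  n_all <- length(x)          # number of observations, including NA
  x <- x[!is.na(x)]           # drop missing values for everything else
  n <- length(x)              # number of non-missing observations
  # names = FALSE keeps the quartiles unnamed so our own labels are used
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c(
    N            = n_all,
    N_nonmissing = n,
    Min          = min(x),
    Max          = max(x),
    AM           = mean(x),
    GM           = prod(x)^(1 / n),   # assumes positive values
    HM           = n / sum(1 / x),    # assumes no zeros
    Q1           = q[1],
    Q2           = q[2],
    Q3           = q[3],
    Range        = max(x) - min(x),
    IQR          = q[3] - q[1],
    Var          = var(x),
    SD           = sd(x),
    CV           = sd(x) / mean(x)    # ratio, not percent
  )
}

summary_stats(c(1, 2, 4, NA, 8))  # made-up example data
```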
15.6 Function: Correlation
Create your own function to estimate the correlation between two numeric vectors.
Also compare the output of your function with that of the `cor` function.
\[ \large r_{xy} = \frac{Cov(x,y)}{\sqrt{Var(x)Var(y)}} = \frac{Cov(x,y)}{s_xs_y} \]
\[ \large Cov(x,y) = s_{xy} = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y}) \]
\[ \large Var(x) = s_x^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2 \]
\[ \large Var(y) = s_y^2 = \frac{1}{n-1}\sum\limits_{i=1}^{n} (y_i-\bar{y})^2 \]
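A minimal sketch that follows the formulas above term by term is given below. The function name `my_cor` and the example vectors are assumptions for illustration, and the code assumes two vectors of equal length with no missing values. The final line checks the result against `cor`, which computes the Pearson correlation by default.

```r
my_cor <- function(x, y) {
  stopifnot(length(x) == length(y))
  n <- length(x)
  dx <- x - mean(x)                  # deviations of x from its mean
  dy <- y - mean(y)                  # deviations of y from its mean
  cov_xy <- sum(dx * dy) / (n - 1)   # sample covariance
  var_x  <- sum(dx^2) / (n - 1)      # sample variance of x
  var_y  <- sum(dy^2) / (n - 1)      # sample variance of y
  cov_xy / sqrt(var_x * var_y)
}

x <- c(1, 2, 3, 4, 5)                # made-up example data
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)
all.equal(my_cor(x, y), cor(x, y))   # TRUE
```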
15.7 Function: Skewness and Kurtosis
- Create a function to calculate Skewness and Kurtosis of a numeric vector
Skewness
\[ \large Skewness = \frac{m_3}{s^3} = \frac{\frac{1}{n}\sum\limits_{i=1}^{n}(x_i-\bar{x})^3} {\left[\frac{1}{n}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2\right]^{3/2}} \]
Kurtosis
\[ \large Kurtosis = \frac{m_4}{s^4} = \frac{\frac{1}{n}\sum\limits_{i=1}^{n} (x_i-\bar{x})^4} {\left[\frac{1}{n}\sum\limits_{i=1}^{n} (x_i-\bar{x})^2\right]^{2}} \]
where \(\large \bar{x}\) is the sample mean, \(\large m_3\) and \(\large m_4\) are the third and fourth sample central moments, and \(\large s\) is the standard deviation computed with divisor \(\large n\) (not \(\large n-1\)), as the denominators show.
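Both quantities are ratios of central moments, so one way to sketch them is with a small helper that computes the k-th central moment using the 1/n divisor from the formulas above. The names `central_moment`, `skewness` and `kurtosis` are illustrative, and the code assumes a numeric vector without missing values. Note that this kurtosis is not the excess kurtosis (3 is not subtracted).

```r
# k-th sample central moment with divisor n, as in the formulas above.
central_moment <- function(x, k) mean((x - mean(x))^k)

# Third central moment divided by the second central moment to the power 3/2.
skewness <- function(x) central_moment(x, 3) / central_moment(x, 2)^(3 / 2)

# Fourth central moment divided by the squared second central moment.
kurtosis <- function(x) central_moment(x, 4) / central_moment(x, 2)^2

x <- c(1, 2, 3, 4, 5)  # made-up, symmetric example data
skewness(x)  # 0, because the data are symmetric around the mean
kurtosis(x)  # 1.7
```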
15.8 Vectorised operation: t.test
Create your own function to conduct a t-test that mimics the `t.test` function for a simpler case. Return the output as a list.
Compare the output with that of the `t.test` function.
Steps:
Sample Mean: \[ \large \bar{x_1} = \frac{1}{n_1}\sum\limits_{i=1}^{n_1} x_{1i} \] \[ \large \bar{x_2} = \frac{1}{n_2}\sum\limits_{i=1}^{n_2} x_{2i} \]
Pooled Sample Standard Deviation: \[ \large s = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}\] where \(\large s_1^2\) and \(\large s_2^2\) are the sample variances of Sample 1 and Sample 2, respectively.
Test statistic \(\large t_{Cal}\) under \(\large H_0: \mu_1 = \mu_2\):
\[ \large t_{Cal} = \frac{(\bar{x_1}-\bar{x_2})} {s\sqrt{1/n_1+1/n_2}} \]
Two-tailed probability under the distribution of the test statistic: \[\large 2*pt(q = |t_{Cal}|, df = n1+n2-2, lower.tail=FALSE)\]
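The steps above can be sketched as a single function that returns a list. The function name `my_t_test`, the names of the list elements and the example data are assumptions made for illustration; the code covers only the two-sample, equal-variance, two-sided case and assumes no missing values. Because the steps use a pooled standard deviation, the matching reference call is `t.test` with `var.equal = TRUE` (the default, `var.equal = FALSE`, runs the Welch test and would give a different result).

```r
my_t_test <- function(x1, x2) {
  n1 <- length(x1)
  n2 <- length(x2)
  m1 <- mean(x1)
  m2 <- mean(x2)
  df <- n1 + n2 - 2
  # Pooled standard deviation from the two sample variances.
  s_pooled <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / df)
  # Test statistic under H0: mu1 = mu2.
  t_cal <- (m1 - m2) / (s_pooled * sqrt(1 / n1 + 1 / n2))
  # Two-tailed probability from the t distribution with df degrees of freedom.
  p_value <- 2 * pt(abs(t_cal), df = df, lower.tail = FALSE)
  list(statistic = t_cal, df = df, p.value = p_value,
       means = c(m1, m2), pooled.sd = s_pooled)
}

x1 <- c(5.1, 4.9, 6.2, 5.8, 6.0)  # made-up example data
x2 <- c(4.2, 4.8, 5.0, 4.4, 4.6)
res <- my_t_test(x1, x2)
ref <- t.test(x1, x2, var.equal = TRUE)

all.equal(res$statistic, unname(ref$statistic))  # TRUE
all.equal(res$df, unname(ref$parameter))         # TRUE
all.equal(res$p.value, ref$p.value)              # TRUE
```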