Section 56 Normal Distribution: R functions


  • To generate data from the Normal distribution, we use the rnorm function.

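  • Note that rnorm, like the related functions below, takes the mean and the standard deviation \(\sigma\) as arguments, not the variance \(\sigma^2\) that appears in the \(Normal(\mu, \sigma^2)\) notation.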


  • To generate n random values from a \(\large Normal(\mu, \sigma^2)\) distribution:

\[\Huge rnorm(n, \mu, \sigma)\]


  • To find the probability that X is less than or equal to q when \(\large X \sim Normal(\mu, \sigma^2)\):

\[\Huge pnorm(q, \mu, \sigma)\]


  • To find the quantile \(\large z_p\) such that \(\large P(X \le z_p) = p\), where \(\large X \sim Normal(\mu, \sigma^2)\):

\[\Huge qnorm(p, \mu, \sigma)\]


  • To find the height of the density (shape) function at X = x, where \(\large X \sim Normal(\mu, \sigma^2)\):

\[\Huge dnorm(x, \mu, \sigma)\]


  • Equivalent functions are available in R for all the common distributions.
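
The four functions above can be tried together in a minimal sketch. The distribution Normal(3, 4), the evaluation point 4, and the variable names are assumptions chosen only for illustration; the last two lines show the same prefix convention on two other built-in distributions.

```r
# Assumed example: X ~ Normal(mu = 3, sigma^2 = 4), so the sd is sqrt(4) = 2
mu <- 3
sigma <- sqrt(4)   # the R functions take the standard deviation, not the variance

set.seed(13579)                            # makes the random draws reproducible
draws <- rnorm(5, mean = mu, sd = sigma)   # five random values

p <- pnorm(4, mean = mu, sd = sigma)       # P(X <= 4)
q <- qnorm(p, mean = mu, sd = sigma)       # qnorm inverts pnorm, so q is 4 up to rounding
d <- dnorm(mu, mean = mu, sd = sigma)      # density height at the mean: 1 / (sigma * sqrt(2 * pi))

# The same d/p/q/r prefixes are used for other distributions
pb <- pbinom(3, size = 10, prob = 0.5)     # P(B <= 3) for B ~ Binomial(10, 0.5)
qe <- qexp(0.5, rate = 1)                  # median of an Exponential(1), equal to log(2)
```

Passing the variance (4) instead of the standard deviation (2) as the third argument would silently describe a different distribution, which is why the sketch computes sigma explicitly.
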
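  • For example, to generate 1000 values each from a \(\large Normal(0, 1)\) and a \(\large Normal(3, 4)\) distribution:
set.seed(13579) # explained in the first exercise and in the next chapter

x <- rnorm(1000, mean = 0, sd = 1)
y <- rnorm(1000, mean = 3, sd = 2)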


  • The X and Y values can be viewed in a histogram or compared graphically in a boxplot.
boxplot(x, y)
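
The histogram view mentioned above might look like the following sketch. It regenerates the two samples so that it runs on its own; the shared axis range, the stacked layout, and the titles are illustrative choices, not part of the original notes.

```r
set.seed(13579)
x <- rnorm(1000, mean = 0, sd = 1)
y <- rnorm(1000, mean = 3, sd = 2)

# Shared horizontal range so the two histograms are directly comparable
rng <- range(c(x, y))

old_par <- par(mfrow = c(2, 1))   # two plots stacked vertically
hist(x, xlim = rng, main = "x: Normal(0, 1)", xlab = "value")
hist(y, xlim = rng, main = "y: Normal(3, 4)", xlab = "value")
par(old_par)                      # restore the previous plot layout
```

With a common axis, the larger centre and wider spread of the y values are visible at a glance.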


  • If you generate enough data values (a large enough sample), the sample mean and sample variance will be close to the true (population) values.


  • Compare the means and standard deviations of your X and Y values with their population values (mean 0 and standard deviation 1 for X; mean 3 and standard deviation 2 for Y).
summary(x)

sd(x)

summary(y)

sd(y)
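
To see how the sample values approach the population values as the sample grows, one possible sketch is given below. The sample sizes and the names sizes and results are arbitrary assumptions made for illustration.

```r
set.seed(13579)

# Sample sizes to try; these particular values are arbitrary
sizes <- c(10, 100, 1000, 100000)

# For each size, draw from Normal(mean = 3, sd = 2) and record the sample mean and sd
results <- t(sapply(sizes, function(n) {
  draws <- rnorm(n, mean = 3, sd = 2)
  c(n = n, sample_mean = mean(draws), sample_sd = sd(draws))
}))

print(results)   # population values for comparison: mean 3, sd 2
```

Each row of results corresponds to one sample size; the later rows should typically sit closer to 3 and 2 than the earlier ones, although any single small sample can land close by chance.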


  • We can also compare sample and population probabilities. For example, to find the relative frequency of X values less than 1.5:
prop.table(table(x < 1.5))


The answer for the entire ‘population’, as opposed to our sample of 1000 values, is given by the function pnorm:

pnorm(1.5, mean = 0, sd = 1)
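
The two numbers can also be placed side by side, as in this sketch. It regenerates x so that it runs on its own; using mean() on a logical vector is an alternative way to get the relative frequency, and the variable names are assumptions.

```r
set.seed(13579)
x <- rnorm(1000, mean = 0, sd = 1)

sample_prop <- mean(x < 1.5)               # proportion of TRUE values, i.e. the relative frequency
pop_prob <- pnorm(1.5, mean = 0, sd = 1)   # population probability, about 0.9332

# Show both values and the gap between them
c(sample = sample_prop, population = pop_prob, difference = sample_prop - pop_prob)
```

The value of mean(x < 1.5) matches the TRUE entry reported by prop.table(table(x < 1.5)).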


The Normal distribution is one of the most useful and most frequently occurring distributions for continuous data.