Section 56 Normal Distribution: R functions

To generate data from the Normal distribution, we use the rnorm function.
Note that rnorm defines the Normal distribution by its mean and standard deviation, not its variance, so the third argument is \(\large \sigma\) rather than \(\large \sigma^2\).

To generate n random values from a \(\large Normal(\mu, \sigma^2)\) distribution:

\[\Huge rnorm(n, \mu, \sigma)\]
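Because the third argument is the standard deviation, a distribution specified by its variance needs a square root first. The following is a small sketch of that conversion; the values of mu and sigma2, the sample size and the seed are arbitrary choices for illustration.

```r
# Illustrative values (assumptions): Normal with mean 10 and variance 4
set.seed(1)          # arbitrary seed so the draws are reproducible
mu <- 10
sigma2 <- 4

# rnorm expects the standard deviation, so pass sqrt(variance)
draws <- rnorm(10000, mean = mu, sd = sqrt(sigma2))

mean(draws)  # should be near 10
var(draws)   # should be near 4
sd(draws)    # should be near 2
```

Passing sigma2 directly as sd would instead generate data with variance 16.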

To find the probability that X is less than or equal to q when \(\large X \sim Normal(\mu, \sigma^2)\):

\[\Huge pnorm(q, \mu, \sigma)\]
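Since pnorm returns a lower-tail probability, upper-tail and interval probabilities follow by subtraction or by the lower.tail argument. A brief sketch, assuming X is Normal with mean 3 and standard deviation 2 (values chosen only for illustration):

```r
# Assumed example distribution: X ~ Normal(mean = 3, sd = 2)
p_below  <- pnorm(3, mean = 3, sd = 2)                      # P(X <= 3), the mean splits the distribution in half
p_upper  <- pnorm(5, mean = 3, sd = 2, lower.tail = FALSE)  # P(X > 5)
p_within <- pnorm(5, mean = 3, sd = 2) -
            pnorm(1, mean = 3, sd = 2)                      # P(1 < X <= 5), within one sd of the mean

c(below = p_below, upper = p_upper, within = p_within)
```

The last value is the familiar result that roughly 68% of a Normal distribution lies within one standard deviation of the mean.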

To find the quantile \(\large z_p\) such that \(\large P(X \le z_p) = p\), where \(\large X \sim Normal(\mu, \sigma^2)\):

\[\Huge qnorm(p, \mu, \sigma)\]
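qnorm is the inverse of pnorm: feeding a quantile back into pnorm recovers the original probability. A short sketch, where the probabilities 0.975 and 0.9 and the Normal(3, 2^2) parameters are illustrative choices:

```r
# Standard Normal: the 97.5% quantile, the familiar value near 1.96
z <- qnorm(0.975, mean = 0, sd = 1)
z

# Assumed example distribution: X ~ Normal(mean = 3, sd = 2)
q90 <- qnorm(0.9, mean = 3, sd = 2)      # value with 90% of the distribution at or below it
p_back <- pnorm(q90, mean = 3, sd = 2)   # applying pnorm should return 0.9

c(quantile = q90, probability = p_back)
```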

To find the height of the density (shape) function at X = x, where \(\large X \sim Normal(\mu, \sigma^2)\):

\[\Huge dnorm(x, \mu, \sigma)\]
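The value returned by dnorm is a height of the density curve, not a probability. As a check, the sketch below compares dnorm with the Normal density formula \(\frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2 / (2\sigma^2)}\) computed by hand; the values of mu, sigma and x0 are arbitrary illustrations.

```r
# Illustrative values (assumptions)
mu <- 3
sigma <- 2
x0 <- 4

# Density height from dnorm and from the formula written out directly
height_dnorm  <- dnorm(x0, mean = mu, sd = sigma)
height_manual <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x0 - mu)^2 / (2 * sigma^2))
c(dnorm = height_dnorm, manual = height_manual)

# A density height can exceed 1 when the standard deviation is small,
# which a probability never could
narrow_peak <- dnorm(0, mean = 0, sd = 0.1)
narrow_peak
```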

set.seed(13579) # explained in the first exercise, and next chapter

x = rnorm(1000, mean = 0, sd = 1)
y = rnorm(1000, mean = 3, sd = 2)

The X and Y values can be viewed in a histogram or compared graphically in a boxplot.

boxplot(x,y)
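The histogram view can be sketched as follows. This is one possible layout rather than the only way to do it: it regenerates x and y with the same seed so the block runs on its own, draws density-scaled histograms side by side, and overlays the population density using dnorm.

```r
# Regenerate the samples so this block is self-contained
set.seed(13579)
x <- rnorm(1000, mean = 0, sd = 1)
y <- rnorm(1000, mean = 3, sd = 2)

old_par <- par(mfrow = c(1, 2))   # two plots side by side

# freq = FALSE scales the bars to a density so the curve is comparable
hist(x, freq = FALSE, main = "x: Normal(0, 1)")
curve(dnorm(x, mean = 0, sd = 1), add = TRUE)

hist(y, freq = FALSE, main = "y: Normal(3, 2^2)")
curve(dnorm(x, mean = 3, sd = 2), add = TRUE)   # 'x' here is curve's plotting variable

par(old_par)                      # restore the previous plot layout
```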

If you generate enough data values (a large enough sample), the sample mean and sample variance will usually be very close to the true (population) values.

Compare the means and standard deviations of your X and Y values with their population values (mean 0 and standard deviation 1 for X; mean 3 and standard deviation 2 for Y).

summary(x)

sd(x)

summary(y)

sd(y)
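To see the effect of sample size, the sketch below repeats the comparison for several sample sizes drawn from a Normal distribution with mean 3 and standard deviation 2. The sizes and the seed are arbitrary choices, and any single small sample may happen to land close to or far from the population values.

```r
set.seed(2468)                       # arbitrary seed for reproducibility
sizes <- c(10, 100, 1000, 100000)    # illustrative sample sizes

# For each size, draw a sample and record its mean and standard deviation
results <- t(sapply(sizes, function(n) {
  s <- rnorm(n, mean = 3, sd = 2)
  c(n = n, mean = mean(s), sd = sd(s))
}))

results   # population values are mean = 3 and sd = 2
```

The rows for the larger samples should typically sit closer to 3 and 2 than the rows for the smaller ones.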

We can also compare sample and population probabilities. For example, to find the relative frequency of X values less than 1.5:

prop.table(table(x < 1.5))

The answer for the entire ‘population’, as opposed to our sample of 1000 values, is easily obtained with the function pnorm:

pnorm(1.5, mean = 0, sd = 1)
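The two numbers can also be placed side by side. The sketch below regenerates x with the same seed so it runs on its own, uses mean(x < 1.5) as a shortcut for the proportion of TRUE values, and adds the usual standard error of a proportion as a rough guide to how far apart the sample and population values might reasonably be.

```r
# Regenerate the sample so this block is self-contained
set.seed(13579)
x <- rnorm(1000, mean = 0, sd = 1)

sample_prop <- mean(x < 1.5)                  # proportion of sample values below 1.5
pop_prob    <- pnorm(1.5, mean = 0, sd = 1)   # population probability P(X <= 1.5)

# Standard error of a proportion based on length(x) observations
std_error <- sqrt(pop_prob * (1 - pop_prob) / length(x))

c(sample = sample_prop, population = pop_prob, std_error = std_error)
```

A difference of no more than two or three standard errors is what sampling variation alone would usually produce.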

The Normal distribution is one of the most useful and frequently occurring distributions for continuous data.