14.3 Sample Means

So far, the expected values (means) you have calculated have come from relatively small distributions. When the population becomes very large (like that of a country), it can be time-consuming and expensive to collect data from every person. For example, the 2011 Australian census cost 440 million dollars.
**Sample Means**

One way to get around this is to take a sample. For example, during elections, polling companies survey around 1000 people about their views on each party and use these samples to approximate the entire population. That way they can take a poll every week without spending billions of dollars. The larger the sample, the more likely it is to approximate the population accurately.

Let's say we have a small group of student test scores: 25, 40, 67, 70 and 98. This data has a mean of 60 and a standard deviation of 28.36 (calculated with the n − 1 divisor). If we use a sample size of 2, we can get any of the following sample means:

|| Test Scores || Sample Mean (x̄) ||
|| 25, 40 || 32.5 ||
|| 25, 67 || 46 ||
|| 25, 70 || 47.5 ||
|| 25, 98 || 61.5 ||
|| 40, 67 || 53.5 ||
|| 40, 70 || 55 ||
|| 40, 98 || 69 ||
|| 67, 70 || 68.5 ||
|| 67, 98 || 82.5 ||
|| 70, 98 || 84 ||

We can see that the individual sample means vary widely about the actual mean. But if we average all of our sample means, (32.5+46+47.5+61.5+53.5+55+69+68.5+82.5+84)/10, we get 60, the population mean.
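The ten samples of size 2 can also be listed by a short program. The following is a minimal Python sketch, not part of the original notes; the variable names (`scores`, `samples`, `sample_means`) are chosen here for illustration only.

```python
from itertools import combinations

# The five test scores, treated as the whole population.
scores = [25, 40, 67, 70, 98]

# Every possible sample of size 2 (order ignored, no score used twice).
samples = list(combinations(scores, 2))

# Mean of each sample.
sample_means = [sum(s) / len(s) for s in samples]

for s, m in zip(samples, sample_means):
    print(s, m)

population_mean = sum(scores) / len(scores)
mean_of_sample_means = sum(sample_means) / len(sample_means)

print("population mean:", population_mean)
print("mean of sample means:", mean_of_sample_means)
```

Both printed means are 60, matching the calculation done by hand.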

With larger populations we will be unable to list every possible sample, but if we average the means of many samples we will have a good estimate of the population mean.

So in general we can say that the mean of the sample means equals the population mean:

math \mu_{\overline{x}}=\mu math

In general the standard deviation of the sample means is given by:

math \sigma_{\overline{x}}=\dfrac{\sigma}{ \sqrt{n}} math

Note: if the population is normally distributed, the sample means are normally distributed as well. It turns out that even if the population itself is not normally distributed, the sample means will be approximately normally distributed about the population mean **if** the sample size is large enough. Typically a sample size of 30 is taken as sufficient for this approximation to hold.

This is called the **central limit theorem.** It can be formally written as:

Let x be **any** random variable, with mean μ and standard deviation σ. Then, provided the sample size n is large enough, the distribution of the sample mean x̄ is approximately normal with mean E(x̄)=μ and standard deviation

math \sigma_{\overline{x}}=\dfrac{\sigma}{ \sqrt{n}} math

The proof of this is given in your textbook on page 716.
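The theorem can be checked informally with a simulation. The sketch below is an illustration only. It assumes a skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here and not taken from the notes), a sample size of 30 and 5000 repeated samples.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

n = 30              # size of each sample (assumed for this sketch)
num_samples = 5000  # how many samples we draw (assumed for this sketch)

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
# It is strongly skewed, i.e. clearly not normal.
mu, sigma = 1.0, 1.0

# Draw many samples and record the mean of each one.
sample_means = []
for _ in range(num_samples):
    sample = [random.expovariate(1.0) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
predicted_sd = sigma / math.sqrt(n)  # sigma / sqrt(n)

print("mean of sample means:", round(mean_of_means, 3), "(population mean is", mu, ")")
print("sd of sample means:  ", round(sd_of_means, 3))
print("sigma / sqrt(n):     ", round(predicted_sd, 3))
```

The mean of the sample means should come out close to μ, and their standard deviation close to σ/√n. A histogram of `sample_means` would look roughly bell-shaped even though the population is skewed.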

Example:
The amount of coffee, X mL, dispensed by a machine has a distribution with probability density function //f//, defined by

math f(x)=\left\{ \begin{matrix} \frac{1}{20}, & \text{if }160 \leqslant x \leqslant 180 \\ \\ 0, & \text{otherwise} \\ \end{matrix} \right. math

Find the probability that the average amount of coffee contained in 25 randomly chosen cups will be more than 173 mL.

Solution:
The central limit theorem tells us that the sample mean is approximately normally distributed, so once we find its mean and standard deviation we can use the normal distribution to find the required probability.

First find the mean and standard deviation of X:

math E(X)=\displaystyle{\int\limits_{160}^{180}{\frac{x}{20} \,dx}} = 170 \\ E(X^2)=\displaystyle{\int\limits_{160}^{180}{\frac{x^2}{20} \,dx}} = 28933.33 \\ \text{so } \sigma =\sqrt{28933.33-170^2} = 5.77 math

By the central limit theorem we can say that the sample mean is approximately normally distributed with a mean and standard deviation of

math E(\overline{X})=E(X) =170 \text{ and } \sigma_{\overline{X}}=\dfrac{\sigma}{\sqrt{n}}=\dfrac{5.77}{5} =1.15 math

Therefore (using CAS or other methods)

math Pr(\overline{X} >173)=0.0045 math
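If a CAS is not to hand, the same steps can be reproduced in Python. This is a sketch only: it assumes the density is the constant 1/20 on [160, 180], which is what the integrals in the solution use, and it evaluates those integrals in closed form. `NormalDist` is from Python's standard `statistics` module; the other names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

a, b = 160.0, 180.0   # range of the dispensed amount, in mL
n = 25                # number of cups in the sample
density = 1 / (b - a)  # assumed constant density, 1/20

# E(X) = integral of x/20 dx and E(X^2) = integral of x^2/20 dx over [160, 180].
e_x = density * (b**2 - a**2) / 2
e_x2 = density * (b**3 - a**3) / 3

sigma = sqrt(e_x2 - e_x**2)   # standard deviation of X
sigma_xbar = sigma / sqrt(n)  # standard deviation of the sample mean

# Normal approximation for the sample mean, as given by the central limit theorem.
p = 1 - NormalDist(mu=e_x, sigma=sigma_xbar).cdf(173)

print("E(X) =", e_x)
print("sigma =", round(sigma, 2))
print("sigma_xbar =", round(sigma_xbar, 2))
print("Pr(Xbar > 173) =", round(p, 4))
```

Because this sketch keeps the unrounded standard deviation instead of 1.15, the probability it prints can differ from 0.0045 in the last decimal place.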

