14.4ConfidenceInt

**Confidence Intervals**

Remember from Methods:

68% Confidence Interval

Approximately 68% of the data lies within __**one**__ standard deviation of the mean. math . \qquad Pr \big( \mu - \sigma < X < \mu + \sigma \big) \approx 0.68 math

95% Confidence Interval

Approximately 95% of the data lies within __**two**__ standard deviations of the mean. math . \qquad Pr \big( \mu - 2\sigma < X < \mu + 2\sigma \big) \approx 0.95 math

99.7% Confidence Interval

Approximately 99.7% of the data lies within __**three**__ standard deviations of the mean. math . \qquad Pr \big( \mu - 3\sigma < X < \mu + 3\sigma \big) \approx 0.997 math
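The three percentages above can be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `prob_within` is made up for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def prob_within(k: float) -> float:
    """Pr(-k < Z < k) for a standard normal variable Z (hypothetical helper)."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
```

The printed values are roughly 0.6827, 0.9545 and 0.9973, which round to the 68%, 95% and 99.7% quoted above.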

**AND** to convert between any normal distribution and the standard normal distribution we use:

math . \qquad z=\dfrac{x-\mu}{\sigma} math

Confidence Intervals for Population Mean

The central limit theorem tells us that sample means are approximately normally distributed (for large n). This allows us to determine confidence intervals for the population mean given information from a sample mean.

Let's say we want to look at the IQ scores of Victorian VCE students but do not have the budget to survey all 20,000+ students. If we took a random sample of 100 students and found that the mean IQ was 108.6, we could use this as an estimate for the population mean. Because this is a single value, it is called a **point estimate** of μ.

We know from prior examples that sample means might be close to the population mean but can sometimes be quite different. Hence using just one value is risky. It is often much better to use a range of values that we are fairly sure will contain the population mean. This is called an **interval estimate or a confidence interval** for the population mean.

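If we begin with the 95% confidence interval (the most commonly used one), we can say from the standard normal distribution that math Pr(-1.96 < Z < 1.96) = 0.95 math Since the sample mean is approximately normally distributed with mean μ and standard deviation σ/√n, we can substitute math Z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} math and rearrange for μ to get math Pr \big( \bar{x} - 1.96 \dfrac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96 \dfrac{\sigma}{\sqrt{n}} \big) = 0.95 math where x̄ is the sample mean, μ is the population mean, σ is the population standard deviation and n is the sample size.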

In other words, we can be 95% confident that the population mean lies between the two endpoints of this interval.
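One way to see what "95% confident" means is to simulate many samples and count how often the interval captures the true mean. The sketch below is only an illustration: the population mean of 100, the seed and the number of trials are assumptions chosen for the demo, not values from the text.

```python
import random
from math import sqrt

def coverage(mu=100.0, sigma=15.0, n=100, trials=2000, k=1.96, seed=1):
    """Fraction of simulated samples whose interval xbar +/- k*sigma/sqrt(n) contains mu.

    mu, trials and seed are assumed values used only for this demonstration.
    """
    rng = random.Random(seed)
    margin = k * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and compute its mean
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin < mu < xbar + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The result should land close to 0.95: about 95 out of every 100 intervals built this way contain the population mean, even though any single interval either does or does not.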

In general, a C% confidence interval is given by math \big( \bar{x} - k \dfrac{\sigma}{\sqrt{n}}, \bar{x}+ k \dfrac{\sigma}{\sqrt{n}} \big) \\ math where k is such that math Pr(-k<Z<k) = \dfrac{C}{100} \\ math From this we can see that if we increase the confidence level needed then we will increase k and hence widen the confidence interval.
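The general formula translates directly into a short calculation. This is a sketch under the assumption that σ is known; the function names `critical_value` and `confidence_interval` are invented for this example, while `NormalDist.inv_cdf` is the standard-library inverse normal.

```python
from math import sqrt
from statistics import NormalDist

def critical_value(confidence_pct: float) -> float:
    """Return k such that Pr(-k < Z < k) = confidence_pct / 100."""
    tail = (1 - confidence_pct / 100) / 2      # area in each tail
    return NormalDist().inv_cdf(1 - tail)

def confidence_interval(xbar: float, sigma: float, n: int,
                        confidence_pct: float = 95.0) -> tuple[float, float]:
    """C% confidence interval for the population mean, sigma assumed known."""
    k = critical_value(confidence_pct)
    margin = k * sigma / sqrt(n)
    return (xbar - margin, xbar + margin)

for c in (90, 95, 99):
    print(c, round(critical_value(c), 3))
```

Raising the confidence level raises k (about 1.96 at 95% and about 2.58 at 99%), so the interval returned by `confidence_interval` gets wider.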

**Margin of Error**

This is the distance between the sample mean and the endpoints of the interval. It is simply given as math M=k \times \dfrac{\sigma}{\sqrt{n}} \\ \text{note k =1.96 for 95\% confidence} \\ math Note that σ/√n on its own is the standard error of the mean; the margin of error is k times the standard error. We can rearrange this to get a useful formula telling us how big our sample size must be to get an appropriate margin of error: math n= \big( \dfrac{k \sigma}{M} \big)^2 \\ math

**Example 1**

Returning to our sample of 100 VCE students with a sample mean of 108.6. Assume the population standard deviation is 15 (standard for IQ tests).
a) Find the approximate 95% confidence interval for the mean IQ of VCE students.
b) Find the approximate 95% confidence interval for the mean IQ if we surveyed 400 students and got the **same** sample mean.
c) Find the number of students we would need to sample if we wanted a margin of error of 1.5 points or less at the 95% level.
d) Find the 99% confidence interval for the mean IQ of VCE students (n=100).

**Solution**

a) At the 95% level k=1.96; we also know σ is 15 and x̄ is 108.6. Subbing these into our confidence interval formula we get: math \big( 108.6-1.96 \times \dfrac{15}{ \sqrt{100}}, 108.6+1.96 \times \dfrac{15}{ \sqrt{100}} \big) \\ = (105.66,111.54) \\ math In other words we can be 95% confident that the population mean is between 105.66 and 111.54.

b) As above but with n=400, so the margin of error is 1.96 × 15/√400 = 1.47, which gives (107.13, 110.07). Note that the interval has narrowed: with a larger sample size we can be more certain that the sample mean is near the population mean.

c) The margin of error needed is M=1.5, with k=1.96 and σ=15, hence: math n= \big( \dfrac{1.96 \times 15}{1.5} \big) ^2 = 384.16 math Hence we need a sample size of 385 students (always round up) to be within 1.5 points in a 95% confidence interval.
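The rounding-up step in part c) is easy to get wrong by hand, so here is a small sketch of the sample-size formula. The function name `required_sample_size` is made up for this illustration.

```python
from math import ceil

def required_sample_size(k: float, sigma: float, margin: float) -> int:
    """Smallest whole n for which k * sigma / sqrt(n) is at most margin."""
    # n = (k * sigma / M)^2, rounded UP because n must be a whole number
    return ceil((k * sigma / margin) ** 2)

print(required_sample_size(1.96, 15, 1.5))  # 385
```

Rounding down to 384 would give a margin of error slightly above 1.5, which is why the answer is always rounded up.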

d) At the 99% level: for Pr(-k<Z<k)=0.99 we get k ≈ 2.576, or 2.58 to two decimal places (use inverse standard normal distribution with a center tail setting). Therefore our interval is: math \big( 108.6-2.576 \times \dfrac{15}{ \sqrt{100}}, 108.6+2.576 \times \dfrac{15}{ \sqrt{100}} \big) \\ = (104.74,112.46) \\ math Note that the interval has widened because we require a higher level of confidence.