July/August 2011

## Statistical Intervals

## Part 1: The Confidence Interval
Statistical intervals play an important role in statistical inference, but practitioners often confuse confidence, prediction and tolerance intervals. This article series clarifies the differences among them and indicates when each type of interval should be used.
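The most common type of statistical interval used in practice is the confidence interval (CI). How the CI end points are calculated depends on the context. If the parameter of interest is a mean and the standard deviation, σ, is known, then the CI is calculated from Equation 1, where the standard normal distribution is used to produce the desired confidence level.

$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}, \quad \text{with confidence } 100(1-\alpha)\% \tag{1}$$

Here $\bar{x}$ is the sample mean and n is the sample size. The quantity $z_{\alpha/2}$ is the upper α/2 percentage point of the standard normal distribution (for example, 1.96 for 95% confidence).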
This CI indicates that the true value for the mean impact energy for A238 steel at 60 [...]

Since σ is typically unknown, we must estimate it with the sample standard deviation, s. The CI is then calculated from Equation 2.

$$\bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}, \quad \text{with confidence } 100(1-\alpha)\% \tag{2}$$

The quantity $t_{\alpha/2,\,n-1}$ is the upper α/2 percentage point of the t distribution with n - 1 degrees of freedom.

The interval construction in Equation 2 is said to capture or contain the unknown mean, μ, with the stated confidence 100(1 - α)%. For sample mean $\bar{x}$ and sample standard deviation s, we can write a probability statement for Equation 2, assuming that the sample comes from a normal distribution or that the sample size is at least 30. This statement takes the form of Equation 3.

$$P\left(\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right) = 1 - \alpha \tag{3}$$

Equation 3 is a true probability statement because $\bar{x}$ and s are random variables in this expression. We appeal to the process of implementing Equation 3 many times and define confidence as the long-run proportion of cases in which the parameter would be captured by these intervals. In such a process, approximately 100(1 - α)% of the time the interval in Equation 3 would capture the true mean, μ.
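The step from the interval in Equation 2 to the probability statement in Equation 3 can be sketched as follows. This is the standard textbook argument rather than something spelled out in the article, and it assumes a random sample of size n from a normal distribution with mean μ. The symbol T is introduced here only for illustration.

$$
\begin{aligned}
T = \frac{\bar{x} - \mu}{s/\sqrt{n}} &\sim t_{n-1} \\
P\left(-t_{\alpha/2,\,n-1} \le T \le t_{\alpha/2,\,n-1}\right) &= 1 - \alpha \\
P\left(\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}\right) &= 1 - \alpha
\end{aligned}
$$

The last line follows from the second by multiplying through by s/√n and solving the two inequalities for μ. The randomness sits in the end points, not in μ.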
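The sample statistics are: $\bar{x}$ = 13.71 and [...]

Based on the central limit theorem, we can also state a more general form for a confidence interval for any parameter of interest, say θ:

$$\hat{\theta} \pm z_{\alpha/2}\,\mathrm{se}(\hat{\theta}) \tag{4}$$

where $\hat{\theta}$ is a point estimate of θ and $\mathrm{se}(\hat{\theta})$ is its standard error.

Suppose that we wish to construct a confidence interval for a population proportion, p. With sample proportion $\hat{p}$ from a sample of size n, Equation 4 becomes

$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \tag{5}$$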
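The intervals in Equations 1, 2 and 5 all share the general form of Equation 4: an estimate plus or minus a critical value times a standard error. A minimal Python sketch of that shared structure follows. The function names and the numeric inputs are hypothetical, chosen only for illustration, and are not taken from the article. Because the Python standard library has no t quantile function, the t critical value is passed in as read from a t table.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Upper alpha/2 point of the standard normal (about 1.96 for 0.95)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def interval(estimate, std_error, critical):
    """General form: estimate +/- critical value * standard error."""
    half_width = critical * std_error
    return estimate - half_width, estimate + half_width


def mean_ci_known_sigma(xbar, sigma, n, confidence=0.95):
    """Mean with known sigma: standard normal critical value."""
    return interval(xbar, sigma / sqrt(n), z_critical(confidence))


def mean_ci_unknown_sigma(xbar, s, n, t_crit):
    """Mean with sigma estimated by s; t_crit is the tabulated
    t value for alpha/2 and n - 1 degrees of freedom."""
    return interval(xbar, s / sqrt(n), t_crit)


def proportion_ci(successes, n, confidence=0.95):
    """Large-sample interval for a population proportion."""
    p_hat = successes / n
    std_error = sqrt(p_hat * (1.0 - p_hat) / n)
    return interval(p_hat, std_error, z_critical(confidence))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only (not data from the article).
    print(mean_ci_known_sigma(xbar=50.0, sigma=2.0, n=16))
    # 2.262 is the tabulated t value for 95% confidence, 9 degrees of freedom.
    print(mean_ci_unknown_sigma(xbar=50.0, s=2.0, n=10, t_crit=2.262))
    print(proportion_ci(successes=40, n=100))
```

With the same sample mean and the same spread, the interval based on s and the t table is wider than the known-σ interval would be for that sample size, reflecting the extra uncertainty from estimating σ.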
To summarize, a confidence interval for an unknown parameter of a statistical model is a random interval constructed so that it covers the true value of the parameter with a stated probability. That containment probability, called confidence, does not apply to the particular interval calculated from the sample in hand, which either contains the parameter or does not. Confidence applies to the long-run proportion of such intervals that would be expected to cover the parameter if we were to use many samples of the same size under the same conditions.
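The long-run interpretation can be checked with a small simulation. The sketch below draws many samples of the same size from a normal distribution whose mean is known to the simulation, builds a t-based interval from each sample, and reports the proportion of intervals that contain the true mean. The function name, the population settings and the number of trials are all hypothetical choices made for illustration.

```python
import random
from math import sqrt
from statistics import mean, stdev


def coverage(mu, sigma, n, t_crit, trials, seed=0):
    """Proportion of t-based intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar, s = mean(sample), stdev(sample)
        half_width = t_crit * s / sqrt(n)
        # Each individual interval either captures mu or it does not.
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    # Hypothetical settings: samples of size 10 from a normal population
    # with mean 100 and standard deviation 15. The value 2.262 is the
    # tabulated t value for 95% confidence with 9 degrees of freedom.
    print(coverage(mu=100.0, sigma=15.0, n=10, t_crit=2.262, trials=20000))
```

The printed proportion should be close to 0.95. It describes the procedure across all of the simulated samples, not any single interval.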
In the next article in this series, we will discuss prediction intervals for a future observation.