Testing for Randomness
The Mean Successive Difference Test
Q: How can I tell if my data are random — or not?
A: Many of us receive data sets either assuming that the process was in control, i.e., that the data vary randomly about a target, or suspecting that the process was out of control, i.e., that the data show some indication of nonrandomness. An obvious solution is simply to plot the data on a control chart and determine whether or not the process was indeed in control. Of course, you must first know how to construct the chart and its control limits. E2587, Standard Practice for Use of Control Charts in Statistical Process Control, or ASTM’s Manual on Presentation of Data and Control Chart Analysis (Manual 7), 8th Ed., will provide the information you need to solve the problem graphically.
But suppose that you don’t have a copy of E2587, Manual 7, or a computer program such as Minitab available to help you plot the data and check for process stability (randomness). In this case, you can use a simple statistic called the mean successive difference (MSD). Suppose that we have n data values, x1, x2, …, xn. The successive differences are x2 – x1, x3 – x2, …, xn – xn-1. (In the field of quality control, the absolute values of these differences are the moving ranges of n = 2 typically used to construct the control limits for the individuals control chart.) Squaring these differences and averaging them gives the mean successive difference. To construct a test statistic, we divide the MSD by the sample variance; since both are averages over n – 1 degrees of freedom, the ratio reduces to the sum of squared successive differences divided by the usual sum of squares about the mean. The formula looks like this:

M = [Σ(xi+1 – xi)²] / [Σ(xi – x̄)²]

where the numerator sums over i = 1 to n – 1 and the denominator sums over i = 1 to n.
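The calculation above is easy to carry out by hand or in a few lines of code. As a sketch (the function name and the two illustrative series below are ours, not from the article):

```python
def msd_ratio(x):
    """M = sum of squared successive differences / sum of squares about the mean."""
    n = len(x)
    mean = sum(x) / n
    ssd = sum((x[i + 1] - x[i]) ** 2 for i in range(n - 1))  # squared successive differences
    ss = sum((v - mean) ** 2 for v in x)                     # usual sum of squares
    return ssd / ss

# A steadily trending series gives a small M (slow drift) ...
print(msd_ratio([1, 2, 3, 4, 5]))   # 4/10 = 0.4
# ... while a saw-tooth series gives a large M (excessive fluctuation).
print(msd_ratio([1, 2, 1, 2, 1]))   # 4/1.2, about 3.33
```

These two toy series illustrate the behavior described in the next paragraph: a slow trend drives M well below 2, while rapid alternation drives it well above 2.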
We can use M to detect nonrandomness, typically referred to as serial correlation, in any sequence of observations. If the data come from an in-control process, then the average value of M is 2. If the observations fluctuate excessively, e.g., in a saw-tooth pattern, then M will be large. Conversely, if the observations exhibit a long-term cycle, then M will be small. So how can you determine whether M is too large or too small statistically? Table 1 shows lower and upper critical values of M for a variety of sample sizes and levels of significance (α), e.g., α = 0.05 corresponds to 95 percent confidence. (Note that the upper critical values are obtained simply as UM = 4 – LM.)
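The two-sided decision rule can be sketched as follows; the lower critical value used below is the n = 25, α = 0.05 entry cited later in the article (LM = 1.367), and the upper value follows from the symmetry UM = 4 – LM:

```python
def classify(m, lower):
    """Two-sided test of the MSD ratio M against a Table 1 lower critical value."""
    upper = 4 - lower  # upper critical value, from UM = 4 - LM
    if m < lower:
        return "nonrandom: long-term cycle or trend (M too small)"
    if m > upper:
        return "nonrandom: excessive fluctuation (M too large)"
    return "no evidence of nonrandomness at this significance level"

# n = 25, alpha = 0.05: LM = 1.367 (from Table 1 in the article)
print(classify(1.26, 1.367))   # below LM: slow cycle or trend
print(classify(2.10, 1.367))   # between LM and UM: consistent with randomness
```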
The mean successive difference test as described by Bennett and Franklin can address two aspects of process capability studies.1 First, the test can identify the type of nonrandomness present (too many or too few cycles). Second, given that there may be cycles or trends, the mean successive difference provides an estimate of the process variance about the cycle or trend line: because the expected squared successive difference for random data is twice the variance, MSD/2 estimates what the process variance would be if the cycle or trend could be eliminated.
Example
Bennett and Franklin provide a data set of plant yields that we can use to test for serial correlation. Data for n = 25 consecutive weeks were: 81.02, 80.08, 80.05, 79.70, 79.13, 77.09, 80.09, 79.40, 80.56, 80.97, 80.17, 81.35, 79.64, 80.82, 81.26, 80.75, 80.74, 81.59, 80.14, 80.75, 81.01, 79.09, 78.73, 78.45 and 79.56. The sum of squares of the 24 successive differences is 31.6772, and the usual sum of squares associated with the sample variance is 25.1343. Thus, M = 31.6772/25.1343 = 1.26. Table 1 shows that M falls between the lower critical values for significance levels of 0.05 (LM = 1.367) and 0.01 (LM = 1.128), so the result is significant at the 0.05 level but not at the 0.01 level. Since M is less than 2, we can conclude with at least 95 percent confidence that there is a slow nonrandom fluctuation in plant yield values over this 25-week period. Figure 1 shows an individuals control chart from Minitab in which the control limits are based on the MSD. Here we can see the slow cyclical pattern in the data suggested by M.
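A quick way to check a hand calculation like this is to recompute the two sums of squares directly from the listed values (a sketch, independent of Minitab; minor discrepancies with published intermediate figures can occur from rounding or transcription):

```python
yields = [81.02, 80.08, 80.05, 79.70, 79.13, 77.09, 80.09, 79.40, 80.56, 80.97,
          80.17, 81.35, 79.64, 80.82, 81.26, 80.75, 80.74, 81.59, 80.14, 80.75,
          81.01, 79.09, 78.73, 78.45, 79.56]

n = len(yields)                                                    # 25 weekly yields
mean = sum(yields) / n
ssd = sum((yields[i + 1] - yields[i]) ** 2 for i in range(n - 1))  # 24 successive differences
ss = sum((y - mean) ** 2 for y in yields)                          # sum of squares about the mean
m = ssd / ss

print(f"sum of squared successive differences = {ssd:.4f}")  # 31.6772, matching the article
print(f"M = {m:.2f}")
# M below the n = 25 lower critical value LM = 1.367 (alpha = 0.05) signals
# slow nonrandom fluctuation, consistent with the article's conclusion.
```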
Figure 1 — Individual Control Chart for Plant Yield Data
1. Bennett, C.A. and Franklin, N.L., Statistical Analysis in Chemistry and the Chemical Industry, John Wiley and Sons, New York, 1954, p. 679.
Dean V. Neubauer, Corning Inc., Corning, N.Y., coordinates the DataPoints column; an ASTM International fellow, he is chairman of Committee E11 on Quality and Statistics and chairman of E11.90.03 on Publications.