Addressing Measurement Uncertainty in ASTM International’s Test Methods
Uncertainty of measurement has become an increasingly important concern to many ASTM International committees and the users of ASTM standards. This paper will discuss some background on how the current ASTM policy developed and describe the role and responsibilities of ASTM committees in developing test methods.
Some Historical Background
Measurement uncertainty has had a long history in the standards community. In the 1930s, a subcommittee of then-ASTM Committee E01 on Methods of Testing prepared the ASTM Manual on Presentation of Data, which included “Supplement A Presenting ± Limits of Uncertainty of an Observed Average.” Originally published as part of ASTM Special Technical Publication 15, this section now is a part of ASTM Manual 7A, Manual on Presentation of Data and Control Chart Analysis, Seventh Edition, a publication of ASTM Committee E11 on Quality and Statistics.
The concepts described here are extensions of what was presented by Thomas Simpson in 1755 and discussed in Stigler’s History of Statistics: The Measurement of Uncertainty Before 1900. Although the common assumption at that time was that a “good” astronomer did not need to take more than one reading, Simpson showed how using an average of several readings could reduce the error or uncertainty of practical astronomical measurements.
The 1960s saw a burst of activity in defining and estimating precision, accuracy, and uncertainty. This work occurred primarily at the then-National Bureau of Standards (NBS) (now the National Institute of Standards and Technology (NIST)) Statistical Engineering Laboratory with a number of papers by Churchill Eisenhart,1, 2 W.J. Youden,3 and Harry Ku,4 appearing in the NBS Special Publication 300.
In addition, the NBS Handbook 91, written by M. G. Natrella,5 contained an entire chapter titled “Expression of Uncertainties of Final Results.” These individuals, as well as others from NBS, were active in ASTM Committee E11 and made significant contributions to standards developed by the committee.
The general focus of the NBS articles indicated how important it was to determine the precision of measurement and test methods. Accuracy was seen as a combination of the precision (closeness of test results) and a systematic difference (bias). The uncertainty of a reported value was then to be indicated by giving “credible limits to its likely inaccuracy.” However, there also was the recognition that “No single form of expression for these limits is universally satisfactory.”5 These ideas were concisely captured in the definition of uncertainty that appears in the original 1972 version of the E11 terminology standard E 456, Terminology Relating to Quality and Statistics, and continues today:
uncertainty, n: an indication of the variability associated with a measured value that takes into account two major components of error: (1) bias, and (2) the random error attributed to the imprecision of the measurement process.
Discussion: Quantitative measures of uncertainty generally require descriptive statements of explanation because of differing traditions of usage and because of differing circumstances. For example: (1) the bias and imprecision may both be negligible; (2) the bias may not be negligible while the imprecision is negligible; (3) neither the bias nor the imprecision may be negligible; (4) the bias may be negligible while the imprecision is not negligible.
The Metrological Approach to Uncertainty
In 1993, the Guide to the Expression of Uncertainty in Measurement (commonly called the GUM) was adopted by the International Organization for Standardization (ISO) Technical Advisory Group on Metrology. It was developed primarily by the national metrology laboratories, largely to provide a basis for international comparison of measurement results. It should be noted that measurement in the metrology world has a very specific meaning and application, which is discussed below. This is to be contrasted with the broader nature of test results that occur when applying a test method.
At the same time, NIST adopted a companion document, NIST Technical Note 1297, “Guidelines for Evaluating and Expressing the Uncertainty of NIST Measurement Results,” for use at its facilities. This established a broad mandate that “uncertainty estimates” be included with all NIST measurements. Since many members of NIST were also very active on ASTM committees, this policy began to impact some of those committees. One early question was whether the GUM-type uncertainty estimate could be substituted for the interlaboratory precision statement. In particular, this issue was raised within Committee C16 on Thermal Insulation and then conveyed to E11 for clarification.
In the spring of 1994 a task group on measurement uncertainty was formed in Subcommittee E11.91 on Long Range Planning to initiate work on a standard for use in ASTM. At the May 1995 E11 meeting, the task group met and discussed the relationship of the uncertainty and interlaboratory principles and documents. Barry Taylor and Chris Kuyatt, the authors of the NIST Technical Note, conducted a tutorial on the ISO and NIST documents at the meeting and provided input to E11 on the GUM approach.
Both the NIST technical note and the GUM guide describe two methodologies for determining contributions to the estimate they define as “uncertainty.” These are classified as Type A and Type B. Type A contributions are found using actual data from measurements, much as would be developed from a ruggedness test, control charts, or some other formal study. The Type B components are from outside sources and can include anything from theoretical analyses to manufacturer’s statements about typical precision of instruments. These guides then describe ways to combine the separate components.
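The combination of Type A and Type B components can be illustrated with a minimal sketch. It assumes the simplest case treated in the GUM: uncorrelated components with unit sensitivity coefficients, combined by root-sum-of-squares. The readings, the ±0.05 instrument limit, the rectangular distribution assumed for that limit, and the coverage factor k = 2 are all hypothetical values chosen for illustration, not figures from any ASTM test method.

```python
import math
import statistics


def type_a_standard_uncertainty(readings):
    """Type A: standard uncertainty of the mean of repeated readings."""
    return statistics.stdev(readings) / math.sqrt(len(readings))


def type_b_from_limits(half_width):
    """Type B: standard uncertainty for a +/- half_width limit,
    assuming (hypothetically) a rectangular distribution."""
    return half_width / math.sqrt(3)


def combined_standard_uncertainty(components):
    """Root-sum-of-squares combination; assumes uncorrelated
    components and unit sensitivity coefficients."""
    return math.sqrt(sum(u * u for u in components))


# Hypothetical repeated readings from one laboratory (Type A source).
readings = [10.1, 9.9, 10.0, 10.2, 9.8]
u_a = type_a_standard_uncertainty(readings)

# Hypothetical manufacturer's statement of +/- 0.05 (Type B source).
u_b = type_b_from_limits(0.05)

u_c = combined_standard_uncertainty([u_a, u_b])
expanded = 2 * u_c  # expanded uncertainty with assumed coverage factor k = 2

print(f"u_A = {u_a:.4f}, u_B = {u_b:.4f}, u_c = {u_c:.4f}, U = {expanded:.4f}")
```

In this made-up case the Type A component dominates, so refining the instrument statement would change the combined value very little. That kind of comparison is one practical use of separating the components.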
There was no question within E11 that this was going to be an issue for ASTM. In addition, uncertainty was emerging as a critical concern in the accreditation of laboratories, especially calibration laboratories. By 1995, the American Association for Laboratory Accreditation (A2LA) drafted a new policy calling for uncertainty to be documented and estimated by the laboratories they accredited.
E11’s initial deliberations resulted in the inclusion of a short statement in the 1996 revision of E 1488, Guide for Statistical Procedures to Use in Developing and Applying Test Methods, stating that “the techniques for developing estimates of precision recommended in this guide are only of the Type A uncertainty evaluation as described in the ISO Guide to the Expression of Uncertainty in Measurement.”
In 1999, ISO 17025, General Requirements for the Competence of Testing and Calibration Laboratories, was widely adopted. Among its requirements was the stipulation that uncertainty estimates be provided by testing laboratories. One relevant section, 5.4.6.2, stated in part that “testing laboratories shall have and shall apply procedures for estimating uncertainty of measurement.” 6
In addition, Note 2 to this section opened up some additional issues for ASTM policy. The note reads as follows:
In those cases where a well-recognized test method specifies limits to the values of the major sources of uncertainty of measurement and specifies the form of presentation of calculated results, the laboratory is considered to have satisfied this clause by following the test method and reporting instructions.
This statement raised the question of whether the precision data in a test method could be used to satisfy a laboratory’s estimate of uncertainty if the test method was properly followed as described in the note.
In February 2001, a formal presentation and request was made to the ASTM Committee on Standards (COS), a standing committee of the ASTM board of directors, asking that ASTM address the issue of measurement uncertainty.7 A COS task group consisting of eight members from various ASTM committees (including E11), NIST, and an A2LA assessor was established “to provide guidance to address measurement uncertainty in the Form and Style for ASTM Standards.” After considerable deliberation, the task group prepared a proposal in March of 2003 for COS to distribute in a circular letter.
Parallel with this effort, E11 convened a separate task group to study the proposals being made in the COS task group and to determine the appropriate course of action for E11. Committee E11 efforts led to the preparation of an independent proposal, which was also distributed in the March 2003 COS circular letter.
After reviewing all the comments, the COS task group and E11 leadership developed a revised proposal that was adopted in 2004 as the new non-mandatory section A22 of Form and Style for ASTM Standards (see sidebar at left).
Further E11 Activities
In the spring of 2002, Committee E11 held a symposium on measurement uncertainty to examine all the current issues, with presentations by NIST staff members and other measurement practitioners.
One of the overriding principles involves the difference between the metrologist’s “measurement methods” and ASTM “test methods.” This is more than just terminology. In the metrology world measurements “determine the value of a quantity,” which focuses on primary units such as length, time, and mass. These also involve traceable systems going all the way to a national standards laboratory such as NIST and even to comparisons between national laboratories.
Tests, as in the application of an ASTM test method, however, deal with more general “characteristics” leading to a “test result” that is generally dependent on specific instruments and procedures and that rarely has a traceable reference standard. This distinction is important in recognizing the differences in assumptions underlying the development of numerical values for uncertainty.
Discussions within Committee E11 led to a general agreement that uncertainty, as described by the GUM, applies only to the generation of test results at a given laboratory and is not to be taken as a general measure of the performance of the test method. It was also felt that since an ASTM standard cannot provide enough information about the magnitude of effects that might occur in all laboratories, using various types of equipment, operators, or other factors, a standard should only provide guidance on estimating uncertainty in an individual laboratory.
This identification of uncertainty with a specific laboratory’s long-term variation of a test method can be equated with “intermediate precision.” The term intermediate precision is defined as “the closeness of agreement between test results obtained under specified intermediate precision conditions” (see ASTM standard E 177, Practice for Use of the Terms Precision and Bias in ASTM Test Methods). These conditions may include “operator, measuring equipment, location within the laboratory, and time,” but are to be associated with a particular laboratory’s intermediate measure of precision. Furthermore, the discussion of intermediate precision in E 177 clearly recognizes this connection to a specific laboratory with the following statement:
Discussion: Because the training of operators, the agreement of different pieces of equipment in the same laboratory and the variation of environmental conditions with longer time intervals all depend on the degree of within-laboratory control, the intermediate measures of precision are likely to vary appreciably from laboratory to laboratory. Thus, intermediate precisions may be more characteristic of individual laboratories than of the test method. (E 177, paragraph 220.127.116.11)
A Guide for Statistical Procedures
One way Committee E11 began to address the issues of uncertainty was to begin the process of revising E 1488, Guide for Statistical Procedures to Use in Developing and Applying ASTM Test Methods, to incorporate the important principles of measurement uncertainty. E 1488 was designed as a roadmap of the various statistical methods that can aid in the development of a test method. During the consideration of where uncertainty might fit in the sequence of activities, the concept emerged that the creation of a new standard involves a series of specific phases: design, development, validation, and evaluation, with uncertainty being addressed in the validation phase.
In the revision to E 1488, the validation phase is described as the point when:
"The test method is examined for such concerns as its stability, ruggedness, statistical control and the contributions to variability. The completion of this phase should result in preliminary estimates of precision and the identification and suggested ways to estimate potential contributors to uncertainty." (Sec 5.4.1 italics added)
Section 14 of E 1488 provides more specific detail, stating, “standards developers are not expected to provide numerical values that will satisfy uncertainty estimation for any particular laboratory.” This section then suggests, however, that the committee should provide information that will help test method users develop their own assessment, since this is dependent on specific conditions within a given laboratory. If any laboratories have determined uncertainty estimates, they should be encouraged to provide information on how it was accomplished and give data for guidance purposes. Such information would best be placed in an appendix where it could provide valuable insight for ASTM standards users, but should never be relied on as an absolute value that could be substituted for uncertainty estimates obtained by the individual laboratory.
None of this directly affects the final evaluation phase that involves the interlaboratory study resulting in the precision estimates incorporated in an ASTM test method.
Committee E11 has recognized that consensus standards are critical to establishing agreement on principles and wide application of practices. Early in the struggle to understand how to apply the new uncertainty principles to test methods, it was recognized that no clear, simple to follow, broad-based formal practices exist. Our model to follow has been the E11 standards E 177 and E 691, Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method, which describe precision and how to conduct interlaboratory tests to estimate the test method precision. E11 now has two major efforts under way that focus on uncertainty and parallel these standards.
The first, WK3539, Draft Practice for Reporting Uncertainty of Test Results and Use of the Term Measurement Uncertainty in ASTM Test Methods, is a general document intended to present concepts of uncertainty and describe how they relate to precision and bias. In addition, it will provide guidance on what types of information should be incorporated in an ASTM test method.
The second, WK3561, Practice for Estimating the Uncertainty of a Test Result Using Control Chart Techniques, will provide a practice for estimating within-laboratory uncertainty using control chart techniques from repeated test results on a control sample.
In many cases this would be a natural extension of current quality control practices within a laboratory. Thus, many of the possible variables associated with running a test method will naturally be encountered. Where it may be desirable to ensure that additional variables are considered, systematic changes to the operating conditions may be incorporated. This will further ensure that the estimate of uncertainty describes the potential variation of the method in that laboratory.
This estimate should be equivalent to the intermediate precision of analytical chemistry test methods. The standard should also provide methodology for the ongoing monitoring and evaluation of the uncertainty estimates using a modification of the standard control chart.
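As a rough illustration of the control chart idea, the sketch below estimates within-laboratory variation from repeated results on a control sample using a conventional Shewhart individuals chart with moving ranges. This is a generic textbook technique chosen for illustration, not the procedure in WK3561, whose details are not given here. The control sample results are hypothetical; the constant 1.128 is the standard d2 factor for ranges of two observations.

```python
import statistics

D2_FOR_N2 = 1.128  # standard control chart constant d2 for ranges of size 2


def individuals_chart(results):
    """Return (center, sigma, lower_limit, upper_limit) for an
    individuals chart, estimating sigma from the average moving range."""
    center = statistics.mean(results)
    moving_ranges = [abs(b - a) for a, b in zip(results, results[1:])]
    sigma = statistics.mean(moving_ranges) / D2_FOR_N2
    return center, sigma, center - 3 * sigma, center + 3 * sigma


def out_of_control(results, lower, upper):
    """Indices of results falling outside the 3-sigma control limits."""
    return [i for i, x in enumerate(results) if x < lower or x > upper]


# Hypothetical results on one control sample, collected over time in a
# single laboratory (different days, operators, and so on).
control_results = [50.2, 49.8, 50.1, 50.4, 49.9, 50.0, 50.3, 49.7]

center, sigma, lcl, ucl = individuals_chart(control_results)
flagged = out_of_control(control_results, lcl, ucl)

print(f"center = {center:.3f}, sigma = {sigma:.3f}")
print(f"control limits = [{lcl:.3f}, {ucl:.3f}], flagged = {flagged}")
```

If the process is in statistical control, as it is for these made-up data, the sigma value can serve as a starting estimate of that laboratory's long-term standard deviation for the method. New control sample results plotted against the same limits then give the ongoing monitoring described above.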
Finally, Committee E11 will need to provide important education and assistance in implementing the practices once they are adopted. This is a tall order, considering the extremely diverse nature of ASTM test methods. We also recognize that other approaches should be examined and developed. E11 remains committed to these efforts and invites all to participate in our work.
1 Churchill Eisenhart, “Realistic Evaluation of the Precision and Accuracy of Instrument Calibration Systems,” NBS Special Publication 300.
2 Churchill Eisenhart, “Expression of the Uncertainties of Final Results,” NBS Special Publication 300.
3 W.J. Youden, “Uncertainties in Calibration,” NBS Special Publication 300.
4 Harry Ku, “Expressions of Imprecision, Systematic Error, and Uncertainty Associated with a Reported Value,” NBS Special Publication 300.
5 M.G. Natrella, Experimental Statistics, NBS Handbook 91, Chapter 23, “Expression of Uncertainties of Final Results.”
6 ISO 17025, Section 5.4.6.2:
“In certain cases the nature of the test method may preclude rigorous, metrologically and statistically valid, calculation of uncertainty of measurement. In these cases the laboratory shall at least attempt to identify all the components of uncertainty and make a reasonable estimation, and shall ensure that the form of reporting of the result does not give a wrong impression of the uncertainty. Reasonable estimation shall be based on knowledge of the performance of the method and on the measurement scope and shall make use of, for example, previous experience and validation data.”
7 Circular Letter 720