**Abstract**

This practice provides statistical methodology for conducting equivalence testing on numerical data from two sources to determine if their true means or variances differ by no more than predetermined limits. This standard provides guidance on experiments and statistical methods needed to demonstrate that the test results from a modified testing process are equivalent to those from the current testing process, where equivalence is defined as agreement within a prescribed limit, termed an equivalence limit.

**Significance and Use**

4.1 Laboratories conducting routine testing have a continuing need to make improvements in their testing processes. In these situations it must be demonstrated that any changes will neither cause an undesirable shift in the test results from the current testing process nor substantially affect a performance characteristic of the test method. This standard provides guidance on experiments and statistical methods needed to demonstrate that the test results from a modified testing process are equivalent to those from the current testing process, where equivalence is defined as agreement within a prescribed limit, termed an equivalence limit.

4.1.1 The equivalence limit, which represents a worst-case difference or ratio, is determined prior to the equivalence test and its value is usually set by consensus among subject-matter experts.

4.1.2 Examples of modifications to the testing process include, but are not limited to, the following:

*(1)* Changes to operating levels in the steps of the test method procedure,

*(2)* Installation of new instruments, apparatus, or sources of reagents and test materials,

*(3)* Evaluation of new personnel performing the testing, and

*(4)* Transfer of testing to a new location.

4.1.3 Examples of performance characteristics directly applicable to the test method include bias, precision, sensitivity, specificity, linearity, and range. Additional characteristics are test cost and elapsed time needed to conduct the test procedure.

4.2 Equivalence studies are performed by a designed experiment that generates test results from the modified and current testing procedures on the same types of materials that are routinely tested. The design of the experiment depends on the type of equivalence needed, as described in 4.2.1 through 4.2.4. Experiment design and execution for various objectives are discussed in Section 5.

4.2.1 Means equivalence is concerned with a potential shift in the mean test result in either direction due to a modification in the testing process. Test results are generated under repeatability conditions by the modified and current testing processes on the same material, and the difference in their mean test results is evaluated.
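As an illustration of the comparison described in 4.2.1, the following is a minimal sketch of a two one-sided tests (TOST) style check, expressed as a confidence interval on the difference in means. It is not the procedure text of this practice. The function name, the data, and the equivalence limit are hypothetical. The sketch assumes both processes share a common repeatability variance (pooled estimate) and that the caller supplies the one-sided Student's t critical value from a table, for example 1.812 for 95 % confidence with 10 degrees of freedom.

```python
import math
import statistics


def means_equivalence(modified, current, limit, t_crit):
    """Check means equivalence with a confidence interval on the mean difference.

    t_crit: upper one-sided Student's t quantile (from a table) for
            len(modified) + len(current) - 2 degrees of freedom.
    Returns (difference, lower bound, upper bound, equivalent?).
    """
    n1, n2 = len(modified), len(current)
    diff = statistics.mean(modified) - statistics.mean(current)
    # Assumption: equal repeatability variance in both processes (pooled).
    pooled_var = (
        (n1 - 1) * statistics.variance(modified)
        + (n2 - 1) * statistics.variance(current)
    ) / (n1 + n2 - 2)
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    lower = diff - t_crit * std_err
    upper = diff + t_crit * std_err
    # Equivalence is accepted only if the whole interval lies inside (-limit, +limit).
    equivalent = -limit < lower and upper < limit
    return diff, lower, upper, equivalent


# Hypothetical test results on one material, six per process.
modified_results = [10.2, 10.4, 10.1, 10.3, 10.5, 10.3]
current_results = [10.1, 10.3, 10.2, 10.4, 10.2, 10.0]

diff, lower, upper, ok = means_equivalence(
    modified_results, current_results, limit=0.3, t_crit=1.812
)
print(f"difference={diff:.3f}, interval=({lower:.3f}, {upper:.3f}), equivalent={ok}")
```

Using the one-sided 95 % quantile on each side corresponds to a 5 % risk of falsely declaring equivalence; a tighter limit with the same data can fail the check even though the observed difference is small.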

4.2.1.1 In situations where testing cannot be conducted under repeatability conditions, such as using in-line instrumentation, test results may be generated in pairs, one from the modified and one from the current testing process, and the mean of the differences between paired test results is evaluated.
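For the paired case in 4.2.1.1, a rough sketch of the same idea works on the within-pair differences, so that level changes between pairs cancel out. The function name, data, and limit below are hypothetical, and the one-sided Student's t critical value is assumed to come from a table (1.895 for 95 % confidence with 7 degrees of freedom).

```python
import math
import statistics


def paired_means_equivalence(pairs, limit, t_crit):
    """Check means equivalence from (modified, current) paired test results.

    t_crit: upper one-sided Student's t quantile (from a table) for
            len(pairs) - 1 degrees of freedom.
    Returns (mean difference, lower bound, upper bound, equivalent?).
    """
    diffs = [modified - current for modified, current in pairs]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean of the paired differences.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    lower = mean_diff - t_crit * std_err
    upper = mean_diff + t_crit * std_err
    equivalent = -limit < lower and upper < limit
    return mean_diff, lower, upper, equivalent


# Hypothetical in-line readings: the process level drifts from pair to pair.
paired_results = [
    (5.1, 5.0), (6.3, 6.3), (7.2, 7.0), (8.0, 8.1),
    (9.4, 9.2), (10.1, 10.0), (11.3, 11.2), (12.0, 12.1),
]

mean_diff, lower, upper, ok = paired_means_equivalence(
    paired_results, limit=0.25, t_crit=1.895
)
print(f"mean diff={mean_diff:.4f}, interval=({lower:.4f}, {upper:.4f}), equivalent={ok}")
```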

4.2.2 Slope equivalence evaluates the slope of the linear statistical relationship between the test results from the two testing procedures. If the slope is equivalent to the value one (1), then the two testing processes meet slope equivalence.
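One possible way to picture the slope comparison in 4.2.2 is a confidence interval on a fitted slope that must fall entirely within a band around one. The sketch below uses ordinary least squares, which assumes the results on the horizontal axis are essentially free of error; that choice, the function name, the data, and the form of the limit (a half-width around 1) are assumptions for illustration only. The t critical value is taken from a table (2.132 for 95 % one-sided confidence with 4 degrees of freedom).

```python
import math
import statistics


def slope_equivalence(current, modified, limit, t_crit):
    """Fit modified = intercept + slope * current and compare the slope with 1.

    t_crit: upper one-sided Student's t quantile (from a table) for
            len(current) - 2 degrees of freedom.
    Returns (slope, lower bound, upper bound, equivalent?).
    """
    n = len(current)
    x_bar = statistics.mean(current)
    y_bar = statistics.mean(modified)
    sxx = sum((x - x_bar) ** 2 for x in current)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(current, modified))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares gives the scatter about the fitted line.
    resid_ss = sum(
        (y - intercept - slope * x) ** 2 for x, y in zip(current, modified)
    )
    se_slope = math.sqrt(resid_ss / (n - 2) / sxx)
    lower = slope - t_crit * se_slope
    upper = slope + t_crit * se_slope
    equivalent = 1 - limit < lower and upper < 1 + limit
    return slope, lower, upper, equivalent


# Hypothetical results on six materials spanning the range of interest.
current_levels = [10, 20, 30, 40, 50, 60]
modified_levels = [10.3, 19.8, 30.4, 39.9, 50.5, 60.1]

slope, lower, upper, ok = slope_equivalence(
    current_levels, modified_levels, limit=0.05, t_crit=2.132
)
print(f"slope={slope:.4f}, interval=({lower:.4f}, {upper:.4f}), equivalent={ok}")
```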

4.2.3 Range equivalence evaluates the differences in means over a selected wider range of test results and the experiment uses materials that cover that range. The combination of slope equivalence and means equivalence defines range equivalence.

4.2.4 Non-inferiority is concerned with a difference only in the direction of an inferior outcome in a performance characteristic of the modified testing procedure versus the current testing procedure. Non-inferiority may involve the comparisons of means, standard deviations, or other statistical parameters.

4.2.4.1 Non-inferiority studies may involve trade-offs in performance characteristics between the modified and current procedures. For example, the modified process may be slightly inferior to the current process with respect to assay sensitivity or precision but may have offsetting advantages such as faster delivery of test results or lower testing costs.
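To make the one-directional nature of non-inferiority concrete, the sketch below compares precision: it computes a one-sided upper confidence bound on the ratio of standard deviations (modified over current) and accepts non-inferiority only if that bound is below the limit. This is an illustrative sketch, not the procedure of this practice. It assumes normally distributed results, the function name, data, and limit are hypothetical, and the F critical value is supplied from a table (3.179 for the 95th percentile with 9 and 9 degrees of freedom).

```python
import math
import statistics


def precision_non_inferiority(modified, current, limit, f_crit):
    """One-sided check that the modified process is not much less precise.

    f_crit: upper F quantile (from a table) with numerator degrees of freedom
            len(current) - 1 and denominator degrees of freedom len(modified) - 1.
    limit:  largest acceptable ratio of standard deviations (modified / current).
    Returns (observed ratio, upper confidence bound, non-inferior?).
    """
    ratio = statistics.stdev(modified) / statistics.stdev(current)
    # Upper bound on the true ratio; only the "worse" direction is tested.
    upper = ratio * math.sqrt(f_crit)
    return ratio, upper, upper < limit


# Hypothetical repeatability results, ten per process, on one material.
current_results = [50.1, 49.8, 50.3, 50.0, 49.9, 50.2, 49.7, 50.1, 50.0, 49.9]
modified_results = [50.2, 49.7, 50.3, 50.1, 49.8, 50.3, 49.6, 50.1, 50.0, 49.9]

ratio, upper, ok = precision_non_inferiority(
    modified_results, current_results, limit=2.5, f_crit=3.179
)
print(f"ratio={ratio:.3f}, upper bound={upper:.3f}, non-inferior={ok}")
```

Because only an upper bound is checked, a modified process that turns out to be more precise than the current one passes automatically, which is the practical difference from a two-sided equivalence test.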

4.3 Risk Management—Guidance is provided for determining the amount of data required to control the risks of making the wrong decision in accepting or rejecting equivalence (see 5.4 and Section A1.2).

4.3.1 The consumer’s risk is the risk of falsely declaring equivalence. The probability associated with this risk is directly controlled to a low level so that accepting equivalence gives a high degree of assurance that the true difference is less than the equivalence limit.

4.3.2 The producer’s risk is the risk of falsely rejecting equivalence. The probability associated with this risk is controlled by the amount of data generated by the experiment. If valid improvements are rejected by equivalence testing, this can lead to opportunity losses to the company and its laboratories (the producers) or cause unnecessary additional effort in improving the testing process.
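The link between the amount of data and the two risks can be sketched with a common normal-approximation formula for a two-sided means equivalence test. This is an assumption made for illustration and may differ from the calculations in 5.4 and Section A1.2. It treats the repeatability standard deviation as known, assumes the true difference is zero, and uses equal numbers of test results per process. The function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_process(sigma, limit, consumer_risk=0.05, producer_risk=0.10):
    """Approximate number of test results needed from each testing process.

    sigma:         assumed repeatability standard deviation (same for both processes)
    limit:         equivalence limit on the difference in means
    consumer_risk: probability of falsely declaring equivalence
    producer_risk: probability of falsely rejecting equivalence when the
                   true difference is zero
    """
    z_consumer = NormalDist().inv_cdf(1 - consumer_risk)
    # The producer's risk is split across the two one-sided tests.
    z_producer = NormalDist().inv_cdf(1 - producer_risk / 2)
    n = 2 * (sigma * (z_consumer + z_producer) / limit) ** 2
    return math.ceil(n)


# Hypothetical planning values: the limit is twice the standard deviation.
print(sample_size_per_process(sigma=0.15, limit=0.3))
# Doubling the standard deviation roughly quadruples the data requirement.
print(sample_size_per_process(sigma=0.30, limit=0.3))
```

The normal approximation tends to understate the requirement when the resulting sample size is small, so a result from this sketch is better read as a starting point than as a final design.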

**1. Scope**

1.1 This practice provides statistical methodology for conducting equivalence studies on numerical data from two sources of test results to determine if their true means, variances, or other parameters differ by no more than predetermined limits.

1.2 Applications include (1) equivalence studies for bias against an accepted reference value, (2) determining means equivalence of two test methods, test apparatus, instruments, reagent sources, or operators within a laboratory or equivalence of two laboratories in a method transfer, and (3) determining non-inferiority of a modified test procedure versus a current test procedure with respect to a performance characteristic.

1.3 The guidance in this standard applies to experiments conducted either on a single material at a given level of the test result or on multiple materials covering a selected range of test results.

1.4 Guidance is given for determining the amount of data required for an equivalence study. The control of risks associated with the equivalence decision is discussed.

1.5 The values stated in SI units are to be regarded as standard. No other units of measurement are included in this standard.

1.6 *This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.*

1.7 *This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.*

**2. Referenced Documents**

The documents listed below are referenced within the subject standard but are not provided as part of the standard.

**ASTM Standards**

E122 Practice for Calculating Sample Size to Estimate, With Specified Precision, the Average for a Characteristic of a Lot or Process

E177 Practice for Use of the Terms Precision and Bias in ASTM Test Methods

E456 Terminology Relating to Quality and Statistics

E2282 Guide for Defining the Test Result of a Test Method

E2586 Practice for Calculating and Using Basic Statistics

E3080 Practice for Regression Analysis with a Single Predictor Variable

**USP Standard**

**ICS Code**

ICS Number Code 03.120.30 (Application of statistical methods)

**DOI:** 10.1520/E2935-21

ASTM E2935-21, Standard Practice for Evaluating Equivalence of Two Testing Processes, ASTM International, West Conshohocken, PA, 2021, www.astm.org

