### Standard Practice for Calculating and Using Basic Statistics

Active Standard ASTM E2586 | Developed by Subcommittee: E11.10

Book of Standards Volume: 14.05


Abstract

This practice covers methods and equations for computing and presenting basic statistics. This practice includes simple descriptive statistics for variable and attribute data, elementary methods of statistical inference, and tabular and graphical methods for variable data. Some interpretation and guidance for use is also included.

This practice provides approaches for characterizing a sample of n observations that arrive in the form of a data set. Large data sets from organizations, businesses, and governmental agencies exist in the form of records and other empirical observations. Research institutions and laboratories at universities, government agencies, and the private sector also generate considerable amounts of empirical data.

This abstract is a brief summary of the referenced standard. It is informational only and not an official part of the standard; the full text of the standard itself must be referred to for its use and application. ASTM does not give any warranty express or implied or make any representation that the contents of this abstract are accurate, complete or up to date.

Significance and Use

4.1 This practice provides approaches for characterizing a sample of n observations that arrive in the form of a data set. Large data sets from organizations, businesses, and governmental agencies exist in the form of records and other empirical observations. Research institutions and laboratories at universities, government agencies, and the private sector also generate considerable amounts of empirical data.

4.1.1 A data set containing a single variable usually consists of a column of numbers. Each row is a separate observation or instance of measurement of the variable. The numbers themselves are the result of applying the measurement process to the variable being studied or observed. We may refer to each observation of a variable as an item in the data set. In many situations, there may be several variables defined for study.

4.1.2 The sample is selected from a larger set called the population. The population can be a finite set of items, a very large or essentially unlimited set of items, or a process. In a process, the items originate over time and the population is dynamic, continuing to emerge and possibly change over time. Sample data serve as representatives of the population from which the sample originates. It is the population that is of primary interest in any particular study.

4.2 The data (measurements and observations) may be of the variable type or the simple attribute type. In the case of attributes, the data may be either binary trials or a count of a defined event over some interval (time, space, volume, weight, or area). Binary trials consist of a sequence of 0s and 1s in which a “1” indicates that the inspected item exhibited the attribute being studied and a “0” indicates the item did not exhibit the attribute. Each inspection item is assigned either a “0” or a “1.” Such data are often governed by the binomial distribution. For a count of events over some interval, the number of times the event is observed on the inspection interval is recorded for each of n inspection intervals. The Poisson distribution often governs counting events over an interval.
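
As a minimal sketch of the two attribute-data forms described in 4.2, the following Python snippet summarizes a sequence of binary trials and a set of interval counts. The data values and variable names are hypothetical, chosen only for illustration, and are not taken from the standard.

```python
from statistics import mean

# Hypothetical binary trials: 1 = item exhibited the attribute, 0 = it did not.
binary_trials = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]

n = len(binary_trials)
count_with_attribute = sum(binary_trials)
# The sample proportion estimates the binomial parameter p.
sample_proportion = count_with_attribute / n

# Hypothetical event counts, one per inspection interval.
event_counts = [2, 0, 1, 3, 1, 0, 2, 1]
# The mean count per interval estimates the Poisson mean.
mean_count = mean(event_counts)

print(f"proportion with attribute: {sample_proportion:.2f}")
print(f"mean events per interval:  {mean_count:.2f}")
```

With these illustrative values, the sample proportion is 3/10 = 0.30 and the mean count is 10/8 = 1.25 events per interval.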

4.3 For sample data to be used to draw conclusions about the population, the process of sampling and data collection must be considered, at least potentially, repeatable. Descriptive statistics are calculated using real sample data that will vary in repeating the sampling process. As such, a statistic is a random variable subject to variation in its own right. The sample statistic usually has a corresponding parameter in the population that is unknown (see Section 5). The point of using a statistic is to summarize the data set and estimate a corresponding population characteristic or parameter, or to test a hypothesis.
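
The idea that a statistic varies from sample to sample can be illustrated with a small simulation. This is only a sketch: it assumes a normal population with mean 50 and standard deviation 2 (arbitrary values), and the helper name `draw_sample` is hypothetical.

```python
import random
from statistics import mean

random.seed(12345)  # fixed seed so the illustration is repeatable

def draw_sample(n):
    """Draw n observations from an assumed normal population (mean 50, sd 2)."""
    return [random.gauss(50.0, 2.0) for _ in range(n)]

# Repeat the sampling process five times and compute the sample mean each time.
sample_means = [mean(draw_sample(10)) for _ in range(5)]

for i, m in enumerate(sample_means, start=1):
    print(f"sample {i}: mean = {m:.3f}")
```

Each repetition yields a different sample mean, even though the population parameter (50 in this sketch) is fixed.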

4.4 Descriptive statistics consider numerical, tabular, and graphical methods for summarizing a set of data. The methods considered in this practice are used for summarizing the observations from a single variable. The descriptive statistics described in this practice are: mean, median, min, max, range, mid range, order statistic, quartile, empirical percentile, quantile, interquartile range, variance, standard deviation, Z-score, coefficient of variation, and skewness and kurtosis.
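
Several of the statistics listed in 4.4 can be computed with Python's standard `statistics` module, as sketched below. The data are hypothetical. The quartiles use the module's default ("exclusive") interpolation method, which is an assumption here and may not match the quantile definition given in the full text of the standard. Skewness and kurtosis are omitted because their formulas vary by convention.

```python
import statistics

# Hypothetical data set of n = 8 observations of a single variable.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

mean = statistics.mean(data)
median = statistics.median(data)
minimum, maximum = min(data), max(data)
data_range = maximum - minimum
midrange = (minimum + maximum) / 2

variance = statistics.variance(data)  # sample variance, divisor n - 1
std_dev = statistics.stdev(data)      # sample standard deviation
coeff_of_variation = std_dev / mean

# Z-score: distance of each observation from the mean, in standard deviations.
z_scores = [(x - mean) / std_dev for x in data]

# Quartiles via the module's default method (an assumption, see lead-in).
q1, q2, q3 = statistics.quantiles(data, n=4)
interquartile_range = q3 - q1

print(f"mean={mean}, median={median}, range={data_range}, midrange={midrange}")
print(f"variance={variance:.4f}, sd={std_dev:.4f}, CV={coeff_of_variation:.4f}")
```

For this data set the mean is 5.0, the median is 4.5, the range is 7.0, and the midrange is 5.5.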

4.5 Statistical inference is drawing conclusions about the population or its parameters. Methods for statistical inference described in this practice are: degrees of freedom, standard error, confidence intervals, prediction intervals, tolerance intervals, and statistical hypothesis tests.
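
As an illustration of degrees of freedom, standard error, and a confidence interval, the sketch below computes a two-sided 95 % confidence interval for a population mean. It assumes independent observations from an approximately normal population, uses hypothetical data, and hard-codes the Student's t critical value for 7 degrees of freedom (about 2.365, from a standard t table); a different sample size would need a different critical value.

```python
import math
import statistics

# Hypothetical data set of n = 8 observations.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)

sample_mean = statistics.mean(data)
sample_sd = statistics.stdev(data)

degrees_of_freedom = n - 1
standard_error = sample_sd / math.sqrt(n)  # standard error of the mean

# Two-sided 95 % critical value of Student's t for 7 degrees of freedom,
# taken from a t table (valid only for this sample size).
T_CRIT_95_DF7 = 2.365

half_width = T_CRIT_95_DF7 * standard_error
ci_lower = sample_mean - half_width
ci_upper = sample_mean + half_width

print(f"standard error = {standard_error:.4f}")
print(f"95% CI for the mean: ({ci_lower:.3f}, {ci_upper:.3f})")
```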

4.6 Tabular methods described in this practice are: frequency distribution, relative frequency distribution, cumulative frequency distribution, and cumulative relative frequency distribution.
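
The four tabular summaries named in 4.6 can be built from one another, as in the sketch below. For simplicity it tabulates distinct values of a small hypothetical data set; variable data would more commonly be grouped into class intervals first, which this sketch does not do.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data set of n = 8 observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

counts = Counter(data)
values = sorted(counts)

frequency = [counts[v] for v in values]
relative_frequency = [f / n for f in frequency]
cumulative_frequency = list(accumulate(frequency))
cumulative_relative_frequency = [c / n for c in cumulative_frequency]

print("value  freq  rel.freq  cum.freq  cum.rel.freq")
for row in zip(values, frequency, relative_frequency,
               cumulative_frequency, cumulative_relative_frequency):
    print("{:>5} {:>5} {:>9.3f} {:>9} {:>13.3f}".format(*row))
```

The last cumulative frequency equals n, and the last cumulative relative frequency equals 1.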

4.7 Graphical methods described in this practice are: histogram, ogive, boxplot, dotplot, normal probability plot, and q-q plot.
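
The coordinates behind a normal probability plot can be sketched without a plotting library. The plotting-position formula (i - 0.5)/n used below is one common convention and is an assumption here; the standard may specify a different one. The data are hypothetical.

```python
from statistics import NormalDist

# Hypothetical data set of n = 8 observations.
data = [5.0, 2.0, 9.0, 4.0, 7.0, 4.0, 5.0, 4.0]

ordered = sorted(data)
n = len(ordered)

# Plotting positions (i - 0.5) / n for i = 1..n (assumed convention).
plotting_positions = [(i - 0.5) / n for i in range(1, n + 1)]

# Standard normal quantiles corresponding to each plotting position.
normal_scores = [NormalDist().inv_cdf(p) for p in plotting_positions]

# Each point pairs a normal quantile with an ordered observation;
# a roughly straight pattern is consistent with normality.
points = list(zip(normal_scores, ordered))

for z, x in points:
    print(f"{z:>7.3f}  {x:>5.1f}")
```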

4.8 While the methods described in this practice may be used to summarize any set of observations, the results obtained by using them may be of little value from the standpoint of interpretation unless the data quality is acceptable and satisfies certain requirements. To be useful for inductive generalization, any sample of observations that is treated as a single group for presentation purposes must represent a series of measurements, all made under essentially the same test conditions, on a material or product, all of which have been produced under essentially the same conditions. When these criteria are met, we are minimizing the danger of mixing two or more distinctly different sets of data.

4.8.1 If a given collection of data consists of two or more samples collected under different test conditions or representing material produced under different conditions (that is, different populations), it should be considered as two or more separate subgroups of observations, each to be treated independently in a data analysis program. Merging of such subgroups, representing significantly different conditions, may lead to a presentation that will be of little practical value. Briefly, any sample of observations to which these methods are applied should be homogeneous or, in the case of a process, have originated from a process in a state of statistical control.
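
The effect of merging subgroups from different conditions can be seen in a small sketch. The two subgroups and their labels are hypothetical. Each is summarized separately, then compared with a merged summary.

```python
from statistics import mean, stdev

# Hypothetical observations collected under two different conditions.
subgroups = {
    "condition_A": [10.1, 9.9, 10.0, 10.2, 9.8],
    "condition_B": [12.0, 12.1, 11.9, 12.2, 11.8],
}

# Treat each subgroup independently: (mean, standard deviation).
summary = {name: (mean(obs), stdev(obs)) for name, obs in subgroups.items()}

# Merging mixes two populations and inflates the apparent variation.
merged = [x for obs in subgroups.values() for x in obs]
merged_summary = (mean(merged), stdev(merged))

for name, (m, s) in summary.items():
    print(f"{name}: mean = {m:.2f}, sd = {s:.3f}")
print(f"merged: mean = {merged_summary[0]:.2f}, sd = {merged_summary[1]:.3f}")
```

In this illustration the merged standard deviation is several times larger than that of either subgroup, and the merged mean describes neither condition.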

4.9 The methods developed in Sections 6, 7, 8, and 9 apply to the sample data. When, for example, the term “mean” is used, it refers to the sample mean, not the population mean, unless indicated otherwise. It is understood that there is a data set containing n observations. The data set may be denoted as:

$$x_1, x_2, x_3, \ldots, x_n$$

4.9.1 There is no order of magnitude implied by the subscript notation unless subscripts are contained in parentheses (see 6.7).
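
The distinction between plain and parenthesized subscripts can be sketched as follows: a plain index refers to the order in which observations were recorded, while a parenthesized index refers to position after sorting. The data and the helper name `order_statistic` are hypothetical.

```python
# Hypothetical observations in the order they were recorded.
data = [5.0, 2.0, 9.0, 4.0, 7.0, 4.0, 5.0, 4.0]

# Sorted copy: position k (1-based) holds the k-th smallest observation.
ordered = sorted(data)

def order_statistic(k):
    """Return the k-th smallest observation, for k = 1..n."""
    return ordered[k - 1]

print("first recorded observation:", data[0])
print("smallest observation:      ", order_statistic(1))
print("largest observation:       ", order_statistic(len(data)))
```

Here the first recorded observation is 5.0, whereas the first order statistic (the minimum) is 2.0.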

1. Scope

1.1 This practice covers methods and equations for computing and presenting basic statistics. This practice includes simple descriptive statistics for variable and attribute data, elementary methods of statistical inference, and tabular and graphical methods for variable data. Some interpretation and guidance for use is also included.

1.2 The system of units for this practice is not specified. Dimensional quantities in the practice are presented only as illustrations of calculation methods. The examples are not binding on products or test methods treated.

1.3 This standard does not purport to address all of the safety concerns, if any, associated with its use. It is the responsibility of the user of this standard to establish appropriate safety, health, and environmental practices and determine the applicability of regulatory limitations prior to use.

1.4 This international standard was developed in accordance with internationally recognized principles on standardization established in the Decision on Principles for the Development of International Standards, Guides and Recommendations issued by the World Trade Organization Technical Barriers to Trade (TBT) Committee.

2. Referenced Documents (purchase separately)

The documents listed below are referenced within the subject standard but are not provided as part of the standard.

ASTM Standards

E178 Practice for Dealing With Outlying Observations

E456 Terminology Relating to Quality and Statistics

E2234 Practice for Sampling a Stream of Product by Attributes Indexed by AQL

E2282 Guide for Defining the Test Result of a Test Method

E3080 Practice for Regression Analysis with a Single Predictor Variable

ISO Standards

ISO 3534-1 Statistics - Vocabulary and Symbols, Part 1: Probability and General Statistical Terms

ISO 3534-2 Statistics - Vocabulary and Symbols, Part 2: Applied Statistics

ICS Code

ICS Number Code 03.120.30 (Application of statistical methods)

##### Referencing This Standard

DOI: 10.1520/E2586-19E01

Citation Format

ASTM E2586-19e1, Standard Practice for Calculating and Using Basic Statistics, ASTM International, West Conshohocken, PA, 2019, www.astm.org