Improving Turbidity Measurement Reports: A Round Robin Study
ASTM International Committee D19 on Water has several active subcommittees that focus on turbidity. Recently, two new low-level turbidity methods have been completed that focus on unique measurement practices for the optimized measurement of up to five turbidity units. For high-level turbidity, three other task groups are currently developing new methods. The new and developing methods address turbidimetric applications, which include static (measurement of a captured sample), dynamic (sample delivered to an instrument for measurement) and in-situ (measurement in the process itself).
This article describes the round robin that was designed and successfully executed by ASTM Subcommittee D19.07. Unlike earlier round robin studies, this round robin was modified to bring all the “laboratories” to a common location for data collection and the testing of other procedures that were provided by the new high-level method. The article summarizes the uses of turbidity measurement, the round robin study and its modifications from earlier round robins, and what was learned from the data and the protocol.
What Is Turbidity and How Is It Used?
As defined by proposed standard WK2576, Test Method for the Determination of Turbidity Above 1 Turbidity Unit (TU) in Static Mode, turbidity is “an expression of the optical properties in a sample that causes light rays to be scattered and absorbed rather than transmitted in straight lines through the sample. Turbidity in water is caused by the presence of suspended and dissolved matter such as clay, silt, finely divided organic matter, plankton, other microscopic organisms, organic acids and dyes.” Depending on the scope and application of the turbidity measurement, absorbance due to color may or may not contribute to turbidity.
Turbidity is recognized and used as a key indicator of water quality. It has a broad application that spans environmental quality, public health and process control quality. In terms of public health, turbidity has been directly correlated to the risk of pathogen intrusion into public water supplies and has been a stringently regulated parameter. Historically, from a regulatory perspective, little has changed over the past several decades, though much has changed in the development of new technologies.
There are many types of turbidity technologies in use that can deliver different results for similar samples taken from the same source. The difference in results can be significant, depending on which technologies are compared. Sample characteristics such as size, shape, refractive index and concentration, among others, can impact detection sensitivity and the final results of a given technology. Because of these differences, particular interferences may be reduced or eliminated by changing to a different technology. Understanding which technologies can eliminate particular interferences and which cannot will become increasingly important. These concerns have historically subjected the turbidity parameter to many daunting questions, such as:
• How will one technology compare to another when measuring the same sample?
• What is the realistic operating range for a given technology?
• Which instrument detection angles are best suited for a given sample type?
• What effect does a particular color within a given sample have on the result?
• Do rapidly settling particles cause a bias in the true turbidity?
• What is the impact of dilution on a highly turbid/colored sample?
To address these problems in high-level turbidity measurement, manufacturers and policy makers should keep in mind that turbidity is a key indicator of water quality and that different technologies can deliver different results for similar samples. The technology used should therefore be reported along with the measurement value.
Many turbidity measurement technologies have been designed for applications outside the regulatory arena and have different technological features. The key features are light source, detector type, number of detectors, detector geometries, detector angles and sample path length. Changing any one or a combination of these features can deliver a different turbidimetric result. In the development of the high-level static method, ASTM and the U.S. Geological Survey worked together to develop a set of traceable reporting units for specific turbidity technologies, resulting in a new methodology.1 A specific reporting unit represents a type of technology. Table 1 contains the list of the established turbidity measurement technologies and respective reporting units.
In low-level turbidity round robins, closer agreement among technologies was achieved because the number of interferences was lower. In samples with high turbidities, where color and particulate absorbance can be significant, large differences in results from different technologies are observed. The study goals for this high-level round robin focused on three criteria: the identification of technologies that generally agreed with each other and those that deviated in measurement across a broad array of samples, the determination of the optimum range of measurement for each technology type, and the determination of the impact that sample preparation would have on the analytical results.
The draft ASTM test method undergoing the round robin study assigns turbidity reporting units that are categorized by different technological designs. The task group had the goal of incorporating all new designs that are commercially available, while allowing the use of historically acceptable designs, some of which form the basis of most regulated methods. These technologies and their reporting units can be found in Table 1. This reporting protocol has already found acceptance within the USGS and is now used as part of their technology traceability.
To be approved as an ASTM standard, this draft test method was required to undergo a round-robin study. However, a round robin as described in D 2777, Practice for Determination of Precision and Bias of Applicable Test Methods of Committee D19 on Water, could not be performed in the classical sense. This study had to be modified to accommodate samples from the field, which are known to be unstable over time. To achieve this, the round robin focused on the assembly of participants at a common location, aiding the reduction of variations caused by a lack of stability over time.
Most turbidimeter manufacturers from within the United States participated in this round robin. Many instruments that are compliant with the European Union standard for turbidity, ISO 7027, were also represented. This included technologies that represented infrared (830–880 nm) and tungsten filament (color temperature between 2,200 and 3,000 K) sources, ratio (multiple detectors for light scatter) and non-ratio (single detector for light scatter) detection methods, and detector technologies such as attenuation and backscatter. In addition, three in-situ laboratories participated. These laboratories qualified under the scope of this method because they could analyze samples that were captured and held in a static mode.
Data analysis for precision and bias utilized ASTM E 691, Practice for Conducting an Interlaboratory Study to Determine the Precision of a Test Method. Bias was determined using surrogate samples (calibration standards that were treated as unknowns), while precision was determined through triplicate sample analysis.
The success of this round robin depended upon having as many representative technologies as possible that could deliver data across the array of samples. This was achieved through defining a “laboratory” as the combination of an operator and an instrument using a specific technology resulting in a particular reporting unit (e.g., white light attenuation (AU), nephelometric near-infrared non-ratiometric (FNU), nephelometric non-ratiometric (NTU)). Under this definition, if an instrument has two measurement technologies, two laboratories resulted. This description of a laboratory allowed for 21 laboratories from seven manufacturers to qualify in this round robin. There were nine participants with FNU laboratories, followed by four NTU laboratories, three nephelometric near-IR turbidimeter ratiometric (FNRU) laboratories, and two ratio white light (NTRU) laboratories. There were also single laboratories for the AU, formazin attenuation (FAU) and formazin backscatter (FBU) technologies. Each laboratory was assigned responsibilities for their instrument based on the manufacturers’ directions for setup, calibration, verification, preparation, running and recording of results. Data from all the laboratories was entered into a common Microsoft Excel spreadsheet for analysis using the E 691 protocol.
Sample Preparation and Measurements
The samples that were run in this round robin are summarized in Table 2. The ASTM high-level static turbidity method provides a stringent protocol for sample preparation and measurement, which was adhered to in this round robin. Samples were mixed in one large container using a churn splitter (operated by one designated individual), which continuously maintained homogeneous sample suspension during its dispensation to all laboratories. Each laboratory received three dispensations for triplicate analysis from the splitter. All samples were run immediately after preparation, one at a time, across all laboratories. The laboratory operator’s focus was on consistency in preparation and measurement practices.
To help determine whether dilutions are linear, and for which technologies they perform best, selected samples were serially diluted with deionized water (which is free of interferences) and measured over a defined time interval. This also allowed the determination of the impact that rapidly settling particles have on the measurement over time. Together, these two experiments provided key information for the measurement of turbidity and demonstrated the power of the round robin to address the key concerns, noted earlier, that are typical in high-level turbidity measurement.
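The linearity check described above can be sketched as a small calculation: if a technology responds linearly, a sample diluted to a given fraction should read that same fraction of the original value. The helper below is a hypothetical illustration (the function name and example readings are not from the ASTM method):

```python
def dilution_recovery(measured_diluted, measured_original, fraction):
    """Ratio of the measured diluted reading to the reading expected
    from a perfectly linear response. A value of 1.0 indicates the
    dilution behaved linearly; deviations flag color or particle
    interference effects. Hypothetical helper for illustration only."""
    expected = measured_original * fraction  # linear expectation
    return measured_diluted / expected

# Example: a sample reading 4,000 units, diluted to 25 percent,
# that then reads 1,100 units recovers 10 percent above linear:
print(dilution_recovery(1100, 4000, 0.25))  # 1.1
```

In practice, such a recovery computed at each step of a serial dilution shows at what concentration a given technology begins to depart from linearity.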
The triplicate measurements from each sample were first averaged for each laboratory. The laboratory averages were then combined within each technology (defined as the set of data sharing the same reporting unit) to generate a determined value and standard deviation for that technology, and comparisons between the technologies were conducted. Different technology types from this round robin were compared to each other across commonly measured samples. Two examples of how results can vary based upon the technology used to derive the turbidity, and of the effects of dilution on a given environmental sample, are provided in Figures 1 and 2.
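The two-stage averaging described above can be sketched as follows. This is a minimal illustration, assuming a simple nested data layout; the laboratory names and readings are invented for the example and are not data from the study:

```python
from statistics import mean, stdev

# Hypothetical triplicate readings for one sample, grouped by
# reporting unit (technology) and then by laboratory
# (operator + instrument combination).
readings = {
    "FNU": {"lab_02": [870, 865, 872], "lab_07": [910, 905, 908]},
    "NTU": {"lab_11": [940, 938, 945]},
}

# Stage 1: average the triplicates for each laboratory.
lab_means = {
    tech: [mean(triplicate) for triplicate in labs.values()]
    for tech, labs in readings.items()
}

# Stage 2: combine laboratory means into a per-technology
# determined value and standard deviation (stdev requires at
# least two laboratories in the technology group).
for tech, values in lab_means.items():
    determined = mean(values)
    spread = stdev(values) if len(values) > 1 else 0.0
    print(f"{tech}: {determined:.1f} +/- {spread:.1f}")
```

The per-technology determined values produced this way are what the intertechnology comparisons in Figures 1 and 2 plot side by side.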
A USGS field sample from Kansas and the data derived from its dilutions can be found in Figure 1. The sample was high in clay content, which caused a reddish-brown color. This figure shows the turbidity derived from each technology for the original sample and two dilutions, at 100 percent (back row), 25 percent (middle row) and 4.2 percent (front row) of the original concentration.
A similar analysis can be found in Figure 2. This sample was from Alaskan glacial runoff high in finely divided solids that exhibited a dark grey color. The graph displays the concentrated sample at 100 percent (back row), a dilution to 50 percent (middle row) and a dilution to 25 percent (front row) of the original sample.
The Kansas and Alaska samples serve as examples of the typical deviations that were observed with many of the environmental samples. In general, deviations among some technologies were significant. For example, a 3 to 1 difference between like and different technologies was observed on some samples. In many cases, those laboratories that generated measurements with the greatest deviation from the intra-instrument mean were later found (unknown to the laboratory at the time of reading) to be outside of the recommended measurement range for the respective technology. The data showed that when measuring within the manufacturer's recommendations, each meter gives a consistent reading for a given sample and its replicates. It is therefore very important that the analyst understand which technologies will produce valid measurements.
In addition to generation of the precision and bias results to meet required ASTM method criteria, the data was further examined to deliver a more quantitative comparison across technologies and samples. One goal of the round robin was to determine which technologies were outliers and which were in conformance with respect to the overall determined turbidity, across all technologies and all samples. To do this, three criteria were established. These are “outlier result,” “moderately applicable result” and “best applicable result.” The definitions for each of these terms are given below.2
• An “outlier result” is defined as a laboratory result that deviates by more than 25 percent from the averaged reading, derived across all technologies on a given sample.
• A “moderately applicable result” is defined as one that deviates between 10 and 25 percent from the average reading, derived across all technologies. A technology that delivered a result in this range from the mean produces a value that is in the ballpark of the mean, but does not have a high level of confidence.
• A “best applicable result” is defined as one that is within 10 percent of the average reading across all technologies. The technologies that produce this result compared most favorably to the mean value of the samples and are not outliers.
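The three criteria above amount to a simple deviation-based classification against the all-technology mean. The sketch below illustrates the thresholds as stated in the article; the function name and the handling of exact boundary values (10 and 25 percent) are assumptions for illustration:

```python
def classify(result, grand_mean):
    """Classify one laboratory result against the mean derived
    across all technologies for a given sample, using the round
    robin's three deviation criteria. Boundary handling (exactly
    10 or 25 percent) is an assumption, not from the method."""
    deviation = abs(result - grand_mean) / grand_mean * 100  # percent
    if deviation > 25:
        return "outlier result"
    if deviation >= 10:
        return "moderately applicable result"
    return "best applicable result"

# Example: against a 1,000-unit all-technology mean,
print(classify(1300, 1000))  # outlier result (30 percent off)
print(classify(1150, 1000))  # moderately applicable result
print(classify(1050, 1000))  # best applicable result
```

Tallying these labels per technology across all 25 environmental samples yields the percentage summary plotted in Figure 3.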
For each of 25 environmental samples, these criteria were applied to each technology. The values could then be plotted in percentages for all samples that were run in this round robin. Figure 3 provides an encompassing summary of which technologies were outliers and which most consistently tracked the mean value that was generated from all technologies.
In comparing all technologies across all samples, it was found that:
• Multiple-detector technologies (independent of light source) were the most applicable across sample type and range;
• The single detector technologies that were not at a 90° geometry (AU, FAU and FBU) were the least comparable;
• With respect to light sources, measurements were found to compare favorably in the ratio technologies and less favorably in the single-detector technologies.
Turbidity standards that are typically used for the calibration of the instrumentation were run as unknowns in this round robin and defined as surrogates. These were used to confirm the operating ranges for the different technologies. These ranges were determined by a combination of manufacturer specifications and measurement accuracy on the surrogate standards. Within these ranges, the technologies were required to measure within 5 percent of the theoretical value of the surrogates tested. Table 3 provides a summary of the recommended ranges for each turbidimeter technology.
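The 5 percent acceptance rule for surrogates described above can be expressed as a one-line check. This is a hypothetical helper for illustration; only the 5 percent tolerance comes from the article:

```python
def within_surrogate_tolerance(measured, theoretical, tol_pct=5.0):
    """True if a surrogate (a calibration standard run as an
    unknown) reads within tol_pct percent of its theoretical
    value, the criterion used to confirm a technology's
    recommended operating range."""
    return abs(measured - theoretical) / theoretical * 100 <= tol_pct

# A 100-unit surrogate reading 104 passes; one reading 106 fails:
print(within_surrogate_tolerance(104, 100))  # True
print(within_surrogate_tolerance(106, 100))  # False
```

A turbidity value at which a technology begins to fail this check marks the edge of its recommended range, as summarized in Table 3.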
Many technologies extend outside the ranges that were derived from this round robin. The high-level static method will be modified to caution users to run dilution-based tests to determine if the technology can perform on their samples.
Variability Across Technologies
The real-world samples provided information that showed the variability of turbidity measurement as a function of turbidity. Two different comparisons are shown in Figure 4. The points in red reflect the intra-instrument comparability across all technologies prior to the removal of outlier technologies (± 25 percent of the mean). The green points reflect the intra-technology comparability after outlier removal. In nearly all cases, the outliers were restricted to those single-detector technologies that detect at an angle other than 90°.
In Figure 4, some points are marked with shadows or with an "x." These are the USGS quality control samples, which contained known quantities of sands and were expected to exhibit significant variability due to rapid settling characteristics. Contrary to expectation, these samples did not show any increased variability when compared to the other samples. The figure shows that the removal of outlier technologies improved comparability to better than 20 percent at turbidities below 1,000 units. Above 1,000 units, increased variability across technologies was observed.
Sample Stability Versus Time
Certain laboratories in this study (labs 2, 16 and 20) were capable of measuring within one second after a sample was placed in the measurement chamber of the instrument. These technologies (in-situ probes with rapid process measuring software) were able to display the initial decay in reading (caused by the rapid settling) that was missed by the other laboratories (traditional benchtop or portable instruments that require a wait time before performing the measurement on the sample). Figure 5 provides a turbidity versus time plot for the measurement of a sample that was high in sand content, an example of a high-density, rapidly settling interference.
As shown in Figure 5, 14 different laboratories participated in a short study to determine the impact of rapidly settling particles on the measurement. Rapid measurement technologies displayed the early decay in measurement over time, as large particles such as coarse silts and sands rapidly settled. Thus, those instruments with a delayed measurement response missed the early decay and generated a negatively biased result. This result explains why the USGS quality control samples did not illustrate the expected variation in Figure 4.
This round robin and its participants generated data that will ultimately serve to answer many of the questions that emerge when performing high-level turbidity measurements. The round robin was successful in generating relevant data to assist in the writing of the respective turbidity method. The samples used in this study were broad and varied, reflecting the broad scope of applications that turbidity measurement serves.
Conducting the round robin at a central location provided the opportunity to successfully run all the intended experiments that ultimately generated valuable data that was reflective of measurement technologies, analytical techniques, sample collection, sample preparation and reporting.
This round robin went beyond answering the questions mentioned earlier in this paper. Following are some of the lessons gleaned from this study:
• Formazin and stabilized formazin standards compared well across most technologies. The styrene-divinylbenzene standards are designed for individual models of turbidity meters and need to be matched with the correct instrument type, or errors will result.
• Different technologies can produce significantly different results from identical environmental samples. When sample turbidities increase over 1,000, the variation among technologies becomes more significant.
• Duplicate samples agree quite well within technologies; that is, results are repeatable when using the same technology. This also implies that if techniques are consistent, precision will be greatly improved.
• The ratio technologies fit most samples and can measure very low to very high levels of turbidity.
• Sample preparation and wait time to measurement can have an impact on data as rapid settling of particles can induce significant error. For rapid-settling samples, the first available measurement is likely the closest to the true turbidity.
The round robin confirmed what Subcommittee D19.07 intended when they developed the turbidity method with robust and stringent sample handling, preparation, measurement and reporting protocols. Ultimately, this new ASTM method will improve data quality in turbidity measurement if the following keys to successful measurement are applied:
• Understand the expected range of measurement prior to technology selection.
• Create traceable reporting units.
• Avoid diluting samples if possible, and thoroughly investigate if not.
• Practice consistency in sub-sample dispensation, preparation and measurement.
• Avoid measuring samples with rapidly settling materials using technologies that have slow response times.
The authors wish to express their gratitude to all those who participated in this round robin, provided samples and reviewed data. This includes the members of the ASTM D19 sub-committees that touch on turbidity measurement; the instrument manufacturers who supplied instruments, surrogate standards and personnel to participate in this study; the USGS field scientists who were instrumental in collecting many samples; and USGS laboratory personnel for the preparation of quality control samples and hosting the round robin study.
The combination of study protocol modifications; the assembly of a broad team of scientists, turbidity experts and manufacturers to a common location with a common purpose; and a wide range of samples reflecting much of the real-world turbidity challenges were keys to delivering a successful round robin study that will be useful for many years to come. //
1. United States Geological Survey (USGS), "National Field Manual for the Collection of Water Quality Data," http://www.usgs.gov/owq/FieldManual/Chapter6/6.7_contents.html
2. Sadar, M. J., and Glysson, G. D. (2006), “The Analysis of Turbidity Data, Establishing the Link Between Sample Characteristics and Measurement Technologies.” 2006 National Water Quality Monitoring Conference; San Jose, CA.