EnvStats (version 2.3.1)

tolIntNormK: Compute the Value of \(K\) for a Tolerance Interval for a Normal Distribution

Description

Compute the value of \(K\) (the multiplier of estimated standard deviation) used to construct a tolerance interval based on data from a normal distribution.

Usage

tolIntNormK(n, df = n - 1, coverage = 0.95, cov.type = "content", 
    ti.type = "two-sided", conf.level = 0.95, method = "exact", 
    rel.tol = 1e-07, abs.tol = rel.tol)

Arguments

n

a positive integer greater than 2 indicating the sample size upon which the tolerance interval is based.

df

the degrees of freedom associated with the tolerance interval. The default is df=n-1.

coverage

a scalar between 0 and 1 indicating the desired coverage of the tolerance interval. The default value is coverage=0.95.

cov.type

character string specifying the coverage type for the tolerance interval. The possible values are "content" (\(\beta\)-content; the default), and "expectation" (\(\beta\)-expectation). See the help file for tolIntNorm for more information on the difference between \(\beta\)-content and \(\beta\)-expectation tolerance intervals.

ti.type

character string indicating what kind of tolerance interval to compute. The possible values are "two-sided" (the default), "lower", and "upper".

conf.level

a scalar between 0 and 1 indicating the confidence level associated with the tolerance interval. The default value is conf.level=0.95.

method

for the case of a two-sided tolerance interval, a character string specifying the method for constructing the tolerance interval. This argument is ignored if ti.type="lower" or ti.type="upper". The possible values are "exact" (the default) and "wald.wolfowitz" (the Wald-Wolfowitz approximation). See the DETAILS section for more information.

rel.tol

in the case when ti.type="two-sided" and method="exact", the argument rel.tol is passed to the function integrate. The default value is rel.tol=1e-07.

abs.tol

in the case when ti.type="two-sided" and method="exact", the argument abs.tol is passed to the function integrate. The default value is the value of rel.tol.

Value

The value of \(K\), a numeric scalar used to construct tolerance intervals for a normal (Gaussian) distribution.

Details

A tolerance interval for some population is an interval on the real line constructed so as to contain \(100 \beta \%\) of the population (i.e., \(100 \beta \%\) of all future observations), where \(0 < \beta < 1\). The quantity \(100 \beta \%\) is called the coverage.

There are two kinds of tolerance intervals (Guttman, 1970):

  • A \(\beta\)-content tolerance interval with confidence level \(100(1-\alpha)\%\) is constructed so that it contains at least \(100 \beta \%\) of the population (i.e., the coverage is at least \(100 \beta \%\)) with probability \(100(1-\alpha)\%\), where \(0 < \alpha < 1\). The quantity \(100(1-\alpha)\%\) is called the confidence level or confidence coefficient associated with the tolerance interval.

  • A \(\beta\)-expectation tolerance interval is constructed so that the average coverage of the interval is \(100 \beta \%\).

Note: A \(\beta\)-expectation tolerance interval with coverage \(100 \beta \%\) is equivalent to a prediction interval for one future observation with associated confidence level \(100 \beta \%\). There is no explicit confidence level associated with a \(\beta\)-expectation tolerance interval. If a \(\beta\)-expectation tolerance interval is treated as a \(\beta\)-content tolerance interval, the confidence level associated with it is usually around 50% (e.g., Guttman, 1970, Table 4.2, p.76).

For a normal distribution, the form of a two-sided \(100(1-\alpha)\%\) tolerance interval is: $$[\bar{x} - Ks, \, \bar{x} + Ks]$$ where \(\bar{x}\) denotes the sample mean, \(s\) denotes the sample standard deviation, and \(K\) denotes a constant that depends on the sample size \(n\), the coverage, and, for a \(\beta\)-content tolerance interval (but not a \(\beta\)-expectation tolerance interval), the confidence level.

Similarly, the form of a one-sided lower tolerance interval is: $$[\bar{x} - Ks, \, \infty)$$ and the form of a one-sided upper tolerance interval is: $$(-\infty, \, \bar{x} + Ks]$$ but \(K\) differs for one-sided versus two-sided tolerance intervals.
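As a minimal sketch of how \(K\) is used, the following base R code builds a two-sided interval from simulated data. The data are made up for illustration, and K is hard-coded to the value reported in the Examples section for n = 20, 95% coverage, and 95% confidence; in practice it would come from tolIntNormK.

```r
# Simulated sample of n = 20 observations (illustrative data only).
set.seed(47)
x <- rnorm(20, mean = 10, sd = 2)

# K for a two-sided beta-content interval with n = 20, coverage = 0.95,
# conf.level = 0.95, copied from the Examples section of this help file.
K <- 2.760346

x.bar <- mean(x)  # sample mean
s <- sd(x)        # sample standard deviation

# Two-sided tolerance interval [x.bar - K * s, x.bar + K * s].
tol.limits <- c(lower = x.bar - K * s, upper = x.bar + K * s)
tol.limits
```

A one-sided limit uses the same arithmetic with only one of the two endpoints and the one-sided value of \(K\).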

The Derivation of \(K\) for a \(\beta\)-Content Tolerance Interval

One-Sided Case

When ti.type="upper" or ti.type="lower", the constant \(K\) for a \(100 \beta \%\) \(\beta\)-content tolerance interval with associated confidence level \(100(1 - \alpha)\%\) is given by: $$K = t(n-1, 1 - \alpha, z_\beta \sqrt{n}) / \sqrt{n}$$ where \(t(\nu, p, \delta)\) denotes the \(p\)'th quantile of a non-central t-distribution with \(\nu\) degrees of freedom and noncentrality parameter \(\delta\) (see the help file for TDist), and \(z_p\) denotes the \(p\)'th quantile of a standard normal distribution.
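The one-sided formula can be evaluated directly with base R quantile functions. The sketch below is illustrative only: one.sided.K is a hypothetical helper name, not an EnvStats function, and it assumes the default df = n - 1.

```r
# One-sided beta-content K from a non-central t quantile.
# one.sided.K is a hypothetical helper, not part of EnvStats.
one.sided.K <- function(n, coverage = 0.95, conf.level = 0.95) {
  # Noncentrality parameter: z_beta * sqrt(n).
  delta <- qnorm(coverage) * sqrt(n)
  # (1 - alpha) quantile of the non-central t with n - 1 df, scaled by sqrt(n).
  qt(conf.level, df = n - 1, ncp = delta) / sqrt(n)
}

k.upper.99.90 <- one.sided.K(n = 20, coverage = 0.99, conf.level = 0.9)
k.upper.95.95 <- one.sided.K(n = 8)
c(k.upper.99.90, k.upper.95.95)
```

These two calls use the same settings as the one-sided calls in the Examples section, so the results should agree with the values printed there (3.051543 and 3.187294) to the displayed precision.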

Two-Sided Case

When ti.type="two-sided" and method="exact", the exact formula for the constant \(K\) for a \(100 \beta \%\) \(\beta\)-content tolerance interval with associated confidence level \(100(1-\alpha)\%\) requires numerical integration and has been derived by several different authors, including Odeh (1978), Eberhardt et al. (1989), Jilek (1988), Fujino (1989), and Janiga and Miklos (2001). Specifically, for given values of the sample size \(n\), degrees of freedom \(\nu\), confidence level \((1-\alpha)\), and coverage \(\beta\), the constant \(K\) is the solution to the equation: $$\sqrt{\frac{n}{2 \pi}} \, \int^\infty_{-\infty} {F(x, K, \nu, R) \, e^{(-nx^2)/2}} \, dx = 1 - \alpha$$ where \(F(x, K, \nu, R)\) denotes the upper-tail area from \((\nu \, R^2) / K^2\) to \(\infty\) of the chi-squared distribution with \(\nu\) degrees of freedom, and \(R\) is, for each value of \(x\), the solution to the equation: $$\Phi (x + R) - \Phi (x - R) = \beta$$ where \(\Phi()\) denotes the standard normal cumulative distribution function.
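The equation above can be solved numerically with nested root-finding and quadrature. The following is a rough base R sketch under stated assumptions, not the EnvStats implementation: exact.K is a hypothetical helper name, the infinite integration range is truncated where the Gaussian weight is negligible, and the search bracket for \(K\) is chosen arbitrarily.

```r
# exact.K is a hypothetical helper, not part of EnvStats.
exact.K <- function(n, df = n - 1, coverage = 0.95, conf.level = 0.95) {
  # R(x): solves pnorm(x + R) - pnorm(x - R) = coverage for one value of x.
  R.of.x <- function(x) {
    uniroot(function(R) pnorm(x + R) - pnorm(x - R) - coverage,
            lower = 0, upper = abs(x) + 10, tol = 1e-10)$root
  }

  # Left-hand side of the defining equation as a function of K.
  lhs <- function(K) {
    integrand <- function(x) {
      R <- vapply(x, R.of.x, numeric(1))
      # F(x, K, nu, R): upper-tail chi-squared area beyond nu * R^2 / K^2.
      pchisq(df * R^2 / K^2, df = df, lower.tail = FALSE) *
        exp(-n * x^2 / 2)
    }
    # Assumption: truncate the range at +/- 8 / sqrt(n), beyond which
    # the weight exp(-n * x^2 / 2) is negligible.
    lim <- 8 / sqrt(n)
    sqrt(n / (2 * pi)) * integrate(integrand, -lim, lim, rel.tol = 1e-7)$value
  }

  # Assumption: K lies between 0.1 and 50 for the settings of interest.
  uniroot(function(K) lhs(K) - conf.level,
          lower = 0.1, upper = 50, tol = 1e-8)$root
}

k.exact <- exact.K(n = 20)
k.exact
```

For n = 20 with the default coverage and confidence level, the result should be close to the value 2.760346 shown for the exact method in the Examples section; small differences can arise from the tolerances chosen here.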

When ti.type="two-sided" and method="wald.wolfowitz", the approximate formula due to Wald and Wolfowitz (1946) for the constant \(K\) for a \(100 \beta \%\) \(\beta\)-content tolerance interval with associated confidence level \(100(1-\alpha)\%\) is given by: $$K \approx r \, u$$ where \(r\) is the solution to the equation: $$\Phi (\frac{1}{\sqrt{n}} + r) - \Phi (\frac{1}{\sqrt{n}} - r) = \beta$$ \(\Phi ()\) denotes the standard normal cumulative distribution function, and \(u\) is given by: $$u = \sqrt{\frac{n-1}{\chi^{2} (n-1, \alpha)}}$$ where \(\chi^{2} (\nu, p)\) denotes the \(p\)'th quantile of the chi-squared distribution with \(\nu\) degrees of freedom.
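The Wald-Wolfowitz approximation needs only a one-dimensional root search and a chi-squared quantile. A minimal base R sketch follows; ww.K is a hypothetical helper name rather than an EnvStats function, and the bracket [0, 10] for \(r\) is an assumption that holds for typical coverage values.

```r
# ww.K is a hypothetical helper, not part of EnvStats.
ww.K <- function(n, coverage = 0.95, conf.level = 0.95) {
  alpha <- 1 - conf.level
  shift <- 1 / sqrt(n)

  # r solves pnorm(1/sqrt(n) + r) - pnorm(1/sqrt(n) - r) = coverage.
  r <- uniroot(function(r) pnorm(shift + r) - pnorm(shift - r) - coverage,
               lower = 0, upper = 10, tol = 1e-10)$root

  # u = sqrt((n - 1) / alpha-quantile of chi-squared with n - 1 df).
  u <- sqrt((n - 1) / qchisq(alpha, df = n - 1))

  r * u
}

k.ww <- ww.K(n = 20)
k.ww
```

With n = 20 and the defaults, this should reproduce the approximate value 2.751789 from the Examples section to the displayed precision.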

The Derivation of \(K\) for a \(\beta\)-Expectation Tolerance Interval

As stated above, a \(\beta\)-expectation tolerance interval with coverage \(100 \beta \%\) is equivalent to a prediction interval for one future observation with associated confidence level \(100 \beta \%\). This is because the probability that any single future observation will fall into this interval is \(100 \beta \%\), so the number of observations, out of \(N\) future observations, that will fall into this interval follows a binomial distribution with parameters size = \(N\) and prob = \(\beta\) (see the help file for Binomial). Hence the expected proportion of future observations that will fall into this interval is \(100 \beta \%\) and is independent of the value of \(N\). See the help file for predIntNormK for information on how to derive \(K\) for these intervals.
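For orientation, the standard normal-theory multiplier for a prediction interval for a single future observation is a Student's t quantile times \(\sqrt{1 + 1/n}\). That formula is not stated in this help file (it is deferred to predIntNormK), so the sketch below is an assumption-labeled illustration; expectation.K is a hypothetical helper name, not an EnvStats function.

```r
# expectation.K is a hypothetical helper, not part of EnvStats.
# Assumes the usual prediction-interval multiplier for one future
# observation from a normal sample: t quantile * sqrt(1 + 1/n).
expectation.K <- function(n, df = n - 1, coverage = 0.95, two.sided = TRUE) {
  # Two-sided intervals split the non-coverage equally between the tails.
  p <- if (two.sided) (1 + coverage) / 2 else coverage
  qt(p, df = df) * sqrt(1 + 1 / n)
}

k.exp.two <- expectation.K(n = 20)
k.exp.one <- expectation.K(n = 20, two.sided = FALSE)
c(two.sided = k.exp.two, one.sided = k.exp.one)
```

No confidence level enters this calculation, consistent with the remark that \(K\) for a \(\beta\)-expectation interval depends only on the sample size and the coverage.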

References

Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton.

Draper, N., and H. Smith. (1998). Applied Regression Analysis. Third Edition. John Wiley and Sons, New York.

Eberhardt, K.R., R.W. Mee, and C.P. Reeve. (1989). Computing Factors for Exact Two-Sided Tolerance Limits for a Normal Distribution. Communications in Statistics, Part B-Simulation and Computation 18, 397-413.

Ellison, B.E. (1964). On Two-Sided Tolerance Intervals for a Normal Distribution. Annals of Mathematical Statistics 35, 762-772.

Fujino, T. (1989). Exact Two-Sided Tolerance Limits for a Normal Distribution. Japanese Journal of Applied Statistics 18, 29-36.

Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.

Gilbert, R.O. (1987). Statistical Methods for Environmental Pollution Monitoring. Van Nostrand Reinhold, New York.

Guttman, I. (1970). Statistical Tolerance Regions: Classical and Bayesian. Hafner Publishing Co., Darien, CT.

Hahn, G.J. (1970b). Statistical Intervals for a Normal Population, Part I: Tables, Examples and Applications. Journal of Quality Technology 2(3), 115-125.

Hahn, G.J. (1970c). Statistical Intervals for a Normal Population, Part II: Formulas, Assumptions, Some Derivations. Journal of Quality Technology 2(4), 195-206.

Hahn, G.J., and W.Q. Meeker. (1991). Statistical Intervals: A Guide for Practitioners. John Wiley and Sons, New York.

Janiga, I., and R. Miklos. (2001). Statistical Tolerance Intervals for a Normal Distribution. Measurement Science Review 11, 29-32.

Jilek, M. (1988). Statisticke Tolerancni Meze. SNTL, Praha.

Krishnamoorthy, K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.

Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton.

Odeh, R.E. (1978). Tables of Two-Sided Tolerance Factors for a Normal Distribution. Communications in Statistics, Part B-Simulation and Computation 7, 183-201.

Odeh, R.E., and D.B. Owen. (1980). Tables for Normal Tolerance Limits, Sampling Plans, and Screening. Marcel Dekker, New York.

Owen, D.B. (1962). Handbook of Statistical Tables. Addison-Wesley, Reading, MA.

Singh, A., R. Maichle, and N. Armbya. (2010a). ProUCL Version 4.1.00 User Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery, Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.

USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.

Wald, A., and J. Wolfowitz. (1946). Tolerance Limits for a Normal Distribution. Annals of Mathematical Statistics 17, 208-215.

See Also

tolIntNorm, predIntNorm, Normal, estimate.object, enorm, eqnorm, Tolerance Intervals, Prediction Intervals, Estimating Distribution Parameters, Estimating Distribution Quantiles.

Examples

# NOT RUN {
  # Compute the value of K for a two-sided 95% beta-content 
  # tolerance interval with associated confidence level 95% 
  # given a sample size of n=20.

  #----------
  # Exact method

  tolIntNormK(n = 20)
  #[1] 2.760346

  #----------
  # Approximate method due to Wald and Wolfowitz (1946)

  tolIntNormK(n = 20, method = "wald.wolfowitz")
  #[1] 2.751789


  #--------------------------------------------------------------------

  # Compute the value of K for a one-sided upper tolerance limit 
  # with 99% coverage and associated confidence level 90% 
  # given a sample size of n=20.

  tolIntNormK(n = 20, ti.type = "upper", coverage = 0.99, 
    conf.level = 0.9)
  #[1] 3.051543

  #--------------------------------------------------------------------

  # Example 17-3 of USEPA (2009, p. 17-17) shows how to construct a 
  # beta-content upper tolerance limit with 95% coverage and 95% 
  # confidence using chrysene data and assuming a lognormal 
  # distribution.  The sample size is n = 8 observations from 
  # the two compliance wells.  Here we will compute the 
  # multiplier for the log-transformed data.

  tolIntNormK(n = 8, ti.type = "upper")
  #[1] 3.187294
# }
