pnchisqAppr: (Approximate) Probabilities of Non-Central Chi-squared Distribution

Description

Compute (approximate) probabilities for the non-central chi-squared distribution.

The non-central chi-squared distribution with df $= n$ degrees of freedom and non-centrality parameter ncp $= λ$ has density $f (x) = f_{n, λ} (x) = e^{- λ / 2} \sum_{r = 0}^{\infty} \frac{(λ / 2)^{r}}{r!} f_{n + 2 r} (x)$ for $x \geq 0$ ; for more, see R's help page for pchisq.

R's own historical and current versions, but with more tuning parameters;

Historical relatively simple approximations listed in Johnson, Kotz, and Balakrishnan (1995):

Patnaik(1949)'s approximation to the non-central via central chi-squared. Is also the formula $26.4 .27$ in Abramowitz & Stegun, p.942. Johnson et al mention that the approximation error is $O (1 / \sqrt{(} λ))$ for $λ \to \infty$ .
Pearson(1959) is using 3 moments instead of 2 as Patnaik (to approximate via a central chi-squared), and therefore better than Patnaik for the right tail; further (in Johnson et al.), the approximation error is $O (1 / λ)$ for $λ \to \infty$ .
Abdel-Aty(1954)'s “first approximation” based on Wilson-Hilferty via Gaussian (pnorm) probabilities, is partly wrongly cited in Johnson et al., p.463, eq. $(29.61 a)$ .
Bol'shev and Kuznetzov (1963) concentrate on the case of small ncp $λ$ and provide an “approximation” via central chi-squared with the same degrees of freedom df, but a modified q (‘x’); the approximation has error $O (λ^{3})$ for $λ \to 0$ and is from Johnson et al., p.465, eq. $(29.62)$ and $(29.63)$ .
Sankaran(1959, 1963) proposes several further approximations base on Gaussian probabilities, according to Johnson et al., p.463. pnchisqSankaran_d() implements its formula $(29.61 d)$ .

pnchisq():: an R implementation of R's own C pnchisq_raw(), but almost only up to Feb.27, 2004, long before the log.p=TRUE addition there, including logspace arithmetic in April 2014, its finish on 2015-09-01. Currently for historical reference only.
pnchisqV():: a Vectorize()d pnchisq.
pnchisqRC():: R's C implementation as of Aug.2019; but with many more options. Currently extreme cases tend to hang on Winbuilder (?)
pnchisqIT:: ....
pnchisqTerms:: ....
pnchisqT93:: pure R implementations of approximations when both q and ncp are large, by Temme(1993), from Johnson et al., p.467, formulas $(29.71 a)$ , and $(29.71 b)$ , using auxiliary functions pnchisqT93a() and pnchisqT93b() respectively, with adapted formulas for the log.p=TRUE cases.
pnchisq_ss():: ....
ss:: ....
ss2:: ....
ss2.:: ....

Usage



pnchisq          (q, df, ncp = 0, lower.tail = TRUE, 
                  cutOffncp = 80, itSimple = 110, errmax = 1e-12, reltol = 1e-11,
                  maxit = 10* 10000, verbose = 0, xLrg.sigma = 5)
pnchisqV(x, ..., verbose = 0)
pnchisqRC        (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE,
                  no2nd.call = FALSE,
                  cutOffncp = 80, small.ncp.logspace = small.ncp.logspaceR2015,
                  itSimple = 110, errmax = 1e-12,
                  reltol = 8 * .Machine$double.eps, epsS = reltol/2, maxit = 1e6,
                  verbose = FALSE)
pnchisqAbdelAty  (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqBolKuz    (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPatnaik   (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPearson   (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqSankaran_d(q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisq_ss       (x, df, ncp = 0, lower.tail = TRUE, log.p = FALSE, i.max = 10000)
pnchisqTerms     (x, df, ncp,     lower.tail = TRUE, i.max = 1000)
pnchisqT93  (q, df, ncp, lower.tail = TRUE, log.p = FALSE, use.a = q > ncp)
pnchisqT93.a(q, df, ncp, lower.tail = TRUE, log.p = FALSE)
pnchisqT93.b(q, df, ncp, lower.tail = TRUE, log.p = FALSE)
ss   (x, df, ncp, i.max = 10000, useLv = !(expMin < -lambda && 1/lambda < expMax))
ss2  (x, df, ncp, i.max = 10000, eps = .Machine$double.eps)
ss2. (q, df, ncp = 0, errmax = 1e-12, reltol = 2 * .Machine$double.eps,
      maxit = 1e+05, eps = reltol, verbose = FALSE)

Value

ss(): returns a list with 3 components

s: the series
i1: location (in s[]) of the first change from 0 to positive.
max: (first) location of the maximal value in the series (i.e., which.max(s)).

Arguments

x: numeric vector (of ‘quantiles’, i.e., abscissa values).
q: number ( ‘quantile’, i.e., abscissa value.)
df: degrees of freedom $> 0$ , maybe non-integer.
ncp: non-centrality parameter $δ$ ; ....
lower.tail, log.p: logical, see, e.g., pchisq().

i.max: number of terms in evaluation ...
use.a: logical vector for Temme pnchisqT93*() formulas, indicating to use formula ‘a’ over ‘b’. The default is as recommended in the references, but they did not take into account log.p = TRUE situations.
cutOffncp: a positive number, the cutoff value for ncp...

itSimple: ...
errmax: absolute error tolerance.

reltol: convergence tolerance for relative error.

maxit: maximal number of iterations.

xLrg.sigma: positive number ...

no2nd.call: logical indicating if a 2nd call is made to the internal function ....

small.ncp.logspace: logical vector or function, indicating if the logspace computations for “small” ncp (defined to fulfill ncp < cutOffncp !).
epsS: small positive number, the convergence tolerance of the ‘simple’ iterations...

verbose: logical or integer specifying if or how much the algorithm progress should be monitored.
...: further arguments passed from pnchisqV() to pnchisq().
useLv: logical indicating if logarithmic scale should be used for $λ$ computations.
eps: convergence tolerance, a positive number.

Author

Martin Maechler, from May 1999; starting from a post to the S-news mailing list by Ranjan Maitra (@ math.umbc.edu) who showed a version of our pchisqAppr.0() thanking Jim Stapleton for providing it.

Details

pnchisq_ss(): uses si <- ss(x, df, ..) to get the series terms, and returns 2*dchisq(x, df = df +2) * sum(si$s).
ss(): computes the terms needed for the expansion used in pnchisq_ss().
ss2(): computes some simple “statistics” about ss(..).

References

Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995) Continuous Univariate Distributions Vol 2, 2nd ed.; Wiley; chapter 29 Noncentral $χ^{2}$ -Distributions; notably Section 8 Approximations, p.461 ff.

Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. https://en.wikipedia.org/wiki/Abramowitz_and_Stegun

Examples

Run this code

## set of quantiles to use :
qq <- c(.001, .005, .01, .05, (1:9)/10, 2^seq(0, 10, by= 0.5))
## Take "all interesting" pchisq-approximation from our pkg :
pkg <- "package:DPQ"
pnchNms <- c(paste0("pchisq", c("V", "W", "W.", "W.R")),
             ls(pkg, pattern = "^pnchisq"))
pnchNms <- pnchNms[!grepl("Terms$", pnchNms)]
pnchF <- sapply(pnchNms, get, envir = as.environment(pkg))
str(pnchF)
ncps <- c(0, 1/8, 1/2)
pnchR <- as.list(setNames(ncps, paste("ncp",ncps, sep="=")))
for(i.n in seq_along(ncps)) {
  ncp <- ncps[i.n]
  pnF <- if(ncp == 0) pnchF[!grepl("chisqT93", pnchNms)] else pnchF
  pnchR[[i.n]] <- sapply(pnF, function(F)
            Vectorize(F, names(formals(F))[[1]])(qq, df = 3, ncp=ncp))
}
str(pnchR, max=2)
		 

## A case where the non-central P[] should be improved :
## First, the central P[] which is close to exact -- choosing df=2 allows
## truly exact values: chi^2 = Exp(1) !
opal <- palette()
palette(c("black", "red", "green3", "blue", "cyan", "magenta", "gold3", "gray44"))
cR  <- curve(pchisq   (x, df=2,        lower.tail=FALSE, log.p=TRUE), 0, 4000, n=2001)
cRC <- curve(pnchisqRC(x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
             add=TRUE, col=adjustcolor(2,1/2), lwd=3, lty=2, n=2001)
cR0 <- curve(pchisq   (x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
             add=TRUE, col=adjustcolor(3,1/2), lwd=4,        n=2001)
## smart "named list" constructur :
list_ <- function(...)
   `names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
JKBfn <-list_(pnchisqPatnaik,
              pnchisqPearson,
              pnchisqAbdelAty,
              pnchisqBolKuz,
              pnchisqSankaran_d)
cl. <- setNames(adjustcolor(3+seq_along(JKBfn), 1/2), names(JKBfn))
lw. <- setNames(2+seq_along(JKBfn),                   names(JKBfn))
cR.JKB <- sapply(names(JKBfn), function(nmf) {
  curve(JKBfn[[nmf]](x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
        add=TRUE, col=cl.[[nmf]], lwd=lw.[[nmf]], lty=lw.[[nmf]], n=2001)
})
legend("bottomleft", c("pchisq", "pchisq.ncp=0", "pnchisqRC", names(JKBfn)),
       col=c(palette()[1], adjustcolor(2:3,1/2), cl.),
       lwd=c(1,3,4, lw.), lty=c(1,2,1, lw.))
palette(opal)# revert

all.equal(cRC, cR0, tol = 1e-15) # TRUE [for now]
## the problematic "jump" :
as.data.frame(cRC)[744:750,]
if(.Platform$OS.type == "unix")
  ## verbose=TRUE  may reveal which branches of the algorithm are taken:
  pnchisqRC(1500, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE, verbose=TRUE) #
  ## |-->  -Inf currently

## The *two*  principal cases (both lower.tail = {TRUE,FALSE} !), where
##  "2nd call"  happens *and* is currently beneficial :
dfs <- c(1:2, 5, 10, 20)
pL. <- pnchisqRC(.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE, verbose = TRUE)
pR. <- pnchisqRC(   100, df=dfs, ncp=0, log.p=TRUE,                   verbose = TRUE)
## R's own non-central version (specifying 'ncp'):
pL0 <- pchisq   (.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE)
pR0 <- pchisq   (   100, df=dfs, ncp=0, log.p=TRUE)
## R's *central* version, i.e., *not* specifying 'ncp' :
pL  <- pchisq   (.00001, df=dfs,        log.p=TRUE, lower.tail=FALSE)
pR  <- pchisq   (   100, df=dfs,        log.p=TRUE)
cbind(pL., pL, relEc = signif(1-pL./pL, 3), relE0 = signif(1-pL./pL0, 3))
cbind(pR., pR, relEc = signif(1-pR./pR, 3), relE0 = signif(1-pR./pR0, 3))

Run the code above in your browser using DataLab

Last chance! 50% off unlimited learning