Last chance! 50% off unlimited learning
Sale ends in
Compute (approximate) probabilities for the non-central chi-squared distribution.
The non-central chi-squared distribution with df
ncp
pchisq
.
R's own historical and current versions, but with more tuning parameters;
Historical relatively simple approximations listed in Johnson, Kotz, and Balakrishnan (1995):
Patnaik(1949)'s approximation to the non-central via central
chi-squared. Is also the formula
Pearson(1959) is using 3 moments instead of 2 as Patnaik (to
approximate via a central chi-squared), and therefore better than
Patnaik for the right tail; further (in Johnson et al.), the
approximation error is
Abdel-Aty(1954)'s “first approximation” based on
Wilson-Hilferty via Gaussian (pnorm
) probabilities, is
partly wrongly cited in Johnson et al., p.463, eq.
Bol'shev and Kuznetzov (1963) concentrate on the case of
small ncp
df
,
but a modified q
(‘x’); the approximation has error
Sankaran(1959, 1963) proposes several further approximations base
on Gaussian probabilities, according to Johnson
et al., p.463. pnchisqSankaran_d()
implements its formula
pnchisq()
:an R implementation of R's own C pnchisq_raw()
,
but almost only up to Feb.27, 2004, long before the log.p=TRUE
addition there, including logspace arithmetic in April 2014,
its finish on 2015-09-01. Currently for historical reference only.
%% but we could add the log space computations from C to the R code
pnchisqV()
:
pnchisqRC()
:R's C implementation as of Aug.2019; but with many more options. Currently extreme cases tend to hang on Winbuilder (?)
pnchisqIT
:....
pnchisqTerms
:....
%
pnchisqT93
:pure R implementations of approximations when
both q
and ncp
are large, by Temme(1993), from Johnson
et al., p.467, formulas pnchisqT93a()
and pnchisqT93b()
respectively, with adapted formulas for the log.p=TRUE
cases.
pnchisq_ss()
:....
%
ss
:....
ss2
:....
ss2.
:....
pnchisq (q, df, ncp = 0, lower.tail = TRUE,
cutOffncp = 80, itSimple = 110, errmax = 1e-12, reltol = 1e-11,
maxit = 10* 10000, verbose = 0, xLrg.sigma = 5)
pnchisqV(x, ..., verbose = 0)pnchisqRC (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE,
no2nd.call = FALSE,
cutOffncp = 80, small.ncp.logspace = small.ncp.logspaceR2015,
itSimple = 110, errmax = 1e-12,
reltol = 8 * .Machine$double.eps, epsS = reltol/2, maxit = 1e6,
verbose = FALSE)
pnchisqAbdelAty (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqBolKuz (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPatnaik (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqPearson (q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisqSankaran_d(q, df, ncp = 0, lower.tail = TRUE, log.p = FALSE)
pnchisq_ss (x, df, ncp = 0, lower.tail = TRUE, log.p = FALSE, i.max = 10000)
pnchisqTerms (x, df, ncp, lower.tail = TRUE, i.max = 1000)
pnchisqT93 (q, df, ncp, lower.tail = TRUE, log.p = FALSE, use.a = q > ncp)
pnchisqT93.a(q, df, ncp, lower.tail = TRUE, log.p = FALSE)
pnchisqT93.b(q, df, ncp, lower.tail = TRUE, log.p = FALSE)
ss (x, df, ncp, i.max = 10000, useLv = !(expMin < -lambda && 1/lambda < expMax))
ss2 (x, df, ncp, i.max = 10000, eps = .Machine$double.eps)
ss2. (q, df, ncp = 0, errmax = 1e-12, reltol = 2 * .Machine$double.eps,
maxit = 1e+05, eps = reltol, verbose = FALSE)
ss()
returns a list with 3 components
the series
location (in s[]
) of the first change from 0 to positive.
(first) location of the maximal value in the series (i.e.,
which.max(s)
).
numeric vector (of ‘quantiles’, i.e., abscissa values).
number ( ‘quantile’, i.e., abscissa value.)
degrees of freedom
non-centrality parameter
logical, see, e.g., pchisq()
.
number of terms in evaluation ...
logical
vector for Temme pnchisqT93*()
formulas, indicating to use formula ‘a’ over ‘b’. The
default is as recommended in the references, but they did not take into
account log.p = TRUE
situations.
a positive number, the cutoff value for ncp
...
...
absolute error tolerance.
convergence tolerance for relative error.
maximal number of iterations.
positive number ...
logical indicating if a 2nd call is made to the internal function ....
logical vector or function
,
indicating if the logspace computations for “small” ncp
(defined to fulfill ncp < cutOffncp
!).
small positive number, the convergence tolerance of the ‘simple’ iterations...
logical or integer specifying if or how much the algorithm progress should be monitored.
further arguments passed from pnchisqV()
to pnchisq()
.
logical
indicating if logarithmic scale should
be used for
convergence tolerance, a positive number.
Martin Maechler, from May 1999; starting from a post to the S-news
mailing list by Ranjan Maitra (@ math.umbc.edu) who showed a version of
our pchisqAppr.0()
thanking Jim Stapleton for providing it.
pnchisq_ss()
uses si <- ss(x, df, ..)
to get the series terms,
and returns 2*dchisq(x, df = df +2) * sum(si$s)
.
ss()
computes the terms needed for the expansion used in
pnchisq_ss()
.
ss2()
computes some simple “statistics” about ss(..)
.
%% do we need it? currently it computes ss() and trashes most of it ..
Johnson, N.L., Kotz, S. and Balakrishnan, N. (1995)
Continuous Univariate Distributions Vol 2, 2nd ed.; Wiley;
chapter 29 Noncentral
Abramowitz, M. and Stegun, I. A. (1972) Handbook of Mathematical Functions. New York: Dover. https://en.wikipedia.org/wiki/Abramowitz_and_Stegun
pchisq
and the wienergerm approximations for it:
pchisqW()
etc.
r_pois()
and its plot function, for an aspect of the series
approximations we use in pnchisq_ss()
.
## set of quantiles to use :
qq <- c(.001, .005, .01, .05, (1:9)/10, 2^seq(0, 10, by= 0.5))
## Take "all interesting" pchisq-approximation from our pkg :
pkg <- "package:DPQ"
pnchNms <- c(paste0("pchisq", c("V", "W", "W.", "W.R")),
ls(pkg, pattern = "^pnchisq"))
pnchNms <- pnchNms[!grepl("Terms$", pnchNms)]
pnchF <- sapply(pnchNms, get, envir = as.environment(pkg))
str(pnchF)
ncps <- c(0, 1/8, 1/2)
pnchR <- as.list(setNames(ncps, paste("ncp",ncps, sep="=")))
for(i.n in seq_along(ncps)) {
ncp <- ncps[i.n]
pnF <- if(ncp == 0) pnchF[!grepl("chisqT93", pnchNms)] else pnchF
pnchR[[i.n]] <- sapply(pnF, function(F)
Vectorize(F, names(formals(F))[[1]])(qq, df = 3, ncp=ncp))
}
str(pnchR, max=2)
## A case where the non-central P[] should be improved :
## First, the central P[] which is close to exact -- choosing df=2 allows
## truly exact values: chi^2 = Exp(1) !
opal <- palette()
palette(c("black", "red", "green3", "blue", "cyan", "magenta", "gold3", "gray44"))
cR <- curve(pchisq (x, df=2, lower.tail=FALSE, log.p=TRUE), 0, 4000, n=2001)
cRC <- curve(pnchisqRC(x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
add=TRUE, col=adjustcolor(2,1/2), lwd=3, lty=2, n=2001)
cR0 <- curve(pchisq (x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
add=TRUE, col=adjustcolor(3,1/2), lwd=4, n=2001)
## smart "named list" constructur :
list_ <- function(...)
`names<-`(list(...), vapply(sys.call()[-1L], as.character, ""))
JKBfn <-list_(pnchisqPatnaik,
pnchisqPearson,
pnchisqAbdelAty,
pnchisqBolKuz,
pnchisqSankaran_d)
cl. <- setNames(adjustcolor(3+seq_along(JKBfn), 1/2), names(JKBfn))
lw. <- setNames(2+seq_along(JKBfn), names(JKBfn))
cR.JKB <- sapply(names(JKBfn), function(nmf) {
curve(JKBfn[[nmf]](x, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE),
add=TRUE, col=cl.[[nmf]], lwd=lw.[[nmf]], lty=lw.[[nmf]], n=2001)
})
legend("bottomleft", c("pchisq", "pchisq.ncp=0", "pnchisqRC", names(JKBfn)),
col=c(palette()[1], adjustcolor(2:3,1/2), cl.),
lwd=c(1,3,4, lw.), lty=c(1,2,1, lw.))
palette(opal)# revert
all.equal(cRC, cR0, tol = 1e-15) # TRUE [for now]
## the problematic "jump" :
as.data.frame(cRC)[744:750,]
if(.Platform$OS.type == "unix")
## verbose=TRUE may reveal which branches of the algorithm are taken:
pnchisqRC(1500, df=2, ncp=0, lower.tail=FALSE, log.p=TRUE, verbose=TRUE) #
## |--> -Inf currently
## The *two* principal cases (both lower.tail = {TRUE,FALSE} !), where
## "2nd call" happens *and* is currently beneficial :
dfs <- c(1:2, 5, 10, 20)
pL. <- pnchisqRC(.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE, verbose = TRUE)
pR. <- pnchisqRC( 100, df=dfs, ncp=0, log.p=TRUE, verbose = TRUE)
## R's own non-central version (specifying 'ncp'):
pL0 <- pchisq (.00001, df=dfs, ncp=0, log.p=TRUE, lower.tail=FALSE)
pR0 <- pchisq ( 100, df=dfs, ncp=0, log.p=TRUE)
## R's *central* version, i.e., *not* specifying 'ncp' :
pL <- pchisq (.00001, df=dfs, log.p=TRUE, lower.tail=FALSE)
pR <- pchisq ( 100, df=dfs, log.p=TRUE)
cbind(pL., pL, relEc = signif(1-pL./pL, 3), relE0 = signif(1-pL./pL0, 3))
cbind(pR., pR, relEc = signif(1-pR./pR, 3), relE0 = signif(1-pR./pR0, 3))
Run the code above in your browser using DataLab