dgamma-utils: Utility Functions for `dgamma()` -- Pure R Versions

Description

Mostly, pure R transcriptions of the C code utility functions for dgamma() and similar “base” density functions by Catherine Loader.

bd0C() interfaces to C code which corresponds to R's C Mathlib (Rmath) bd0().

These have extra arguments with defaults that correspond to R's Mathlib C code hardwired cutoffs and tolerances.

Usage


dpois_raw(x, lambda, log=FALSE,
          version, 
          small.x__lambda = .Machine$double.eps,
          ## the defaults for version will probably change in the future
          bd0.delta = 0.1,
          ## optional arguments of log1pmx() :
          tol_logcf = 1e-14, eps2 = 0.01, minL1 = -0.79149064, trace.lcf = verbose,
          logCF = if (is.numeric(x)) logcf else logcfR,
          verbose = FALSE)
dpois_simpl (x, lambda, log=FALSE)
dpois_simpl0(x, lambda, log=FALSE)
bd0(x, np,
    delta = 0.1, maxit = as.integer(-1100 / log2(delta)),
    s0 = .Machine$double.xmin,
    verbose = getOption("verbose"))
bd0C(x, np, delta = 0.1, maxit = 1000L, version = "R4.0", verbose = getOption("verbose"))
# "simple" log1pmx() based versions :
bd0_p1l1d1(x, M, tol_logcf = 1e-14, ...)
bd0_p1l1d (x, M, tol_logcf = 1e-14, ...)
bd0_l1pm  (x, M, tol_logcf = 1e-14, ...)
ebd0 (x, M, verbose = getOption("verbose"), ...) # experimental, may disappear !!
ebd0C(x, M, verbose = getOption("verbose"))
stirlerr(n, scheme = c("R3", "R4.x"),
         cutoffs = switch(scheme
                        , R3   = c(15, 35, 80, 500)
                        , R4.x = c(7.5, 8.5, 10.625, 12.125, 20, 26, 55, 200, 3300)
                          ),
         use.halves = missing(cutoffs),
         verbose = FALSE)
stirlerr_simpl(n, minPrec = 128L)
lgammacor(x, nalgm = 5, xbig = 2^26.5)

Value

a numeric vector “like”

x; in some cases may also be an (high accuracy) "mpfr"-number vector, using CRAN package Rmpfr.

ebd0() (R code) and ebd0C() (interface to C

code) are experimental, meant to be precision-extended version of

bd0(), returning (yh, yl) (high- and low-part of y, the numeric result). In order to work for long vectors x,

yh, yl need to be list components; hence we return a two-column data.frame with column names "yh" and

"yl".

lgammacor(x) originally returned NaN for all $|x| < 10$, as its Chebyshev polynomial approximation has been constructed for

$x \in [10, xbig]$, specifically for $u \in [-1,1]$ where

$t := 10/x \in [1/x_B, 1]$ and

$u := 2t^2 -1 \in [-1 + \epsilon_B, 1]$.

Arguments

x, n: numeric (or number-alike such as "mpfr").
lambda, np, M: each numeric (or number-alike ..); distribution parameters.
log: logical indicating if the log-density should be returned, otherwise the density at x.
verbose: logical indicating if some information about the computations are to be printed.
small.x__lambda: positive number; for dpois_raw(x, lambda), when x/lambda is not larger than small.x__lambda, the direct log poisson formula is used instead of ebd0(), bd0() or stirlerr().
delta, bd0.delta: a non-negative number $< 1$ (practically required to be $\le .99$), a cutoff for bd0() where a continued fraction series expansion is used when $|x - M| < delta*(x+M)$.
tol_logcf, eps2, minL1, trace.lcf, logCF, ...: optional tuning arguments passed to log1pmx(), and to its options passed to logcf().
maxit: the number of series expansion terms to be used in bd0() when $|x-M|$ is small. The default is $k$ such that $\delta^{2k} \le 2^{-1022-52}$, i.e., will underflow to zero.
s0: the very small $s_0$ determining that bd0() = s already before the locf series expansion.
version: a character string specifying the version of $bd0()$ used.
scheme: a character string specifying the cutoffs scheme for stirlerr().
cutoffs: an increasing numeric vector, required to start with with cutoffs[1] <= 15 specifying the cutoffs to switch from 2 to 3 to ..., up to 10 term approximations for non-small n, where the direct formula loses precision. When missing (as by default), scheme is used, where scheme = "R3" chooses (15, 35, 80, 500), the cutoffs in use in R versions up to (and including) 4.0.z.
use.halves: logical indicating if the full-accuracy prestored values should be use when $2n \in \{0,1,\dots,30\}$, i.e., $n \le 15$ and n is integer or integer + $\frac{1}{2}$. Turn this off to judge the underlying approximation accuracy by comparison with MPFR. However, keep the default TRUE for back-compatibility.
minPrec: a positive integer; for stirlerr_simpl, the minimal accuracy or precision in bits when mpfr numbers are used.
nalgm: number of terms to use for Chebyshev polynomial approximation in lgammacor(). The default, 5, is the value hard wired in R's C Mathlib.
xbig: a large positive number; if x >= xbig, the simple asymptotic approximation lgammacor(x) := 1/(12*x) is used. The default, $2^{26.5} = 94906265.6$, is the value hard wired in R's C Mathlib.

Author

Martin Maechler

Details

% ../vignettes/log1pmx-etc.Rnw <<<<<<<<<

bd0():

Loader's “Binomial Deviance” function; for $x, M > 0$ (where the limit $x \to 0$ is allowed). In the case of dbinom, $x$ are integers (and $M = n p$), but in general x is real.

$$bd_0(x,M) := M \cdot D_0\bigl(\frac{x}{M}\bigr),$$ where $D_0(u) := u \log(u) + 1-u = u(\log(u) - 1) + 1$. Hence $$bd_0(x,M) = M \cdot \bigl(\frac{x}{M}(\log(\frac{x}{M}) -1) +1 \bigr) = x \log(\frac{x}{M}) - x + M.$$

A different way to rewrite this from Martyn Plummer, notably for important situation when $\left|x-M \right| \ll M$, is using $t := (x-M)/M$ (and $\left|t \right| \ll 1$ for that situation), equivalently, $\frac{x}{M} = 1+t$. Using $t$, $$bd_0(x,M) = \log(1+t) - t \cdot M = M \cdot [(t+1)(\log(1+t) - 1) + 1] = M \cdot [(t+1) \log(1+t) - t] = M \cdot p_1l_1(t),$$ and $$p_1l_1(t) := (t+1)\log(1+t) - t = \frac{t^2}{2} - \frac{t^3}{6} ...$$ where the Taylor series expansion is useful for small $|t|$.

Note that bd0(x, M) now also works when x and/or M are arbitrary-accurate mpfr-numbers (package Rmpfr).

References

C. Loader (2000), see dbinom's documentation.

Our package vignette log1pmx, bd0, stirlerr - Probability Computations in R.

Examples

Run this code

n <- seq(1, 50, by=1/4)
st.n <- stirlerr(n) # now vectorized
stopifnot(identical(st.n, sapply(n, stirlerr)))
plot(n, st.n, type = "b", log="xy", ylab = "stirlerr(n)")

x <- 800:1200
bd0x1k <- bd0(x, np = 1000)
plot(x, bd0x1k, type="l", ylab = "bd0(x, np=1000)")
bd0x1kC <- bd0C(x, np = 1000)
lines(x, bd0x1kC, col=2)
bd0.1d1 <- bd0_p1l1d1(x, 1000)
bd0.1d  <- bd0_p1l1d (x, 1000)
bd0.1pm <- bd0_l1pm  (x, 1000)
stopifnot(exprs = {
    all.equal(bd0x1kC, bd0x1k,  tol=1e-14) # even tol=0 currently ..
    all.equal(bd0x1kC, bd0.1d1, tol=1e-14)
    all.equal(bd0x1kC, bd0.1d , tol=1e-14)
    all.equal(bd0x1kC, bd0.1pm, tol=1e-14)
})

str(log1pmx) ##--> play with  { tol_logcf, eps2, minL1, trace.lcf, logCF }

ebd0x1k <- ebd0 (x, 1000)
exC     <- ebd0C(x, 1000)
stopifnot(all.equal(exC, ebd0x1k, tol=4e-16))
lines(x, rowSums(ebd0x1k), col=adjustcolor(4, 1/2), lwd=4)

x <- 0:250
dp   <- dpois    (x, 48, log=TRUE)# R's 'stats' pkg function
dp.r <- dpois_raw(x, 48, log=TRUE)
all.equal(dp, dp.r, tol = 0) # on Linux 64b, see TRUE
stopifnot(all.equal(dp, dp.r, tol = 1e-14))
## dpois_raw()  versions:
(vers <- eval(formals(dpois_raw)$version))
mv <- sapply(vers, function(v) dpois_raw(x, 48, version=v))
matplot(x, mv, type="h", log="y", main="dpois_raw(x, 48, version=*)") # "fine"

if(all(mv[,"ebd0_C1"] == mv[,"ebd0_v1"])) {
    cat("versions 'ebd0_C1' and 'ebd0_v1' are identical for lambda=48\n")
    mv <- mv[, vers != "ebd0_C1"]
}
## now look at *relative* errors -- need "Rmpfr" for "truth"
if(requireNamespace("Rmpfr")) {

    dM <- Rmpfr::dpois(Rmpfr::mpfr(x, 256), 48)
    asN <- Rmpfr::asNumeric
    relE <- asN(mv / dM - 1)
    cols <- adjustcolor(1:ncol(mv), 1/2)

    mtit <- "relative Errors of dpois_raw(x, 48, version = * )"
    matplot(x, relE, type="l", col=cols, lwd=3, lty=1, main=mtit)
    legend("topleft", colnames(mv), col=cols, lwd=3, bty="n")

    matplot(x, abs(relE), ylim=pmax(1e-18, range(abs(relE))), type="l", log="y",
            main=mtit, col=cols, lwd=2, lty=1, yaxt="n")
    sfsmisc::eaxis(2)
    legend("bottomright", colnames(mv), col=cols, lwd=2, bty="n", ncol=3)
    ee <- c(.5, 1, 2)* 2^-52; eC <- quote(epsilon[C])
    abline(h=ee, lty=2, col="gray", lwd=c(1,2,1))
    axis(4, at=ee[2:3], expression(epsilon[C], 2 * epsilon[C]), col="gray", las=1)
    par(new=TRUE)
    plot(x, asN(dM), type="h", col=adjustcolor("darkgreen", 1/3), axes=FALSE, ann=FALSE)
    stopifnot(abs(relE) < 8e-13) # seen 2.57e-13
}# Rmpfr

Run the code above in your browser using DataLab