stirlerr: Stirling's Error Function - Auxiliary for Gamma, Beta, etc

Description

Stirling's approximation (to the factorial or $\Gamma$ function) error in $\log$ scale is the difference of the left and right hand side of Stirling's approximation to $n!$, $n! \approx \bigl(\frac{n}{e}\bigr)^n \sqrt{2\pi n},$ i.e., stirlerr(n) := $\delta(n)$, where $$\delta(n) = \log\Gamma(n + 1) - n\log(n) + n - \log(2 \pi n)/2.$$

Partly, pure R transcriptions of the C code utility functions for dgamma(), dbinom(), dpois(), dt(), and similar “base” density functions by Catherine Loader.

These DPQ versions typically have extra arguments with defaults that correspond to R's Mathlib C code hardwired cutoffs and tolerances.

lgammacor(x) is “the same” as stirlerr(x), both computing $delta(x)$ accurately, however is only defined for $x \ge 10$, and has been crucially used for R's own lgamma() and lbeta() computations.

Usage


stirlerr(n, scheme = c("R3", "R4.4_0"),
         cutoffs = switch(scheme
                        , R3     = c(15, 35, 80, 500)
                        , R4.4_0 = c(5.25, rep(6.5, 4), 7.1, 7.6, 8.25, 8.8, 9.5, 11,
                                     14, 19,   25, 36, 81, 200, 3700, 17.4e6)
                        
                        
                        
                        
                          ),
         use.halves = missing(cutoffs),
         direct.ver = c("R3", "lgamma1p", "MM2", "n0"),
         order = NA,
         verbose = FALSE)
stirlerrC(n, version = c("R3", "R4..1", "R4.4_0"))
stirlerr_simpl(n, version = c("R3", "lgamma1p", "MM2", "n0"), minPrec = 128L)
lgammacor(x, nalgm = 5, xbig = 2^26.5)

Value

a numeric vector “like”

x; in some cases may also be an (high accuracy) "mpfr"-number vector, using CRAN package Rmpfr.

lgammacor(x) originally returned NaN for all $|x| < 10$, as its Chebyshev polynomial approximation has been constructed for

$x \in [10, xbig]$, specifically for $u \in [-1,1]$ where

$t := 10/x \in [1/x_B, 1]$ and

$u := 2t^2 -1 \in [-1 + \epsilon_B, 1]$.

Arguments

x, n: numeric (or number-alike such as "mpfr").
verbose: logical indicating if some information about the computations are to be printed.
version: a character string specifying the version of stirlerr_simpl() or stirlerrC().
scheme: a character string specifying the cutoffs scheme for stirlerr().
cutoffs: an increasing numeric vector, required to start with with cutoffs[1] <= 15 specifying the cutoffs to switch from 2 to 3 to ..., up to 10 term approximations for non-small n, where the direct formula loses precision. When missing (as by default), scheme is used, where scheme = "R3" chooses (15, 35, 80, 500), the cutoffs in use in R versions up to (and including) 4.3.z.
use.halves: logical indicating if the full-accuracy prestored values should be use when $2n \in \{0,1,\dots,30\}$, i.e., $n \le 15$ and n is integer or integer + $\frac{1}{2}$. Turn this off to judge the underlying approximation accuracy by comparison with MPFR. However, keep the default TRUE for back-compatibility.

direct.ver: a character string specifying the version of stirlerr_simpl() to be used for the “direct” case in stirlerr(n).
order: approximation order, 1 <= order <= 20 or NA for stirlerr(). If not NA, it specifies the number of terms to be used in the Stirling series which will be used for all n, i.e., scheme, cutoffs, use.halves, and direct.ver are irrelevant.
minPrec: a positive integer; for stirlerr_simpl the minimal accuracy or precision in bits when mpfr numbers are used.

nalgm: number of terms to use for Chebyshev polynomial approximation in lgammacor(). The default, 5, is the value hard wired in R's C Mathlib.
xbig: a large positive number; if x >= xbig, the simple asymptotic approximation lgammacor(x) := 1/(12*x) is used. The default, $2^{26.5} = 94906265.6$, is the value hard wired in R's C Mathlib.

Author

Martin Maechler

Details

% ../vignettes/log1pmx-etc.Rnw <<<<<<<<<

stirlerr():

Stirling's error, stirlerr(n):= $\delta(n)$ has asymptotic ($n \to\infty$) expansion $$\delta(n) = \frac 1{12 n} - \frac 1{360 n^3} + \frac 1{1260 n^5} \pm O(n^{-7}),$$ and this expansion is used up to remainder $O(n^{-35})$ in current (package DPQ) stirlerr(n); different numbers of terms between different cutoffs for $n$, and using the direct formula for $n <= c_1$, where $c_1$ is the first cutoff, cutoff[1].

Note that (new in 2024-01) stirlerr(n, order = k) will not use cutoffs nor the direct formula (with its direct.ver), nor halves (use.halves=TRUE), and allows $k \le 20$. Tests seem to indicate that for current double precision arithmetic, only $k \le 17$ seem to make sense.

lgammacor(x):

The “same” Stirling's error, but only defined for $x \ge 10$, returning NaN otherwise. The example below suggests that R's hardwired default of nalgm = 5 loses more than one digit accuracy, and nalgm = 6 seems much better. OTOH, the use of lgammacor() in R's (Mathlib/libRmath) C code is always in conjunction with considerably larger terms such that small inaccuracies in lgammacor() will not become visible in the values of the functions using lgammacor() internally, notably lbeta() and lgamma().

References

C. Loader (2000), see dbinom's documentation.

Our package vignette log1pmx, bd0, stirlerr - Probability Computations in R.

Examples

Run this code

n <- seq(1, 50, by=1/4)
st.n <- stirlerr(n) # now vectorized
stopifnot(identical(st.n, sapply(n, stirlerr)))
st3. <- stirlerr(n, "R3", direct.ver = "R3") # previous default
st3  <- stirlerr(n, "R3", direct.ver = "lgamma1p") # new? default
## for these n, there is *NO* difference:
stopifnot(st3 == st3.)
plot(n, st.n, type = "b", log="xy", ylab = "stirlerr(n)")
st4 <- stirlerr(n, "R4.4_0", verbose = TRUE) # verbose: give info on cases
## order = k = 1:20  terms in series approx:
k <- 1:20
stirlOrd <- sapply(k, function(k) stirlerr(n, order = k))
matlines(n, stirlOrd)
matplot(n, stirlOrd - st.n, type = "b", cex=1/2, ylim = c(-1,1)/10, log = "x",
        main = substitute(list(stirlerr(n, order=k) ~~"error", k == 1:mK),  list(mK = max(k))))

matplot(n, abs(stirlOrd - st.n), type = "b", cex=1/2, log = "xy",
        main = "| stirlerr(n, order=k) error |")
mtext(paste("k =", deparse(k))) ; abline(h = 2^-(53:51), lty=3, lwd=1/2)
colnames(stirlOrd) <- paste0("k=", k)

stCn <- stirlerrC(n)
all.equal(st.n, stCn, tolerance = 0)  # see 6.7447e-14
stopifnot(all.equal(st.n, stCn, tolerance = 1e-12))
stC2 <- stirlerrC(n, version = "R4..1")
stC4 <- stirlerrC(n, version = "R4.4_0")


## lgammacor(n) : only defined for n >= 10
lgcor <- lgammacor(n)
lgcor6 <- lgammacor(n, nalgm = 6) # more accurate?

all.equal(lgcor[n >= 10], st.n[n >= 10], tolerance=0)# .. rel.diff.: 4.687e-14
stopifnot(identical(is.na(lgcor), n < 10),
          all.equal(lgcor[n >= 10],
                    st.n [n >= 10], tolerance = 1e-12))

## look at *relative* errors -- need "Rmpfr" for "truth" % Rmpfr / DPQmpfr in 'Suggests'
if(requireNamespace("Rmpfr") && requireNamespace("DPQmpfr")) {
    ## stirlerr(n) uses DPQmpfr::stirlerrM()  automagically when n is 
    relErrV <- sfsmisc::relErrV; eaxis <- sfsmisc::eaxis
    mpfr <- Rmpfr::mpfr;     asNumeric <- Rmpfr::asNumeric
    stM <- stirlerr(mpfr(n, 512))
    relE <- asNumeric(relErrV(stM, cbind(st3, st4, stCn, stC4,
                                         lgcor, lgcor6, stirlOrd)))

    matplot(n, pmax(abs(relE),1e-20), type="o", cex=1/2, log="xy", ylim =c(8e-17, 0.1),
            xaxt="n", yaxt="n", main = quote(abs(relErr(stirlerr(n)))))
    ## mark "lgcor*" -- lgammacor() particularly !
    col.lgc <- adjustcolor(c(2,4), 2/3)
    matlines(n, abs(relE[,c("lgcor","lgcor6")]), col=col.lgc, lwd=3)
    lines(n, abs(relE[,"lgcor6"]), col=adjustcolor(4, 2/3), lwd=3)
    eaxis(1, sub10=2); eaxis(2); abline(h = 2^-(53:51), lty=3, col=adjustcolor(1, 1/2))
    axis(1, at=15, col=NA, line=-1); abline(v=c(10,15), lty=2, col=adjustcolor(1, 1/4))
    legend("topright", legend=colnames(relE), cex = 3/4,
           col=1:6, lty=1:5, pch= c(1L:9L, 0L, letters)[seq_len(ncol(relE))])
    legend("topright", legend=colnames(relE)[1:6], cex = 3/4, lty=1:5, lwd=3,
           col=c(rep(NA,4), col.lgc), bty="n")
    ## Note that lgammacor(x) {default, n=5} is clearly inferior,
    ## but lgammacor(x, 6) is really good in [10, 50]

    ## ===> Larger n's:
    nL <- c(seq(50, 99, by = 1/2), 100*2^seq(0,8, by = 1/4))
    stMl <- stirlerr(mpfr(nL, 512))
    lgc5 <- lgammacor(nL, nalgm = 5)
    lgc6 <- lgammacor(nL, nalgm = 6)
    stir7 <- stirlerr(nL, order = 7)
    relEl <- asNumeric(relErrV(stMl,
               cbind(lgammacor.5 = lgc5, lgammacor.6 = lgc6, 'stirlerr_k=7' = stir7)))
    matplot(nL, pmax(abs(relEl),2^-55), type="o", cex=2/3, log="xy",
            ylim = c(2^-54.5, max(abs(relEl))), ylab = quote(abs(rE)),
            xaxt="n", yaxt="n", main = quote(abs(relErr(stirlerr(n)))))
    eaxis(1, sub10=2); eaxis(2, cex.axis=.8)
    abline(h = 2^-(54:51), lty=3, col=adjustcolor(1, 1/2))
    legend("center", legend=colnames(relEl), lwd=2, pch = paste(1:3), col=1:3, lty=1:3)
}# end if(  )

Run the code above in your browser using DataLab