db: The db (“discretised Beta”) distribution.

Description

Density, distribution function, quantile function and random generation for the db distribution with parameters alpha, beta and ntop.

Usage

ddb(x, alpha, beta, ntop, zeta=FALSE, log=FALSE)
pdb(x, alpha, beta, ntop, zeta=FALSE)
qdb(p, alpha, beta, ntop, zeta=FALSE)
rdb(n, alpha, beta, ntop, zeta=FALSE)

Arguments

x

Numeric vector of values at which the “density” (probability mass function) ddb() and the cumulative distribution function pdb() are evaluated. Normally these would be integer values between nbot and ntop, but they need not be. Note that nbot is 0 if zeta is TRUE, and is 1 if zeta is FALSE. A result of 0 is returned by ddb() for values of x that do not satisfy the foregoing criterion. A warning is issued by ddb() if any of the values in x are non-integer. See section Note for a little more information. Missing values (NA) are allowed; the corresponding results are NA.

alpha

Positive scalar. The first “shape” parameter of the db distribution.

beta

Positive scalar. The second “shape” parameter of the db distribution.

ntop

Integer scalar, strictly greater than 1. The maximum possible value of the db distribution.

zeta

Logical scalar. Should zero origin indexing be used? I.e. should the range of values of the distribution be taken to be {0,1,2,...,ntop} rather than {1,2,...,ntop}? Setting zeta=TRUE may be useful for example when the values of the distribution are to be interpreted as counts.

log

Logical scalar. Should logs of the probabilities calculated by ddb() be returned, rather than the actual probabilities?

p

Vector of probabilities (i.e. values between 0 and 1). The corresponding quantiles of the db distribution are calculated by qdb(). Missing values (NA) are allowed.

n

Integer scalar. An independent sample of size n from the db distribution is generated by rdb().

Value

For ddb() and pdb() vectors of probabilities.
For qdb() a vector of quantiles.
For rdb() a vector of length n, of integers between nbot and ntop, independently sampled from the db distribution, where nbot is 1 if zeta is FALSE and is 0 if zeta is TRUE.

Details

In the predecessor of this package (hse versions 0.1-15 and earlier), the probability function of the distribution was calculated as dbeta(x/(ntop+1),alpha,beta)/sum(dbeta((nbot:ntop)/(ntop+k),alpha,beta)) where nbot and k were set to 1 if zeta was FALSE, and nbot was set to 0 and k to 2 if zeta was TRUE.

However, the probability function is now calculated in a more “direct” manner, using an exponential family representation of this function. The Beta distribution is no longer called upon (although it still of course conceptually underlies the distribution).
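For comparison, the earlier Beta-based calculation quoted above can be sketched in base R. This is only an illustration under stated assumptions: it covers the zeta=FALSE case alone (nbot = 1, k = 1), the helper name old_ddb is hypothetical and is not a function of the package, and it is not the method the package currently uses.

```r
# Hypothetical helper (not part of the package): the earlier Beta-based
# calculation, restricted to the zeta = FALSE case (nbot = 1, k = 1).
old_ddb <- function(x, alpha, beta, ntop) {
    support <- 1:ntop
    # Beta density evaluated on the grid i/(ntop+1), i = 1, ..., ntop.
    w <- dbeta(support/(ntop + 1), alpha, beta)
    # Normalise so that the probabilities sum to 1.
    probs <- w/sum(w)
    # Values of x outside the support get probability 0; NA stays NA.
    out <- numeric(length(x))
    ok <- x %in% support
    out[ok] <- probs[match(x[ok], support)]
    out[is.na(x)] <- NA
    out
}

p_old <- old_ddb(1:15, alpha = 2, beta = 5, ntop = 15)
sum(p_old)  # equals 1 up to rounding error
```

The normalising sum is what turns the Beta density values on the grid into a probability mass function; the shape parameters alpha and beta play the same role as in dbeta().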

The function ddb() is a probability mass function for an ad hoc finite discrete distribution of ordered values, with a “reasonably flexible” shape.

The \(p\)th quantile of a random variable \(X\) is defined to be the infimum over the range of \(X\) of those values of \(x\) such that \(F(x) \geq p\) where \(F(x)\) is the cumulative distribution function for \(X\). Note that if we did not impose the “over the range of \(X\)” restriction, then the 0th quantile of e.g. an exponential distribution would be \(-\infty\) (since \(F(x) \geq 0\) for all \(x\)) whereas we actually want this quantile to be 0.

Consequently qdb(p,alpha,beta,ntop) is equal to the least value of i such that pdb(i,alpha,beta,ntop) \(\geq\) p. The set of values of i to be considered is {1,2,...,ntop} if zeta is FALSE and is {0,1,2,...,ntop} if zeta is TRUE.
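The “least value of i” rule can be sketched for any finite discrete distribution in base R. The helper below is hypothetical (the name quantile_from_pmf and its arguments are assumptions made for illustration) and is not how qdb() is implemented internally; it takes the support and the probabilities directly rather than alpha, beta and ntop.

```r
# Hypothetical helper: quantiles of a finite discrete distribution, given
# its support (in increasing order) and the corresponding probabilities.
quantile_from_pmf <- function(p, support, probs) {
    cdf <- cumsum(probs)
    # Guard against rounding error in the final cumulative sum.
    cdf[length(cdf)] <- 1
    # For each p, find the first index at which the cdf is at least p.
    # An NA in p yields NA, since no comparison evaluates to TRUE.
    idx <- vapply(p, function(pp) which(cdf >= pp)[1], integer(1))
    support[idx]
}

# A toy distribution on {1, 2, 3} with probabilities 0.25, 0.25, 0.5.
q <- quantile_from_pmf(c(0, 0.25, 0.3, 0.5, 1, NA), 1:3, c(0.25, 0.25, 0.5))
q  # 1 1 2 2 3 NA
```

Because the search is restricted to the support, the 0th quantile is the smallest support value (1 here, or 0 if the support were 0:3, as with zeta=TRUE), in line with the definition given above.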

Examples

# NOT RUN {
parz <- list(c(0.5,0.5),c(5,1),c(1,3),c(2,2),c(2,5))
for(i in 1:5) {
    p1 <- ddb(1:15,parz[[i]][1],parz[[i]][2],15)
    names(p1) <- 1:15
    eckslab <- paste0("alpha=",parz[[i]][1]," beta=",parz[[i]][2])
    barplot(p1,xlab=eckslab,main="db probabilities",
            space=1.5,col="black")
    abline(h=0)
    if(i < 5) readline("Go? ")
}
x <- c(-1.5,-1,-0.5,0,0.5,1,1.5)
ddb(x,2.5,1,5,TRUE) # Produces 0 for all but the 4th and 6th
                     # entries of x, and issues a warning.
# }