discretize: Discretization of a Continuous Distribution

Description

Compute a discrete probability mass function from a continuous cumulative distribution function (cdf) using one of several methods.

discretise is an alias for discretize.

Usage

discretize(cdf, from, to, step = 1,
           method = c("upper", "lower", "rounding", "unbiased"),
           lev, by = step, xlim = NULL)
discretise(cdf, from, to, step = 1,
           method = c("upper", "lower", "rounding", "unbiased"),
           lev, by = step, xlim = NULL)

Arguments

cdf

an expression written as a function of x, or alternatively the name of a function, giving the cdf to discretize.

from, to

the range over which the function will be discretized.

step

numeric; the discretization step (or span, or lag).

method

discretization method to use.

lev

an expression written as a function of x, or alternatively the name of a function, to compute the limited expected value of the distribution corresponding to cdf. Used only with the "unbiased" method.

by

an alias for step.

xlim

numeric of length 2; if specified, it serves as default for c(from, to).

Value

A numeric vector of probabilities suitable for use in aggregateDist.

Details

Usage is similar to curve.

discretize returns the probability mass function (pmf) of the random variable obtained by discretization of the cdf specified in cdf.

Let $F(x)$ denote the cdf, $E[\min(X, x)]$ the limited expected value at $x$, $h$ the step, $p_x$ the probability mass at $x$ in the discretized distribution and set $a =$ from and $b =$ to.

Method "upper" is the forward difference of the cdf $F$: $$p_x = F(x + h) - F(x)$$ for $x = a, a + h, \dots, b - h$.

Method "lower" is the backward difference of the cdf $F$: $$p_x = F(x) - F(x - h)$$ for $x = a + h, \dots, b$ and $p_a = F(a)$.
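The forward and backward differences can be reproduced by hand in base R. The following is a minimal sketch, not the package's implementation: it assumes an exponential cdf with rate 1, and the names cdf_fun, x_upper, p_upper, x_lower and p_lower are illustrative only.

```r
## Base R sketch of the "upper" and "lower" formulas.
## Assumption: Exp(1) cdf on [0, 5] with step 0.5; all names are illustrative.
a <- 0; b <- 5; h <- 0.5
cdf_fun <- function(x) pexp(x, rate = 1)

## "upper": forward differences at x = a, a + h, ..., b - h
x_upper <- seq(a, b - h, by = h)
p_upper <- cdf_fun(x_upper + h) - cdf_fun(x_upper)

## "lower": p_a = F(a), then backward differences at x = a + h, ..., b
x_lower <- seq(a, b, by = h)
p_lower <- c(cdf_fun(a), diff(cdf_fun(x_lower)))

## Total mass: F(b) - F(a) for "upper", F(b) for "lower"
sum(p_upper)
sum(p_lower)
```

With these formulas the two vectors hold the same increments of $F$. "upper" assigns each increment to the left end of its interval and "lower" to the right end, which is why "lower" has one more element, the mass $F(a)$ at $a$.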

Method "rounding" has the true cdf pass through the midpoints of the intervals $[x - h/2, x + h/2)$: $$p_x = F(x + h/2) - F(x - h/2)$$ for $x = a + h, \dots, b - h$ and $p_a = F(a + h/2)$. The function assumes the cdf is continuous. Any adjustment necessary for discrete distributions can be done via cdf.
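The midpoint rule can likewise be written out directly. This is a sketch under the same assumptions as the Examples section (a gamma cdf with shape 1 on [0, 5] with step 0.5); cdf_fun, x_mid and p_round are hypothetical names, not part of the package.

```r
## Base R sketch of the "rounding" formula.
## Assumption: gamma(shape = 1) cdf on [0, 5] with step 0.5; names are illustrative.
a <- 0; b <- 5; h <- 0.5
cdf_fun <- function(x) pgamma(x, shape = 1)

## Support points x = a, a + h, ..., b - h
x_mid <- seq(a, b - h, by = h)

## p_a = F(a + h/2); for the other points, the mass of [x - h/2, x + h/2)
p_round <- c(cdf_fun(a + h / 2),
             cdf_fun(x_mid[-1] + h / 2) - cdf_fun(x_mid[-1] - h / 2))

## The differences telescope, so the total mass is F(b - h/2)
sum(p_round)
```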

Method "unbiased" matches the first moment of the discretized and the true distributions. The probabilities are as follows:

$$p_a = \frac{E[\min(X, a)] - E[\min(X, a + h)]}{h} + 1 - F(a),$$

$$p_x = \frac{2 E[\min(X, x)] - E[\min(X, x - h)] - E[\min(X, x + h)]}{h}, \quad a < x < b,$$

$$p_b = \frac{E[\min(X, b)] - E[\min(X, b - h)]}{h} - 1 + F(b).$$
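The three formulas can be checked numerically without the package. The sketch below assumes an exponential distribution with rate 1, for which the limited expected value has the closed form $E[\min(X, x)] = 1 - e^{-x}$. The names cdf_fun, lev_fun and p_unb are illustrative only.

```r
## Base R sketch of the "unbiased" formulas.
## Assumption: X ~ Exp(1) on [0, 5] with step 0.5; names are illustrative.
a <- 0; b <- 5; h <- 0.5
cdf_fun <- function(x) pexp(x, rate = 1)
lev_fun <- function(x) 1 - exp(-x)    # E[min(X, x)] for X ~ Exp(1)

x <- seq(a, b, by = h)
n <- length(x)
L <- lev_fun(x)                       # limited expected values on the grid

p_first <- (L[1] - L[2]) / h + 1 - cdf_fun(a)               # mass at a
p_inner <- (2 * L[2:(n - 1)] - L[1:(n - 2)] - L[3:n]) / h   # a < x < b
p_last  <- (L[n] - L[n - 1]) / h - 1 + cdf_fun(b)           # mass at b
p_unb <- c(p_first, p_inner, p_last)

## Total mass is F(b) - F(a)
sum(p_unb)

## With a = 0, the discretized mean equals the partial first moment of X
## on [0, b], that is E[min(X, b)] - b * (1 - F(b))
sum(x * p_unb)
lev_fun(b) - b * (1 - cdf_fun(b))
```

The last two values agree up to floating-point error, which is the sense in which the first moment is matched over the discretization range in this sketch.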

References

Klugman, S. A., Panjer, H. H. and Willmot, G. E. (2012), Loss Models, From Data to Decisions, Fourth Edition, Wiley.

Examples

# NOT RUN {
x <- seq(0, 5, 0.5)

op <- par(mfrow = c(1, 1), col = "black")

## Upper and lower discretization
fu <- discretize(pgamma(x, 1), method = "upper",
                 from = 0, to = 5, step = 0.5)
fl <- discretize(pgamma(x, 1), method = "lower",
                 from = 0, to = 5, step = 0.5)
curve(pgamma(x, 1), xlim = c(0, 5))
par(col = "blue")
plot(stepfun(head(x, -1), diffinv(fu)), pch = 19, add = TRUE)
par(col = "green")
plot(stepfun(x, diffinv(fl)), pch = 19, add = TRUE)
par(col = "black")

## Rounding (or midpoint) discretization
fr <- discretize(pgamma(x, 1), method = "rounding",
                 from = 0, to = 5, step = 0.5)
curve(pgamma(x, 1), xlim = c(0, 5))
par(col = "blue")
plot(stepfun(head(x, -1), diffinv(fr)), pch = 19, add = TRUE)
par(col = "black")

## First moment matching
fb <- discretize(pgamma(x, 1), method = "unbiased",
                 lev = levgamma(x, 1), from = 0, to = 5, step = 0.5)
curve(pgamma(x, 1), xlim = c(0, 5))
par(col = "blue")
plot(stepfun(x, diffinv(fb)), pch = 19, add = TRUE)

par(op)
# }
