activeSetLogCon: Computes a Log-Concave Probability Density Estimate via an Active Set Algorithm

Description

Given a vector of observations ${\bold{x}_n} = (x_1, \ldots, x_n)$ with not necessarily equal entries, activeSetLogCon first computes vectors ${\bold{x}_m} = (x_1, \ldots, x_m)$ and ${\bold{w}} = (w_1, \ldots, w_m)$ where $w_i$ is the weight of each $x_i$ s.t. $\sum_{i=1}^m w_i = 1$. Then, activeSetLogCon computes a concave, piecewise linear function $\widehat \phi_m$ on $[x_1, x_m]$ with knots only in ${x_1, \ldots, x_m}$ such that $$L(\phi) = \sum_{i=1}^m w_i \phi(x_i) - \int_{-\infty}^\infty \exp(\phi(t)) dt$$ is maximal. To accomplish this, an active set algorithm is used.

This function is as it is in the logcondens package except we've added the 'prec' variable as an argument and modified the the values returned as output, to be in line with the activeSetLogCon.mode function.

Usage

activeSetLogCon(x, xgrid = NULL, print = FALSE, w = NA,
prec=10^-10)

Arguments

Vector of independent and identically distributed numbers, not necessarily unique.

xgrid

Governs the generation of weights for observations. See preProcess for details.

print = TRUE outputs the log-likelihood in every loop, print = FALSE does not. Make sure to tell R to output (press CTRL+W).

Optional vector of weights. If weights are provided, i.e., if w != NA, then xgrid is ignored.

prec

Governs precision of various subfunctions, e.g. the Newton-Raphson procedure.

Value

xnVector with initial observations $x_1, \ldots, x_n$.
xVector of observations $x_1, \ldots, x_m$ that was used to estimate the density, i.e. points that include all possible knots of the estimate. Note that this $x$ is not identical to the $x$ passed in (xn is identical).
wThe vector of weights that had been used. Depends on the chosen setting for xgrid. Of the same length as x.
LThe value $L(\widehat {\bold{\phi}}_m)$ of the log-likelihood-function $L$ at the maximum $\widehat {\bold{\phi}}_m$.
IsKnotVector with entries IsKnot$_i = 1{\widehat{\phi}_m$ has a kink at $x_i$}.
knotsknots equals x[IsKnot>0], gives the values of the points that are knots.
phiVector with entries $\widehat \phi_m(x_i)$, $i=1,\ldots,m$. Named "phi" not "phihat" for backwards compatibility.
fhatVector with entries $\widehat{f}_m(x_i) = e^{\widehat{\phi}_m(x_i)}$, $i=1,\ldots, m$.
FhatA vector $(\widehat F_{m,i})_{i=1}^m$ of the same size as ${\bold{x}}$ with entries
$$\widehat F_{m,i} = \int_{x_1}^{x_i} \exp(\widehat \phi_m(t)) dt.$$
HNumeric vector $(H_1, \ldots, H_{m})'$ where $H_i$ is the derivative of $$t \to L(\phi + t\Delta_i)$$ at zero and $\Delta_i(x) = \min(x - x_i, 0)$
nNumber of initial observations.
mNumber of points used to compute the estimator, either unique observations or output from preProcess.
modeMode of the estimated density $\hat f_m$. This is redundant with dlcMode, but is included for backwards compatibility with the logcondens package.
dlcModeA list, of class "dlc.mode", with components $val, $idx, and $isx. dlcMode$val gives the mode estimate value, dlcMode$idx gives the corresponding index in x. dlcMode$isx is always TRUE. (dlMode$isx is sometimes FALSE when a dlc.mode object is output from activeSetLogCon.mode.)
sigThe standard deviation of the initial sample $x_1, \ldots, x_n$.
phi.fAll outputs named "name.f" are functions corresponding to name. So, phi.f(x) equals $\widehat{\phi}_m(x)$.
fhat.fIs a function such that fhat.f(x) equals $\widehat{f}_m(x)$.
Fhat.fIs a function such that Fhat.f(x) equals $\widehat{F}_m(x)$.
E.fE.f(l,u) $= \int_{l}^{u} \widehat{F}_m(t) dt$
phiPLNumeric vector of length $m$ with values $\widehat{\phi}_m'(x_i-)$
phiPRNumeric vector of length $m$ with values $\widehat{\phi}_m'(x_i+)$
phiPL.fIs a function such that phiPL.f(x) equals $\widehat{\phi}_m'(x-)$.
phiPR.fIs a function such that phiPR.f(x) equals $\widehat{\phi}_m'(x+)$.

References

Duembgen, L, Huesler, A. and Rufibach, K. (2010) Active set and EM algorithms for log-concave densities based on complete and censored data. Technical report 61, IMSV, Univ. of Bern, available at http://arxiv.org/abs/0707.4643.

Duembgen, L. and Rufibach, K. (2009) Maximum likelihood estimation of a log--concave density and its distribution function: basic properties and uniform consistency. Bernoulli, 15(1), 40--68.

Duembgen, L. and Rufibach, K. (2011) logcondens: Computations Related to Univariate Log-Concave Density Estimation. Journal of Statistical Software, 39(6), 1--28. http://www.jstatsoft.org/v39/i06

Examples

Run this code

## estimate gamma density
set.seed(1977)
n <- 200
x <- rgamma(n, 2, 1)
res <- activeSetLogCon(x, w = rep(1 / n, n), print = FALSE)

## plot resulting functions
par(mfrow = c(2, 2), mar = c(3, 2, 1, 2))
plot(res$x, exp(res$phi), type = 'l'); rug(x)
plot(res$x, res$phi, type = 'l'); rug(x)
plot(res$x, res$Fhat, type = 'l'); rug(x)
plot(res$x, res$H, type = 'l'); rug(x)

## compute and plot function values at an arbitrary point
x0 <- (res$x[100] + res$x[101]) / 2
Fx0 <- evaluateLogConDens(x0, res, which = 3)[, "CDF"]
plot(res$x, res$Fhat, type = 'l'); rug(res$x)
abline(v = x0, lty = 3); abline(h = Fx0, lty = 3)

## compute and plot 0.9-quantile of Fhat
q <- quantilesLogConDens(0.9, res)[2]
plot(res$x, res$Fhat, type = 'l'); rug(res$x)
abline(h = 0.9, lty = 3); abline(v = q, lty = 3)

Run the code above in your browser using DataLab