Learn R Programming

drmdel (version 1.3.2)

densityDRM: Estimate the density of the populations under the DRM

Description

Suppose we have m+1 samples, labeled as \(0, \, 1, \, \ldots, \, m\), whose population distributions satisfy the density ratio model (DRM) (see drmdel for the definition of DRM). The densityDRM function estimates the density, \(dF_k(x)\), of the k\(\textsuperscript{th}\), \(k=0, \, 1, \, \ldots, \, m\), population using a kernel density estimator with weights been the estimated \(dF_k(x)\)'s at the observed data points.

Usage

densityDRM(k, drmfit, interpolation=TRUE, ...)

Arguments

k

the label of the population whose density is to be estimated, k = 0, 1, ..., m. It must be a single value, not a vector.

drmfit

a fitted DRM object (an output from the drmdel function). See drmdel for details.

interpolation

a logical variable to be passed to the function quantileDRM and then ultimately to the function cdfDRM, for estimating the population standard deviations and IQRs required for calculating the default bandwidth for the kernel density estimator. See quantileDRM and cdfDRM for details. The default value is TRUE.

...

further arguments to be passed to the density function, which performs kernel density estimation in R. See help(density) for details. One should not pass "x" and "weights" arguments to the density function, since those are supposed to be extracted automatically from the fitted DRM object, drmfit.

Value

An output from the density function, usually an object with class '"density"'. See help(density) for details.

Details

Note that the default bandwidth for this density estimator is set as that described in Chen and Liu (2013): $$1.06 n_{total}^{-1/5} \min(\sigma_k, \, \mbox{IQR}_k/1.34)$$, where \(n_{total}\) is the total sample size, and \(sigma_k\) and \(IQR_k\) are the standard deviation and inter-quartile range of the estimated CDF \(F_k\), respectively.

If bandwidth (bw) is not specfied as a ... argument passing to the R functin density, the default bandwidth (as described above) will be calculated. That calculation involves the estimation of population quartiles. In this situation, the arguments "interpolation" is passed to quantileDRM for quartile estimations. See quantileDRM for details.

References

K. Fokianos (2004), Merging information for semiparametric density estimation. Journal of the Royal Statistical Society, Series B (Statistical Methodology), 66(4):941-958.

J. Chen and Y. Liu (2013), Quantile and quantile-function estimations under density ratio model. The Annals of Statistics, 41(3):1669-1692.

Examples

Run this code
# NOT RUN {
# Data generation
set.seed(25)
n_samples <- c(100, 200, 180, 150, 175)  # sample sizes
x0 <- rgamma(n_samples[1], shape=5, rate=1.8)
x1 <- rgamma(n_samples[2], shape=12, rate=1.2)
x2 <- rgamma(n_samples[3], shape=12, rate=1.2)
x3 <- rgamma(n_samples[4], shape=18, rate=5)
x4 <- rgamma(n_samples[5], shape=25, rate=2.6)
x <- c(x0, x1, x2, x3, x4)

# Fit a DRM with the basis function q(x) = (x, log(abs(x))),
# which is the basis function for gamma family. This basis
# function is the built-in basis function 6.
drmfit <- drmdel(x=x, n_samples=n_samples, basis_func=6)

# Estimate the density of population 3 under the DRM
dens_pop3 <- densityDRM(k=3, drmfit=drmfit)

# Plot the estimated density
plot(dens_pop3, main=bquote(F[3]), ylim=range(c(0, 0.5)))

# Add the empirical kernel density estimation curve of F_3
# based on the third sample on the above density plot
lines(density(x3), col="blue", lty="28F8")

# Add the true density curve of F_3 on the above density
# plot
lines(seq(min(dens_pop3$y), max(dens_pop3$x), 0.01),
      dgamma(seq(min(dens_pop3$y), max(dens_pop3$x), 0.01),
             18, 5),
      type="l", col="red", lty="dotted")

legend(9, 0.5,
       legend=c("DRM density estimator",
                "Empirical kernel density estimator",
                "True density"),
       col=c("black", "blue", "red"),
       lty=c("solid", "28F8", "dotted"))
# }

Run the code above in your browser using DataLab