coversim: Confidence Region Coverage

Description

Creates a confidence region and determines whether it covers a corresponding point of interest. Iterates through a user-specified number of trials. Each trial uses either a random dataset generated from user-specified parameters (default) or one column of a user-supplied dataset matrix ('n' samples per column, 'iter' columns), and the function returns the corresponding actual coverage results. See the CRAN website https://CRAN.R-project.org/package=conf for a link to a coversim vignette.
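To make "actual coverage" concrete, the following base R sketch estimates it by Monte Carlo for a normal model. It is not the package's implementation: it assumes a likelihood-ratio confidence region with a chi-square (2 degrees of freedom) cutoff, and the helper name lr_coverage is hypothetical. A trial counts as covered when the true point (mu, sigma) falls inside the region fitted to that trial's sample.

```r
# Monte Carlo estimate of actual coverage for a normal(mu, sigma) model.
# Assumption (not taken from the conf source code): the region is
#   {theta : 2 * (logLik(mle) - logLik(theta)) <= qchisq(1 - alpha, df = 2)}.
lr_coverage <- function(alpha, n, iter, mu, sigma, seed = 1) {
  set.seed(seed)
  cutoff  <- qchisq(1 - alpha, df = 2)
  covered <- logical(iter)
  for (i in seq_len(iter)) {
    x         <- rnorm(n, mean = mu, sd = sigma)
    mu_hat    <- mean(x)
    sigma_hat <- sqrt(mean((x - mu_hat)^2))   # MLE uses divisor n, not n - 1
    ll_hat    <- sum(dnorm(x, mu_hat, sigma_hat, log = TRUE))
    ll_true   <- sum(dnorm(x, mu, sigma, log = TRUE))
    # The true point lies inside the region when the LR statistic is small enough.
    covered[i] <- 2 * (ll_hat - ll_true) <= cutoff
  }
  mean(covered)   # proportion of trials whose region covers (mu, sigma)
}

cov_est <- lr_coverage(alpha = 0.1, n = 30, iter = 2000, mu = 2, sigma = 3)
print(cov_est)
```

With small n the estimate typically falls somewhat below the nominal 1 - alpha, which is the kind of shortfall the exact argument is described as compensating for.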

Usage

coversim(alpha, distn,
                n         = NULL,
                iter      = NULL,
                dataset   = NULL,
                point     = NULL,
                seed      = NULL,
                a         = NULL,
                b         = NULL,
                kappa     = NULL,
                lambda    = NULL,
                mu        = NULL,
                s         = NULL,
                sigma     = NULL,
                theta     = NULL,
                heuristic = 1,
                maxdeg    = 5,
                ellipse_n = 4,
                pts       = FALSE,
                mlelab    = TRUE,
                sf        = c(5, 5),
                mar       = c(4, 4.5, 2, 1.5),
                xlab      = "",
                ylab      = "",
                main      = "",
                xlas      = 0,
                ylas      = 0,
                origin    = FALSE,
                xlim      = NULL,
                ylim      = NULL,
                tol       = .Machine$double.eps ^ 1,
                info      = FALSE,
                returnsamp  = FALSE,
                returnquant = FALSE,
                repair    = TRUE,
                exact     = FALSE,
                showplot  = FALSE,
                delay     = 0 )

Value

If the optional argument info = TRUE is included then a list of coverage results is returned. That list includes alpha value(s), n value(s), coverage and error results per iteration. Additionally, returnsamp = TRUE and/or returnquant = TRUE will result in an n row, iter column matrix of sample and/or sample cdf values.

Arguments

alpha: significance level; scalar or vector; resulting plot illustrates a 100(1 - alpha)% confidence region.
distn: distribution to fit the dataset to; accepted values: 'cauchy', 'gamma', 'invgauss', 'logis', 'llogis', 'lnorm', 'norm', 'unif', 'weibull'.
n: trial sample size (producing each confidence region); scalar or vector; needed if a dataset is not given.
iter: iterations (or replications) of individual trials per parameterization; needed if a dataset is not given.
dataset: a 'n' x 'iter' matrix of dataset values, or a vector of length 'n' (for a single iteration).
point: coverage is assessed relative to this point.
seed: random number generator seed.
a: distribution parameter (when applicable).
b: distribution parameter (when applicable).
kappa: distribution parameter (when applicable).
lambda: distribution parameter (when applicable).
mu: distribution parameter (when applicable).
s: distribution parameter (when applicable).
sigma: distribution parameter (when applicable).
theta: distribution parameter (when applicable).
heuristic: numeric value selecting method for plotting: 0 for elliptic-oriented point distribution, and 1 for smoothing boundary search heuristic.
maxdeg: maximum angle tolerance between consecutive plot segments in degrees.
ellipse_n: number of roughly equidistant confidence region points to plot using the elliptic-oriented point distribution (must be a multiple of four because its algorithm exploits symmetry in the quadrants of an ellipse).
pts: displays confidence region boundary points if TRUE (applies to confidence region plots in which showplot = TRUE).
mlelab: logical argument to include the maximum likelihood estimate coordinate point (default is TRUE, applies to confidence region plots when showplot = TRUE).
sf: significant figures in axes labels specified using sf = c(x, y), where x and y represent the optional digits argument in the R function round as it pertains to the horizontal and vertical labels.
mar: specifies margin values for par(mar = c( )) (see mar in par).
xlab: string specifying the horizontal axis label (applies to confidence region plots when showplot = TRUE).
ylab: string specifying the vertical axis label (applies to confidence region plots when showplot = TRUE).
main: string specifying the plot title (applies to confidence region plots when showplot = TRUE).
xlas: numeric in {0, 1, 2, 3} specifying the style of the horizontal axis labels (see las in par, applies to confidence region plots when showplot = TRUE).
ylas: numeric in {0, 1, 2, 3} specifying the style of the vertical axis labels (see las in par, applies to confidence region plots when showplot = TRUE).
origin: logical argument to include the plot origin (applies to confidence region plots when showplot = TRUE).
xlim: two element vector containing horizontal axis minimum and maximum values (applies to confidence region plots when showplot = TRUE).
ylim: two element vector containing vertical axis minimum and maximum values (applies to confidence region plots when showplot = TRUE).
tol: the uniroot parameter specifying its required accuracy.
info: logical argument to return coverage information in a list; includes alpha value(s), n value(s), coverage and error results per iteration, and returnsamp and/or returnquant when requested.
returnsamp: logical argument; if TRUE returns random samples used in a matrix with n rows, iter cols.
returnquant: logical argument; if TRUE returns random quantiles used in a matrix with n rows, iter cols.
repair: logical argument to repair regions inaccessible using a radial angle from its MLE (multiple root azimuths).
exact: logical argument specifying if alpha value is adjusted to compensate for negative coverage bias in order to achieve (1 - alpha) coverage probability using previously recorded Monte Carlo simulation results; available for limited values of alpha (roughly <= 0.2--0.3), n (typically n = 4, 5, ..., 50) and distributions (distn suffixes: weibull, llogis, norm).
showplot: logical argument specifying if each coverage trial produces a plot.
delay: numeric value of delay (in seconds) between trials so its plot can be seen (applies when showplot = TRUE).
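
As an illustration of the dataset argument, the sketch below builds an 'n' x 'iter' matrix in base R, one trial per column. The call to coversim is guarded so the block also runs without the conf package installed. Passing point as c(mu, sigma), following the horizontal and vertical axis order for the normal distribution, is an assumption here rather than something this page states.

```r
set.seed(42)   # fixed seed so the matrix is reproducible
n    <- 30     # samples per trial (rows)
iter <- 20     # number of trials (columns)

# Each column holds one trial's sample from a normal(mean = 2, sd = 3) population.
dataset <- matrix(rnorm(n * iter, mean = 2, sd = 3), nrow = n, ncol = iter)
dim(dataset)   # 30 20

# Run only if conf is installed.
# Assumption: 'point' is ordered as c(mu, sigma), matching the axes convention.
if (requireNamespace("conf", quietly = TRUE)) {
  conf::coversim(alpha = 0.1, distn = "norm", dataset = dataset, point = c(2, 3))
}
```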

Author

Christopher Weld (ceweld241@gmail.com)

Lawrence Leemis (leemis@math.wm.edu)

Details

Parameterizations for supported distributions are given following the default axes convention in use by crplot and coversim, which are:

Distribution	Horizontal Axis	Vertical Axis
Cauchy	$a$	$s$
gamma	$\theta$	$\kappa$
inverse Gaussian	$\mu$	$\lambda$
log logistic	$\lambda$	$\kappa$
log normal	$\mu$	$\sigma$
logistic	$\mu$	$\sigma$
normal	$\mu$	$\sigma$
uniform	$a$	$b$
Weibull	$\kappa$	$\lambda$

Each respective distribution is defined below.

The Cauchy distribution for the real-numbered location parameter $a$, scale parameter $s > 0$, and real-numbered $x$, has the probability density function $$1 / (s \pi (1 + ((x - a) / s) ^ 2)).$$
The gamma distribution for shape parameter $\kappa > 0$, scale parameter $\theta > 0$, and $x > 0$, has the probability density function $$1 / (\Gamma(\kappa) \theta ^ \kappa) x ^ {(\kappa - 1)} \exp(-x / \theta).$$
The inverse Gaussian distribution for mean $\mu > 0$, shape parameter $\lambda > 0$, and $x > 0$, has the probability density function $$\sqrt{(\lambda / (2 \pi x ^ 3))} \exp( - \lambda (x - \mu) ^ 2 / (2 \mu ^ 2 x)).$$
The log logistic distribution for scale parameter $\lambda > 0$, shape parameter $\kappa > 0$, and $x > 0$, has a probability density function $$(\kappa \lambda) (x \lambda) ^ {(\kappa - 1)} / (1 + (\lambda x) ^ \kappa) ^ 2.$$
The log normal distribution for the real-numbered mean $\mu$ of the logarithm, standard deviation $\sigma > 0$ of the logarithm, and $x > 0$, has the probability density function $$1 / (x \sigma \sqrt{2 \pi}) \exp(-(\log x - \mu) ^ 2 / (2 \sigma ^ 2)).$$
The logistic distribution for the real-numbered location parameter $\mu$, scale parameter $\sigma > 0$, and real-numbered $x$, has the probability density function $$(1 / \sigma) \exp((x - \mu) / \sigma) (1 + \exp((x - \mu) / \sigma)) ^ {-2}.$$
The normal distribution for the real-numbered mean $\mu$, standard deviation $\sigma > 0$, and real-numbered $x$, has the probability density function $$1 / \sqrt{2 \pi \sigma ^ 2} \exp(-(x - \mu) ^ 2 / (2 \sigma ^ 2)).$$
The uniform distribution for real-valued parameters $a$ and $b$ where $a < b$ and $a \le x \le b$, has the probability density function $$1 / (b - a).$$
The Weibull distribution for scale parameter $\lambda > 0$, shape parameter $\kappa > 0$, and $x > 0$, has the probability density function $$\kappa (\lambda ^ \kappa) x ^ {(\kappa - 1)} \exp(-(\lambda x) ^ \kappa).$$
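
Because these parameterizations do not always match base R's, a quick numerical check can help when choosing parameter values. The sketch below codes three of the densities exactly as written above (the helper names dweibull_conf, dllogis_conf and dinvgauss_conf are illustrative, not part of conf) and compares them with base R where a counterpart exists. It suggests that the Weibull $\lambda$ used here corresponds to 1 / scale in dweibull, while the gamma $\theta$ maps directly to scale in dgamma.

```r
# Densities coded term by term from the formulas above (illustrative helpers).
dweibull_conf <- function(x, kappa, lambda) {
  kappa * lambda^kappa * x^(kappa - 1) * exp(-(lambda * x)^kappa)
}
dllogis_conf <- function(x, kappa, lambda) {
  kappa * lambda * (lambda * x)^(kappa - 1) / (1 + (lambda * x)^kappa)^2
}
dinvgauss_conf <- function(x, mu, lambda) {
  sqrt(lambda / (2 * pi * x^3)) * exp(-lambda * (x - mu)^2 / (2 * mu^2 * x))
}

x <- c(0.25, 0.5, 1, 2, 4)

# Weibull: base R's scale plays the role of 1 / lambda in this parameterization.
weibull_ok <- isTRUE(all.equal(dweibull_conf(x, kappa = 2, lambda = 1.5),
                               dweibull(x, shape = 2, scale = 1 / 1.5)))

# Gamma: kappa is the shape and theta is the scale, as in dgamma().
gamma_ok <- isTRUE(all.equal(x^(3 - 1) * exp(-x / 2) / (gamma(3) * 2^3),
                             dgamma(x, shape = 3, scale = 2)))

# Log logistic and inverse Gaussian have no base R density, so check instead
# that each integrates to (approximately) one over x > 0.
llogis_area   <- integrate(dllogis_conf, 0, Inf, kappa = 3, lambda = 0.5)$value
invgauss_area <- integrate(dinvgauss_conf, 0, Inf, mu = 1, lambda = 2)$value

print(c(weibull_ok = weibull_ok, gamma_ok = gamma_ok))
print(c(llogis_area = llogis_area, invgauss_area = invgauss_area))
```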

References

C. Weld, A. Loh, L. Leemis (2020), "Plotting Two-Dimensional Confidence Regions", The American Statistician, Volume 72, Number 2, 156--168.

Examples


## assess actual coverage at alpha = {0.5, 0.1} given n = 30 samples, completing
## 10 trials per parameterization (iter) for a normal(mean = 2, sd = 3) rv
coversim(alpha = c(0.5, 0.1), "norm", n = 30, iter = 10, mu = 2, sigma = 3)

## show plots for 5 iterations of 30 samples each from a Weibull(kappa = 0.5, lambda = 1.5)
coversim(0.5, "weibull", n = 30, iter = 5, lambda = 1.5, kappa = 0.5, showplot = TRUE,
origin = TRUE)
