rEB.proc: Relevance-Integrated Empirical Bayes Inference

Description

Performs custom-tailored empirical Bayes inference via LASERs.

Usage

rEB.proc(X, z, X.target, z.target, m = c(4, 6), nbag = NULL, centering = TRUE,
	lp.reg.method = "lm", coef.smooth = "BIC", nsample = min(length(z),2000),
	theta.set.prior = NULL, theta.set.post = NULL, LP.type = "L2",
	g.method = "DL", sd0 = NULL, m.EB = 8, parallel = FALSE,
	avg.method = "mean", post.curve = "HPD", post.alpha = 0.8,
	color = "red", ...)

Arguments

X

An $n$-by-$d$ matrix of covariate values.

z

A length $n$ vector containing observations of the target random variable.

X.target

A length $d$ vector providing the set of covariates for the target case.

z.target

The target $z$ to investigate.

m

An ordered pair. The first number indicates how many LP-nonparametric basis functions to construct for each $X$; the second indicates how many to construct for $z$.

nbag

Number of bags of parametric bootstrapped samples to use; set to NULL to disable.

centering

Whether to perform regression-adjustment to center the data, default is TRUE.

lp.reg.method

Method for estimating the relevance function and its conditional LP-Fourier coefficients. Three options are currently supported: lm (inbuilt with subset selection), glmnet, and knn.

coef.smooth

Specifies the method to use for LP coefficient smoothing (AIC or BIC). Uses BIC by default.

nsample

Number of relevance samples generated for the target case.

theta.set.prior

This indicates the set of grid points to compute prior density.

theta.set.post

This indicates the set of grid points to compute posterior density.

LP.type

User selects either "L2" for LP-orthogonal series representation of relevance density function $d$ or "MaxEnt" for the maximum entropy representation. Default is L2.

g.method

Suggested method for finding parameter estimates $\hat{\mu}$ and $\hat{\tau}^2$ for the normal prior: "DL" uses the DerSimonian and Laird technique; "SJ" uses Sidik-Jonkman; "REML" uses restricted maximum likelihood; and "MoM" uses a method of moments technique.

sd0

Fixed standard deviation for $z|\theta$. Default is NULL, in which case the standard error is calculated from the data.

m.EB

The truncation point reflecting the concentration of the true nonparametric prior density $\pi$ around the known prior distribution $g$.

parallel

Whether to use parallel computing to obtain the relevance samples. Mainly useful for very large nsample; default is FALSE.

avg.method

For parametric bootstrapping, this specifies how the results from different bags are aggregated ("mean" or "median").

post.curve

For plotting, this specifies what to show on the posterior curve: "HPD" shows the HPD interval, "band" shows the confidence band.

post.alpha

Confidence level to use when plotting posterior confidence band, or the alpha level for HPD interval.

color

The color of the plots.

...

Extra parameters to pass to other functions. Currently only supports the arguments for knn().
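The "DL" option of g.method refers to the DerSimonian and Laird moment estimator. As an illustration only, the sketch below computes that estimator in base R for observations with known standard errors. The helper name dl_estimate and the simulated data are assumptions made for this example; this is not the package's internal code, which may differ in detail.

```r
# Hypothetical helper (not part of the package): DerSimonian-Laird
# estimates of mu and tau^2 for a normal prior g = N(mu, tau^2),
# given observations z with known standard errors s.
dl_estimate <- function(z, s) {
  k <- length(z)
  w <- 1 / s^2                            # inverse-variance (fixed-effect) weights
  mu_fe <- sum(w * z) / sum(w)            # fixed-effect pooled mean
  Q <- sum(w * (z - mu_fe)^2)             # Cochran's Q statistic
  denom <- sum(w) - sum(w^2) / sum(w)
  tau2 <- max(0, (Q - (k - 1)) / denom)   # moment estimate, truncated at zero
  w_re <- 1 / (s^2 + tau2)                # random-effects weights
  mu <- sum(w_re * z) / sum(w_re)         # re-weighted estimate of mu
  c(mu = mu, tau2 = tau2)
}

# Simulated data, for illustration only.
set.seed(1)
theta <- rnorm(500, mean = 1, sd = 2)     # latent effects
s <- rep(1, 500)                          # known unit standard errors
z <- rnorm(500, mean = theta, sd = s)
est <- dl_estimate(z, s)
print(est)
```

With equal standard errors, as in this simulation, the estimate of $\tau^2$ reduces to the sample variance of $z$ minus the common sampling variance, which makes the sketch easy to check by hand.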

Value

A list containing the following items:

result

Contains relevant empirical Bayes prior and posterior results.

sd0

Initial estimate for null standard errors.

prior

Relevant empirical Bayes prior results.

$g.par

Parameters for $g=N(\mu,\tau^2)$.

$g.method

Method used for finding the parameter estimates $\hat{\mu}$ and $\hat{\tau}^2$ for $g$.

$LP.coef

Reports the LP-coefficients of the relevance function $d_x(x)$.

posterior

Relevant empirical Bayes posterior results.

$post.mode

Posterior mode for $\pi(\theta|z,\boldsymbol{x})$.

$post.mean

Posterior mean for $\pi(\theta|z,\boldsymbol{x})$.

$post.mean.sd

Standard error for the posterior mean, when using parametric bootstrap.

$HPD.interval

The HPD interval for posterior $\pi(\theta|z,\boldsymbol{x})$.

$post.alpha

Same as input post.alpha.

plots

The plots for prior and posterior density.
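To make the $HPD.interval entry concrete, here is a minimal base R sketch of how a highest posterior density interval can be read off a density evaluated on a grid such as theta.set.post. The helper hpd_grid is hypothetical and is not the package's implementation. The sketch assumes an equally spaced grid and a unimodal density, and it treats the level (0.8 here, matching the default post.alpha) as the coverage of the interval.

```r
# Hypothetical helper (not part of the package): grid-based HPD interval.
# Assumes theta is an equally spaced grid and dens is unimodal.
hpd_grid <- function(theta, dens, level = 0.8) {
  mass <- dens / sum(dens)                   # probability mass per grid point
  ord <- order(mass, decreasing = TRUE)      # highest-density points first
  n_keep <- which(cumsum(mass[ord]) >= level)[1]
  keep <- ord[seq_len(n_keep)]               # smallest set reaching the level
  range(theta[keep])                         # interval endpoints
}

# Illustration with a known density, so the answer can be checked.
theta <- seq(-2, 5, length.out = 2001)
dens <- dnorm(theta, mean = 1.5, sd = 0.5)
hpd <- hpd_grid(theta, dens, level = 0.8)
print(hpd)
```

For a symmetric unimodal density the HPD interval coincides with the equal-tailed interval, so the result here should agree with 1.5 plus or minus 0.5 * qnorm(0.9), up to the grid spacing.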

References

Mukhopadhyay, S., and Wang, K. (2021) "On The Problem of Relevance in Statistical Inference". <arXiv:2004.09588>

Examples

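# NOT RUN {
# Assumes rEB.proc() and the funnel data come from the LPRelevance package.
library(LPRelevance)
data(funnel)
X <- funnel$x
z <- funnel$z
# Target case: covariate value and observed z to investigate.
X.target <- 60
z.target <- 4.49
rEB.out <- rEB.proc(X, z, X.target, z.target, m = c(4, 8),
	theta.set.prior = seq(-2, 2, length.out = 200),
	theta.set.post = seq(-2, 5, length.out = 200),
	centering = TRUE, m.EB = 6, nsample = 1000)
# Posterior and prior density plots.
rEB.out$plots$rEB.post
rEB.out$plots$rEB.prior
# }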