Find the "best" fully-factorized approximation to the
posterior distribution of the coefficients, with linear regression
likelihood and mixture-of-normals priors on the coefficients. By
"best", we mean the approximating distribution that locally minimizes
the Kullback-Leibler divergence between the approximating distribution
and the exact posterior. In the original formulation (see
varbvs
), each regression coefficient was drawn
identically from a spike-and-slab prior. Here, we instead formulate
the ``slab'' as a mixture of normals.
varbvsmix(X, Z, y, sa, sigma, w, alpha, mu, update.sigma, update.sa,
update.w, w.penalty, drop.threshold = 1e-8, tol = 1e-4,
maxiter = 1e4, verbose = TRUE)
An object with S3 class c("varbvsmix","list")
.
Number of data samples used to fit model.
Posterior mean regression coefficients for covariates, including intercept.
If TRUE
, residual variance parameter
sigma
was fit to data.
If TRUE
, mixture variances were fit to data.
If TRUE
, mixture weights were fit to data.
Penalty used for updating mixture weights.
Posterior probabiltiy threshold used in the optimization procedure for setting mixture weights to zero.
Fitted or user-specified residual variance parameter.
User-specified mixture variances.
Fitted or user-specified mixture weights.
Variational estimates of posterior mixture assignent probabilities.
Variational estimates of posterior mean coefficients.
Variational estimates of posterior variances.
Local false sign rate (LFSR) for each variable computed from variational estimates of posterior assignment probabilities and posterior means and variances. See Stephens (2017) for a definition of the LFSR.
Variational lower bound to marginal log-likelihood at each iteration of the co-ordinate ascent algorithm.
Maximum difference in the variational posterior probabilities at each iteration of the co-ordinate ascent algorithm.
Number of nonzero mixture components (including the "spike") at each iteration of the co-ordinate ascent algorithm.
n x p input matrix, where n is the number of samples, and p is the number of variables. X cannot be sparse, and cannot have any missing values (NA).
n x m covariate data matrix, where m is the number of
covariates. Do not supply an intercept as a covariate (i.e., a
column of ones), because an intercept is automatically included in
the regression model. For no covariates, set Z = NULL
.
Vector of length n containing values of the continuous outcome.
Vector specifying the prior variance of the regression
coefficients (scaled by sigma
) for each mixture
component. The variance of the first mixture component is the
"spike", and therefore should be exactly zero.
Residual variance parameter. If missing, it is automatically fitted to the data by computing an approximate maximum-likelihood estimate.
If missing, it is automatically fitted to the data by computing an approximate maximum-likelihood estimate.
Initial estimates of the approximate posterior mixture assignment probabilities. These should be specified as a p x K matrix, where K is the number of mixture components. Each row must add up to 1.
Initial estimates of the approximate regression coefficients conditioned on being drawn from each of the K mixture components. These estimates should be provided as a p x K matrix, where K is the number of mixture components.
If TRUE
, sigma is fitted to data using an
approximate EM algorithm, in which case argument sigma
, if
provided, is the initial estimate.
Currently, estimate of mixture component variances is
not implemented, so this must be set to TRUE
, otherwise an
error will be generated.
If TRUE
, mixture weights are fitted using an
approximate EM algorithm, in which case argument w
, if
provided, is the initial estimate.
Penalty term for the mixture weights. It is useful
for "regularizing" the estimate of w
when we do not have a
lot of information. It should be a vector with one positive entry
for each mixture component. Larger values place more weight on the
corresponding mixture components. It is based on the Dirichlet
distribution with parameters w.penalty
. The default is a
vector of ones, which reduces to a uniform prior on w
.
Posterior probability threshold for dropping
mixture components. Should be a positive number close to zero. If,
at any point during the optimization, all posterior mixture
assignment probabilities for a given mixture component k
are
less than drop.threshold
, the mixture weight for component
k
is automatically set to zero. Set drop.threshold
to
zero to disable this behaviour. Setting larger values for
drop.threshold
may improve computation speed at a small cost
to numerical accuracy of the final results.
Convergence tolerance for co-ordinate ascent updates.
Maximum number of co-ordinate ascent iterations.
If verbose = TRUE
, print progress of algorithm
to console.
Peter Carbonetto peter.carbonetto@gmail.com
M. Stephens (2017). False discovery rates: a new deal. Biostatistics 18, 275--294.
varbvs
# Generate the data set.
set.seed(1)
n <- 200
p <- 500
X <- randn(n,p)
sd <- c(0,0.2,0.5)
w <- c(0.9,0.05,0.05)
k <- sample(length(w),p,replace = TRUE,prob = w)
beta <- sd[k] * rnorm(p)
y <- c(X %*% beta + rnorm(n))
# Fit the model to the data.
fit <- varbvsmix(X,NULL,y,sd^2)
if (FALSE) {
library(lattice)
print(xyplot(beta.est ~ beta.true,
data.frame(beta.true = beta,
beta.fitted = rowSums(fit$alpha * fit$mu)),
pch = 20,col = "royalblue",cex = 1))
}
Run the code above in your browser using DataLab