path.fmrHP: Finite Mixture Effects Model with Heterogeneity Pursuit

Description

Produces solution paths of the regularized finite mixture effects model with a lasso or adaptive lasso penalty, and computes the degrees of freedom, likelihood, and information criteria (AIC, BIC and GIC) of the estimators. Model fitting is conducted by the EM algorithm and Bregman coordinate descent.

Usage

path.fmrHP(y, X, m, equal.var = FALSE, 
           ic.type = "ALL", B = NULL, prob = NULL, rho = NULL, 
           control = list(), modstr = list(), report = FALSE)

Arguments

y

a vector of responses (\(n \times 1\))

X

a matrix of covariates (\(n \times p\))

m

number of mixture components

equal.var

logical; whether the variances of the different components are constrained to be equal

ic.type

the information criterion to be used; currently supports "AIC", "BIC", and "GIC". The default shown in Usage is "ALL".

B

initial values for the rescaled coefficients; the first column is the common effect, and the remaining m columns are the heterogeneity effects for the corresponding components

prob

initial values for the prior probabilities of the different components

rho

initial values for the rho vector (\(1 / \sigma\)), the reciprocals of the component standard deviations

control

a list of parameters for controlling the fitting process

modstr

a list of model parameters controlling the model fitting

report

logical; whether to print the value of the objective function during the EM algorithm, which helps check the validity of the initial values
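
The relation between the rescaled quantities in B and rho and the original regression parameters can be read off the Examples code, where beta is recovered as phi divided by rho. As a sketch, writing \(\beta_j\) and \(\sigma_j\) for the coefficient vector and standard deviation of component \(j\), and \(\phi_j\) for the rescaled coefficients (symbols introduced here for illustration, not taken from the package documentation):

\[
\rho_j = \frac{1}{\sigma_j}, \qquad \phi_j = \rho_j \, \beta_j = \frac{\beta_j}{\sigma_j}, \qquad j = 1, \dots, m .
\]

How B splits each \(\phi_j\) into a common effect and a component-specific heterogeneity term is not spelled out on this page; the reference gives the exact parameterization.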

Value

A list consisting of

lambda

vector of lambda values used in model fitting

lambda.used

vector of lambda values retained after truncation by select.ratio

B.hat

estimated rescaled coefficients (\(p \times (m + 1) \times nlambda\))

pi.hat

estimated prior probabilities (\(m \times nlambda\))

rho.hat

estimated rho values (\(m \times nlambda\))

values of information criteria
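
For orientation, the textbook forms of two of the criteria are sketched below, with \(\ell(\hat\theta)\) the maximized log-likelihood, \(\mathrm{df}\) the degrees of freedom of the estimator, and \(n\) the sample size. This is an assumption about the general shape only; whether the package uses exactly these scalings is not stated on this page.

\[
\mathrm{AIC} = -2\,\ell(\hat\theta) + 2\,\mathrm{df}, \qquad
\mathrm{BIC} = -2\,\ell(\hat\theta) + \log(n)\,\mathrm{df} .
\]

GIC-type criteria usually keep the same structure with a different multiplier on \(\mathrm{df}\); the multiplier used here is not documented on this page.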

Details

Model parameters can be specified through the argument modstr. The available elements include

lambda: A vector of user-specified lambda values with default NULL.
lambda.min.ratio: Smallest value for lambda, as a fraction of lambda.max, the (data derived) entry value.
nlambda: The number of lambda values.
w: Weight matrix for the penalty function. The default is NULL, which means the lasso penalty is used for model fitting.
intercept: Should intercept(s) be fitted (default=TRUE) or set to zero (FALSE).
common.only: A vector of user-specified indicators of the variables only with common effects.
common.no.penalty: A vector of user-specified indicators of the variables with no penalty on the common effect.
cluster.no.penalty: A vector of user-specified indicators of the variables with no penalty on the cluster-specific effects.
select.ratio: A user-specified ratio giving the proportion of variables to be selected.
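
As an illustration of the modstr elements listed above, the following sketch builds a user-specified lambda vector and wraps it in a list. The grid end points (lambda_max, min_ratio) are hypothetical values chosen for the example, and the log-spaced, decreasing layout is a common convention rather than something this page confirms as the package default.

```r
# Hypothetical values, chosen for illustration only.
lambda_max <- 1      # assumed largest lambda in the grid
min_ratio  <- 0.01   # plays the role of lambda.min.ratio
n_lambda   <- 10     # plays the role of nlambda

# Log-spaced grid running from lambda_max down to lambda_max * min_ratio.
lambda_grid <- exp(seq(log(lambda_max),
                       log(lambda_max * min_ratio),
                       length.out = n_lambda))

# Model-structure list, using only element names documented above.
modstr <- list(
  lambda    = lambda_grid,  # user-specified lambda values
  intercept = TRUE          # fit intercept(s)
)
```

The list would then be passed as path.fmrHP(y, X, m = m, modstr = modstr).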

The available elements for argument control include

epsilon: Convergence threshold for the generalized EM algorithm. Default value is 1E-6.
maxit: Maximum number of passes over the data for all lambda values. Default is 1000.
inner.eps: Convergence threshold for the Bregman coordinate descent algorithm. Default value is 1E-6.
inner.maxit: Maximum number of iterations for the Bregman coordinate descent algorithm. Default value is 200.
n.ini: Number of initial values for the EM algorithm. Default is 10. With the EM algorithm, it is preferable to start from several different initial values.
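
The control elements above can be collected in a single list. The sketch below simply spells out the documented defaults; it should behave like control = list() if the defaults on this page are current, which is an assumption.

```r
# Control list with the defaults documented above written out explicitly.
control <- list(
  epsilon     = 1e-6,  # convergence threshold, generalized EM
  maxit       = 1000,  # maximum passes over the data
  inner.eps   = 1e-6,  # convergence threshold, Bregman coordinate descent
  inner.maxit = 200,   # maximum inner iterations
  n.ini       = 10     # number of EM starting values
)
```

It would be supplied as path.fmrHP(y, X, m = m, control = control). Lowering n.ini, as the Examples do with n.ini = 1, reduces the number of EM starting values.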

References

Li, Y., Yu, C., Zhao, Y., Yao, W., Aseltine, R. H., & Chen, K. (2021). Pursuing Sources of Heterogeneity in Modeling Clustered Population.

Examples

# NOT RUN {
library(fmerPack)
## problem settings
n <- 100; m <- 3; p <- 5;
sigma2 <- c(0.1, 0.1, 0.4); rho <- 1 / sqrt(sigma2)
phi <- rbind(c(1, 1, 1), c(1, 1, 1), c(1, 1, 1), c(-3, 3, 0), c(3, 0, -3))
beta <- t(t(phi) / rho)
## generate response and covariates
z <- rmultinom(n, 1, prob= rep(1 / m, m))
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- MASS::mvrnorm(1, mu = rowSums(t(z) * X[, 1:(nrow(beta))] %*% beta), 
                   Sigma = diag(colSums(z * sigma2)))
## lasso
fit1 <- path.fmrHP(y, X, m = m, modstr = list(nlambda = 10), control = list(n.ini = 1))
## adaptive lasso
fit2 <- path.fmrHP(y, X, m = m, 
                   modstr = list(w = abs(select.tuning(fit1)$B + 1e-6)^2))
# }