Learn R Programming

swaglm (version 0.0.1)

summary.swaglm: summary.swaglm

Description

summary.swaglm

Usage

# S3 method for swaglm
summary(object, ...)

Value

A list of class summary_swaglm with five elements:

mat_selected_model

A matrix where each row represents a selected model. Columns give the indices of the variables included in that model. Models are padded with NA to match the largest model size.

mat_beta_selected_model

A matrix containing the estimated regression coefficients (including intercept) of the selected models, stacked across all dimensions. Only models with AIC less than or equal to the lowest median AIC across all dimensions are included. Each row corresponds to a model, columns correspond to the coefficients. Rows are padded with NA to match the largest model size.

mat_p_value_selected_model

A matrix containing the p-values associated with the estimated regression coefficients in beta_selected_model, stacked in the same order. Each row corresponds to a model, columns correspond to the coefficients. Rows are padded with NA to match the largest model size.

vec_aic_selected_model

A numeric vector containing the AIC values of all models in mat_selected_model, stacked across dimensions. These are the AIC values for the selected models that passed the threshold described above.

lst_estimated_beta_per_variable

A named list where each element corresponds to a variable (named V<index>). Each element is a numeric vector containing all estimated beta coefficients for that variable across all selected models in which it appears. This summarizes the distribution of effects for each variable across the selected models.

lst_p_value_per_variable

A named list where each element corresponds to a variable (named V<index>). Each element is a numeric vector containing all estimated p-values for that variable across all selected models in which it appears.

Model selection criterion:

For each model dimension (number of variables in the model), the median AIC across all models of that dimension is computed. The smallest median AIC across all dimensions is identified. Then, all models with AIC less than or equal to this value are selected. This ensures that only relatively well-performing models across all dimensions are retained for summarization.

Arguments

object

An object of class swaglm.

...

Additional parameters

Examples

Run this code
set.seed(12345)
n <- 2000
p <- 100
# create design matrix and vector of coefficients
Sigma <- diag(rep(1/p, p))
X <- MASS::mvrnorm(n = n, mu = rep(0, p), Sigma = Sigma)
beta = c(-15,-10,5,10,15, rep(0,p-5))

# --------------------- generate from logistic regression with an intercept of one
z <- 1 + X%*%beta
pr <- 1/(1 + exp(-z))
y <- as.factor(rbinom(n, 1, pr))
y = as.numeric(y)-1

# define swag parameters
quantile_alpha = .15
p_max = 20
swaglm_obj = swaglm::swaglm(X=X, y = y, p_max = p_max, family = stats::binomial(),
                          alpha = quantile_alpha, verbose = TRUE, seed = 123)
swaglm_obj
swaglm_summ = summary(swaglm_obj)
swaglm_summ$mat_selected_model
swaglm_summ$mat_beta_selected_model
swaglm_summ$mat_p_value_selected_model
swaglm_summ$vec_aic_selected_model
swaglm_summ$lst_estimated_beta_per_variable


# plot distribution of estimated beta with respect to true value
for(i in seq_along(swaglm_summ$lst_estimated_beta_per_variable)){
  # get variable name

  var_name_i = names(swaglm_summ$lst_estimated_beta_per_variable)[i]
  var_index_i =  as.numeric(substr(var_name_i, 2, nchar(var_name_i)))
  boxplot(swaglm_summ$lst_estimated_beta_per_variable[[i]], 
          main=paste0(var_name_i, " ", "true value=", beta[i]))
}

Run the code above in your browser using DataLab