cggm_refit: Refit the Gaussian Graphical Model for a Given Aggregation and Sparsity Structure

Description

Estimate the parameters of a clustered and sparse precision matrix or covariance matrix by minimizing a restricted negative log-likelihood loss function. The restrictions are given by the provided aggregation and sparsity structure. Unlike cggm(), this function applies no aggregation or sparsity penalties to the precision or covariance matrix.
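To make the idea of a restricted loss concrete, the following base R sketch evaluates the standard Gaussian negative log-likelihood (up to additive constants and scaling) for a precision matrix whose entries are tied together by a fixed clustering and zero pattern. It is only an illustration and not the package's implementation: the helper names neg_loglik() and build_restricted_theta(), the four-parameter layout, and the use of the population covariance in place of a sample covariance are all assumptions made for this example.

```r
# Gaussian negative log-likelihood in terms of the precision matrix,
# up to additive constants and scaling: -log det(Theta) + tr(S Theta)
neg_loglik <- function(Theta, S) {
  eigenvalues <- eigen(Theta, symmetric = TRUE, only.values = TRUE)$values
  if (any(eigenvalues <= 0)) return(Inf)  # not positive definite
  -sum(log(eigenvalues)) + sum(diag(S %*% Theta))
}

# Hypothetical restriction (illustrative only): variables 1-2 and 3-4 form
# two clusters, entries within a cluster share one diagonal and one
# off-diagonal value, and entries between the clusters are fixed at zero
build_restricted_theta <- function(par) {
  # par = (diagonal 1, off-diagonal 1, diagonal 2, off-diagonal 2)
  Theta <- matrix(0, nrow = 4, ncol = 4)
  Theta[1:2, 1:2] <- par[2]
  Theta[3:4, 3:4] <- par[4]
  diag(Theta) <- rep(par[c(1, 3)], each = 2)
  Theta
}

# Precision matrix that satisfies the restriction, and its covariance
Theta_true <- build_restricted_theta(c(2, 1, 4, 1))
S <- solve(Theta_true)

# The loss depends only on the four free parameters; no penalty is added
loss_true <- neg_loglik(Theta_true, S)
loss_other <- neg_loglik(build_restricted_theta(c(2.5, 0.5, 3, 1)), S)
loss_true < loss_other
```

Because the structure is fixed in advance, only the free parameters are estimated, which is why no additional shrinkage enters the loss.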

Usage

cggm_refit(cggm_output, verbose = 0)

Value

An object of class "CGGM_refit" with the following components:

A, R: Lists of matrices. Each pair of matrices with the same index parametrizes the estimated precision matrix after the refitting step, given the aggregation structure found with the corresponding value of the aggregation parameter lambda_cpath (and the sparsity structure found with the value of the sparsity parameter lambda_lasso). Using these directly is not recommended; instead, use the accessor function get_Theta() to extract the estimated precision matrix for a given index of the aggregation parameter.
clusters: An integer matrix in which each row contains the cluster assignment of each variable for the corresponding value of the aggregation parameter lambda_cpath. Use the accessor function get_clusters() to extract the cluster assignment for a given index of the aggregation parameter.
lambdas: A vector with the values for the aggregation parameter lambda_cpath for which the CGGM loss function has been minimized.
Theta: List of matrices. Contains the solution to the minimization procedure for each value of the aggregation parameter lambda_cpath. Using these directly is not recommended; instead, use the accessor function get_Theta() to extract the estimated precision matrix for a given index of the aggregation parameter.
cluster_counts: An integer vector containing the number of clusters obtained for each value of the aggregation parameter lambda_cpath.
cluster_solution_index: An integer vector containing the index of the value of the aggregation parameter lambda_cpath for which a certain number of clusters was attained. For example, cluster_solution_index[2] yields the index of the smallest value for lambda_cpath for which a solution with two clusters was found. Contains -1 if there is no value for lambda_cpath with that number of clusters.
n: The number of values of the aggregation parameter lambda_cpath for which the CGGM loss function was minimized.
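
The relation between cluster_counts and cluster_solution_index described above can be illustrated with a small base R sketch. The values below are hypothetical and are not produced by the package; the sketch also assumes that the values of lambda_cpath are stored in increasing order, so that the first matching index corresponds to the smallest value.

```r
# Hypothetical number of clusters for five increasing values of lambda_cpath
cluster_counts <- c(4L, 4L, 3L, 3L, 1L)
p <- 4L  # number of variables, so between 1 and 4 clusters are possible

# For each possible number of clusters k, store the first index at which
# k clusters occur, or -1 if no value of lambda_cpath yields k clusters
cluster_solution_index <- vapply(seq_len(p), function(k) {
  hits <- which(cluster_counts == k)
  if (length(hits) > 0L) hits[1L] else -1L
}, integer(1))
cluster_solution_index

# Check for -1 before using an entry as an index
k <- 2
idx <- cluster_solution_index[k]
has_solution <- idx != -1L
if (!has_solution) message("No solution with ", k, " clusters was found")
```

In this made-up sequence no solution with two clusters exists, so the corresponding entry is -1 and should not be passed on as an index.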

Arguments

cggm_output: An object of class "CGGM" as returned by cggm().
verbose: Determines the amount of information printed during the optimization. Defaults to 0.

Author

Daniel J.W. Touw

References

D.J.W. Touw, A. Alfons, P.J.F. Groenen and I. Wilms (2025) Clusterpath Gaussian Graphical Modeling. arXiv:2407.00644. doi:10.48550/arXiv.2407.00644.

Examples


# Generate data
set.seed(3)
Theta <- matrix(
  c(2, 1, 0, 0,
    1, 2, 0, 0,
    0, 0, 4, 1,
    0, 0, 1, 4),
  nrow = 4
)
X <- mvtnorm::rmvnorm(n = 100, sigma = solve(Theta))

# Compute the sample covariance matrix
S <- cov(X)

# Compute the weight matrix for the clusterpath (clustering) weights
W_cpath <- clusterpath_weights(S, phi = 1, k = 2)

# Compute the weight matrix for the lasso (sparsity) weights
W_lasso <- lasso_weights(S)

# Set values to be used for the aggregation parameter
lambdas <- seq(0, 0.2, by = 0.01)

# Estimate the precision matrix while automatically expanding
# the sequence of values for the aggregation parameter
fit <- cggm(S, W_cpath = W_cpath, lambda_cpath = lambdas,
            W_lasso = W_lasso, lambda_lasso = 0.2,
            expand = TRUE)

# Apply the refitting step to the results, estimating the
# precision matrix based on the clustering and sparsity
# patterns but without additional shrinkage
refit <- cggm_refit(fit)

# A solution with 2 clusters
keep <- refit$cluster_solution_index[2]
get_Theta(refit, index = keep)
get_clusters(refit, index = keep)
