robust_covariance_gv: Robust Covariance Estimation Based on Geometric Variability

Description

Computes a robust covariance matrix for a weighted dataset by trimming on geometric variability. Observations are ranked by a proximity function that measures how far each one lies from the rest of the data. The fraction alpha of observations with the most extreme proximity values is discarded, and the covariance matrix is computed from the remaining, most central subset.
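The ranking-and-trimming idea described above can be sketched in base R. This is an illustration under stated assumptions, not the package's implementation: it assumes squared Euclidean distances, takes the proximity of observation i to be the weighted mean squared distance to all observations minus the geometric variability, and uses the (1 - alpha) quantile of phi as the threshold. The function name trimmed_cov_sketch is hypothetical.

```r
# Sketch only. The distance (squared Euclidean), the form of phi, and the
# quantile rule are assumptions; dbrobust may differ on any of them.
trimmed_cov_sketch <- function(X, w, alpha) {
  X <- as.matrix(X)
  w <- w / sum(w)                        # normalize weights to sum to 1
  D2 <- as.matrix(dist(X))^2             # pairwise squared Euclidean distances
  V <- 0.5 * sum(outer(w, w) * D2)       # geometric variability of the data
  phi <- as.vector(D2 %*% w) - V         # proximity of each observation to the data
  q <- unname(quantile(phi, 1 - alpha))  # trimming threshold
  central <- which(phi <= q)
  outliers <- which(phi > q)
  wc <- w[central] / sum(w[central])     # renormalize weights on the central subset
  S <- cov.wt(X[central, , drop = FALSE], wt = wc, method = "ML")$cov
  list(S = S, central_idx = central, outlier_idx = outliers, phi = phi, q = q)
}

# Toy data: 20 standard normal points plus one gross outlier in row 21
set.seed(1)
X <- rbind(matrix(rnorm(40), 20, 2), c(10, 10))
res <- trimmed_cov_sketch(X, rep(1, 21), alpha = 0.10)
res$outlier_idx
round(res$S, 4)
```

With squared Euclidean distances this phi reduces to the squared distance from each observation to the weighted mean, so the sketch trims the points farthest from the center. Other distances would rank observations differently.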

Usage

robust_covariance_gv(X, w, alpha)

Value

A list containing:

S: Robust covariance matrix of dimension p x p.
central_idx: Indices of observations selected as the central subset.
outlier_idx: Indices of observations considered outliers.
phi: Proximity function values for all observations.
q: Threshold value used for trimming (quantile of phi).

Arguments

X: Numeric matrix of dimension n x p, where n is the number of observations and p is the number of variables.
w: Numeric vector of weights of length n. Weights will be normalized to sum to 1.
alpha: Numeric trimming proportion between 0 and 1 (e.g., 0.05, 0.10, 0.15) indicating the fraction of most extreme observations to discard.
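The argument requirements above can be checked before calling the function. The helper below is a hypothetical sketch, not part of dbrobust: it coerces a data frame to a numeric matrix, checks that w has one entry per row, and normalizes the weights. It assumes non-negative weights and alpha strictly inside (0, 1), which the documentation does not state explicitly.

```r
# Hypothetical helper (not a dbrobust function): coerce and validate inputs.
# Assumes non-negative weights and alpha strictly between 0 and 1.
prepare_inputs <- function(X, w, alpha) {
  X <- as.matrix(X)                      # data frame -> matrix
  stopifnot(is.numeric(X),               # all columns must be numeric
            is.numeric(w), length(w) == nrow(X),
            all(w >= 0), sum(w) > 0,
            is.numeric(alpha), length(alpha) == 1,
            alpha > 0, alpha < 1)
  list(X = X, w = w / sum(w), alpha = alpha)
}

# Usage with a built-in dataset: 20 rows, 4 numeric columns, uniform weights
inp <- prepare_inputs(iris[1:20, 1:4], rep(1, 20), 0.10)
dim(inp$X)
sum(inp$w)
```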

Examples

# Load a small subset of the example dataset
data("Data_HC_contamination", package = "dbrobust")
Data_small <- Data_HC_contamination[1:20, ]

# Select only continuous variables
cont_vars <- names(Data_small)[1:4]
Data_cont <- Data_small[, cont_vars]

# Set uniform weights and trimming proportion
weights <- rep(1, nrow(Data_cont))
alpha <- 0.10

# Compute robust covariance with trimming
res <- dbrobust::robust_covariance_gv(Data_cont, weights, alpha)

# Inspect results: central observations, outliers, covariance, threshold, proximity
res$central_idx
res$outlier_idx
round(res$S, 4)
res$q
round(res$phi[1:10], 4)
