CookDistance allows the user to identify those subjects with a greater influence on the predicted values or on the estimation of the
fixed effects for the treatment group, based on the calculation of Cook's distances.
CookDistance(
model,
type = "fitted",
cook_thr = NA,
label_angle = 0,
maxIter = 1000,
verbose = TRUE
)
A plot of the Cook's distance value for each subject, indicating those subjects
whose Cook's distance is greater than cook_thr.
If saved to a variable, the function returns a vector with the Cook's distances for each subject.
model: An object of class "lme" representing the linear mixed-effects model fitted by lmmModel().
type: Type of Cook's distance to calculate. Possible options are "fitted", to calculate Cook's distances based on the change in fitted values, or "fixef", to calculate Cook's distances based on the change in the fixed effects. See the Details section for more information.
cook_thr: Numeric value indicating the threshold for the Cook's distance. If not specified, the threshold is set to three times the mean of the Cook's distance values.
label_angle: Numeric value indicating the angle for the label of subjects with a Cook's distance greater than cook_thr.
maxIter: Maximum number of iterations for the optimization algorithm. Defaults to 1000.
verbose: Logical indicating whether the subjects with a Cook's distance greater than cook_thr should be printed to the console.
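The default threshold rule described for cook_thr (three times the mean of the Cook's distance values) can be illustrated with a minimal base R sketch. The distances below are made-up numbers for five hypothetical subjects, not output of CookDistance().

```r
# Hypothetical Cook's distances for five subjects (illustrative values only)
cook_d <- c(a = 0.02, b = 0.05, c = 0.40, d = 0.03, e = 0.04)

# Default rule described above: three times the mean Cook's distance
default_thr <- 3 * mean(cook_d)

# Subjects whose Cook's distance exceeds the threshold
flagged <- cook_d[cook_d > default_thr]
print(default_thr)
print(names(flagged))
```

Passing an explicit cook_thr replaces this default with a fixed cut-off.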
The identification of influential subjects is based on the calculation of Cook's distances. The Cook's distances can be calculated based on the change in fitted values or fixed effects.
Cook's distances based on the change in fitted values
When type = "fitted", the Cook's distances
are calculated as the normalized change in fitted response values due to the removal of a subject from the model.
First, leave-one-subject-out models are fitted, each one omitting a single subject. Then, the Cook's
distance for subject \(i\), \(D_i\), is calculated as:
$$D_i=\frac{\sum_{j=1}^n\Bigl(\hat{y}_{j}-\hat{y}_{j_{(-i)}}\Bigr)^2}{p\cdot MSE}$$
where \(\hat{y}_j\) is the \(j^{th}\) fitted response value using the complete model, and \(\hat{y}_{j_{(-i)}}\) is the \(j^{th}\) fitted response value obtained using the model where subject \(i\) has been removed.
The denominator of the expression is the product of two terms: the number of fixed-effects coefficients, \(p\), which, under the assumption that the design matrix is of full rank, is equivalent to the rank of the design matrix, and the mean square error (\(MSE\)) of the model, which normalizes the Cook's distance.
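The calculation for type = "fitted" can be sketched outside the package as follows. This is only an illustration under stated assumptions, not the internal code of CookDistance(): it uses the Orthodont data shipped with nlme instead of a tumor growth model, compares population-level predictions (level = 0) so that fitted values exist for the removed subject, and takes the squared residual standard deviation of the full model as the MSE.

```r
library(nlme)

# Illustrative data set shipped with nlme (an assumption, not SynergyLMM data)
dat <- as.data.frame(Orthodont)
dat$Subject <- factor(as.character(dat$Subject))

# Full model: random intercept per subject
full <- lme(distance ~ age, random = ~ 1 | Subject, data = dat)

# Population-level fitted values from the complete model
y_hat <- as.numeric(predict(full, newdata = dat, level = 0))

p   <- length(fixef(full))  # number of fixed-effects coefficients
mse <- full$sigma^2         # assumption: residual variance used as MSE

subjects <- levels(dat$Subject)
cook_fitted <- setNames(numeric(length(subjects)), subjects)

for (s in subjects) {
  # Leave-one-subject-out refit
  dat_minus <- droplevels(dat[dat$Subject != s, ])
  reduced <- lme(distance ~ age, random = ~ 1 | Subject, data = dat_minus)

  # Fitted values for all observations from the reduced model
  y_hat_minus <- as.numeric(predict(reduced, newdata = dat, level = 0))

  # Normalized squared change in fitted values
  cook_fitted[s] <- sum((y_hat - y_hat_minus)^2) / (p * mse)
}

head(sort(cook_fitted, decreasing = TRUE))
```

Sorting the resulting vector in decreasing order puts the most influential subjects first.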
Cook's distances based on the change in fixed effects values
The identification of the subjects with a greater influence on the estimated fixed effects is based on the calculation of Cook's distances, as described in Gałecki and Burzykowski (2013). To compute the Cook's distance for the fixed effect estimates (i.e., the contribution of each subject to the coefficients of its treatment group), first a matrix containing the leave-one-subject-out estimates of the fixed effects is calculated. Then, the Cook's distances are calculated according to:
$$D_i \equiv \frac{(\hat{\beta} - \hat{\beta}_{(-i)})^{\top}[\widehat{Var(\hat{\beta})}]^{-1}(\hat{\beta} - \hat{\beta}_{(-i)})}{p}$$
where \(\beta\) represents the vector of fixed effects, \(\hat{\beta}\) is its estimate from the complete model, and \(\hat{\beta}_{(-i)}\) is the estimate of the parameter vector \(\beta\) obtained by fitting the model to the data with the \(i\)-th subject excluded. The denominator of the expression is equal to the number of the fixed-effects coefficients, \(p\), which, under the assumption that the design matrix is of full rank, is equivalent to the rank of the design matrix.
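The fixed-effects version of the distance can be sketched in the same spirit. Again this is a hedged illustration rather than the package's implementation: it assumes the Orthodont data from nlme as a stand-in model and uses the estimated covariance matrix of the fixed effects from the complete model, obtained with vcov().

```r
library(nlme)

# Illustrative data set shipped with nlme (an assumption, not SynergyLMM data)
dat <- as.data.frame(Orthodont)
dat$Subject <- factor(as.character(dat$Subject))

full <- lme(distance ~ age, random = ~ 1 | Subject, data = dat)

beta_hat <- fixef(full)        # fixed-effects estimates, complete model
V_inv    <- solve(vcov(full))  # inverse of the estimated Var(beta_hat)
p        <- length(beta_hat)   # number of fixed-effects coefficients

# Matrix of leave-one-subject-out fixed-effects estimates
subjects <- levels(dat$Subject)
beta_loo <- matrix(NA_real_, nrow = length(subjects), ncol = p,
                   dimnames = list(subjects, names(beta_hat)))

for (s in subjects) {
  dat_minus <- droplevels(dat[dat$Subject != s, ])
  reduced <- lme(distance ~ age, random = ~ 1 | Subject, data = dat_minus)
  beta_loo[s, ] <- fixef(reduced)
}

# Quadratic form (beta_hat - beta_(-i))' V^{-1} (beta_hat - beta_(-i)) / p
cook_fixef <- apply(beta_loo, 1, function(b) {
  d <- beta_hat - b
  as.numeric(t(d) %*% V_inv %*% d) / p
})

head(sort(cook_fixef, decreasing = TRUE))
```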
Andrzej Gałecki & Tomasz Burzykowski (2013). Linear Mixed-Effects Models Using R: A Step-by-Step Approach. First Edition. Springer, New York. ISBN 978-1-4614-3899-1
# Load the example data
data(grwth_data)
# Fit the model
lmm <- lmmModel(
data = grwth_data,
sample_id = "subject",
time = "Time",
treatment = "Treatment",
tumor_vol = "TumorVolume",
trt_control = "Control",
drug_a = "DrugA",
drug_b = "DrugB",
combination = "Combination"
)
# Calculate Cook's distances for each subject
CookDistance(model = lmm)
# Change the Cook's distance threshold
CookDistance(model = lmm, cook_thr = 0.15)