diagnose_influence: Diagnose Influential Points in Linear Models (Cook's Distance)
Description
Fits a linear model between two variables and calculates Cook's Distance to identify
influential points. An influential point is an observation whose removal would
noticeably change the fitted regression line (its slope, its intercept, or both).
Arguments
Character. The name of the dependent variable (Y).
predictor
Character. The name of the independent variable (X).
cutoff
Numeric (Optional). A custom threshold for Cook's Distance.
If NULL, it defaults to 4/n.
Details
Cook's distance (\(D_i\)) measures how much the fitted values change when observation
\(i\) is deleted and the model is refit. A large \(D_i\) indicates high influence, which
typically arises when a point combines high leverage with a large residual.
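For reference, the textbook form of Cook's distance for a least-squares fit is given below. The symbols are not defined elsewhere on this page and are introduced here as assumptions: \(e_i\) is the residual of observation \(i\), \(h_{ii}\) is its leverage (the \(i\)-th diagonal element of the hat matrix), \(s^2\) is the residual mean square, and \(p\) is the number of estimated coefficients (\(p = 2\) for a model with an intercept and one predictor).
$$D_i = \frac{e_i^2}{p\,s^2} \cdot \frac{h_{ii}}{(1 - h_{ii})^2}$$
The first factor grows with the size of the residual and the second with leverage, so a point needs some of both to reach a large \(D_i\).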
The default threshold for detection is calculated as:
$$\text{Threshold} = \frac{4}{n}$$
where \(n\) is the number of observations used to fit the model. This is a common
rule of thumb in regression diagnostics, not a formal statistical test.
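The procedure described above can be sketched in base R as follows. This is a minimal illustration, not the package's actual implementation: the helper name `flag_influential_sketch`, the argument names `data` and `response`, and the toy data are all assumptions made for this example. Only `stats::lm()`, `stats::reformulate()`, `stats::cooks.distance()`, and `stats::nobs()` are real base R functions.

```r
# Hypothetical helper illustrating the documented behaviour (not the real source).
flag_influential_sketch <- function(data, response, predictor, cutoff = NULL) {
  # Build the formula response ~ predictor from the two column names
  model <- stats::lm(stats::reformulate(predictor, response = response), data = data)
  d <- stats::cooks.distance(model)   # one Cook's distance per observation
  n <- stats::nobs(model)             # number of observations used in the fit
  if (is.null(cutoff)) cutoff <- 4 / n  # default rule-of-thumb threshold
  list(cooks_d = d, cutoff = cutoff, influential = which(d > cutoff))
}

# Toy data (assumed): ten points close to y = 2x, plus one point far off the trend
noise <- c(0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1)
toy <- data.frame(x = c(1:10, 20), y = c(2 * (1:10) + noise, 5))

res <- flag_influential_sketch(toy, response = "y", predictor = "x")
res$cutoff        # 4 / 11 for this data set
res$influential   # indices of observations whose Cook's distance exceeds the cutoff
```

In this toy data the eleventh observation sits far from the other x values and far below the trend, so it is among the flagged points. Passing an explicit `cutoff`, for example `cutoff = 1`, replaces the default 4/n threshold.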