stdCoeff will calculate fully standardised coefficients in
standard deviation units for a fitted model or list of models. It achieves
this via adjusting the 'raw' model coefficients, so no standardisation of
input variables is required beforehand. Users can simply specify the model
with all variables in their original units and the function will do the
rest. However, the user is free to scale and/or centre any input variables
should they choose, which should not affect the outcome of standardisation
(provided any scaling is by standard deviations). This may be desirable in
some cases, such as to increase numerical stability during model fitting
when variables are on widely different scales.
If arguments cen.x or cen.y are TRUE, model estimates
will be calculated as if all predictors (x) and/or the response variable
(y) were mean-centred prior to model-fitting (including any dummy variables
arising from categorical predictors). Thus, for an ordinary linear model
where centring of x and y is specified, the intercept will be zero, i.e. the
mean (or weighted mean) of the centred y. In addition, if cen.x = TRUE and there
are interacting terms in the model, all coefficients for lower order terms
of the interaction are adjusted using an expression which ensures that each
main effect or lower order term is estimated at the mean values of the
terms it interacts with (zero in a 'centred' model) - typically improving
the interpretation of coefficients. The expression used comprises a
weighted sum of all the coefficients that contain the lower order term,
with the weight for the term itself being zero and those for 'containing'
terms being the product of the means of the other variables involved in
that term (i.e. those not in the lower order term itself). For example, for
a three-way interaction (x1 * x2 * x3), the expression for main effect
\(\beta_{1}\) would be:
$$\beta_{1} + \beta_{12} \bar{x}_{2} + \beta_{13} \bar{x}_{3} +
\beta_{123} \bar{x}_{2} \bar{x}_{3}$$ (adapted from here)
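The adjustment for lower order terms can be checked numerically. The following is a minimal base R sketch, not the function's internal code: it assumes simulated data and illustrative object names (d, dc, b1_adj, b1_cen), applies the expression above to the raw coefficients of a three-way interaction model, and compares the result with the x1 coefficient from a model refitted on mean-centred predictors.

```r
# Simulated data with non-zero predictor means (all names are illustrative)
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n, 2), x2 = rnorm(n, -1), x3 = rnorm(n, 3))
d$y <- with(d, 1 + 0.5 * x1 - 0.3 * x2 + 0.2 * x3 +
              0.4 * x1 * x2 + 0.1 * x1 * x2 * x3 + rnorm(n))

# Raw fit: the x1 coefficient is the effect of x1 when x2 = x3 = 0
b <- coef(lm(y ~ x1 * x2 * x3, data = d))

# Adjust the x1 main effect so it is evaluated at the means of x2 and x3
m2 <- mean(d$x2)
m3 <- mean(d$x3)
b1_adj <- b[["x1"]] + b[["x1:x2"]] * m2 + b[["x1:x3"]] * m3 +
  b[["x1:x2:x3"]] * m2 * m3

# Reference: refit the same model with mean-centred predictors
dc <- d
dc[c("x1", "x2", "x3")] <- lapply(d[c("x1", "x2", "x3")],
                                  function(v) v - mean(v))
b1_cen <- coef(lm(y ~ x1 * x2 * x3, data = dc))[["x1"]]

# The two values should agree up to numerical tolerance
all.equal(b1_adj, b1_cen)
```

The agreement holds because the centred model is a reparameterisation of the raw one: both span the same set of terms, so the fitted values are identical and only the meaning of the lower order coefficients changes.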
In addition, if std.x = TRUE or unique.x = TRUE (see below),
product terms for interactive effects will be recalculated using
mean-centred variables, to ensure that standard deviations and variance
inflation factors (VIF) for predictors are calculated correctly (the model
must be re-fit for this latter purpose, to recalculate the
variance-covariance matrix).
If std.x = TRUE, coefficients are standardised by multiplying by the
standard deviations of predictor variables (or terms), while if std.y = TRUE
they are divided by the standard deviation of the response. If the
model is a GLM, this latter is calculated using the link-transformed
response (or an estimate of same) generated using the function getY.
If both arguments are TRUE, the coefficients are regarded as 'fully'
standardised in the traditional sense, often referred to as 'betas'.
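As an illustration of the scaling step for an ordinary linear model, the base R sketch below multiplies raw slopes by the predictor standard deviations and divides by the response standard deviation, then compares the result with a model fitted to z-scored variables. The data and object names (d, dz, beta, beta_ref) are assumptions made for the example; it does not call stdCoeff itself and does not cover the GLM case.

```r
# Simulated data on very different scales (names are illustrative)
set.seed(2)
n <- 100
d <- data.frame(x1 = rnorm(n, 10, 5), x2 = rnorm(n, 0, 0.1))
d$y <- 3 + 0.8 * d$x1 - 20 * d$x2 + rnorm(n, sd = 4)

# Raw slopes from a model fitted in the original units (intercept dropped)
b <- coef(lm(y ~ x1 + x2, data = d))[-1]

# Standardise: multiply by sd(x), divide by sd(y)
sx <- sapply(d[c("x1", "x2")], sd)
sy <- sd(d$y)
beta <- b * sx / sy

# Reference: slopes from a model fitted to centred and scaled variables
dz <- as.data.frame(scale(d))
beta_ref <- coef(lm(y ~ x1 + x2, data = dz))[-1]

# Both routes should give the same 'betas'
all.equal(beta, beta_ref)
```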
If unique.x = TRUE (default), coefficients are adjusted for
multicollinearity among predictors by dividing by the square root of the
VIFs (Dudgeon 2016, Thompson et al. 2017). If they have also been
standardised by the standard deviations of x and y, this converts them to
semipartial correlations, i.e. the correlation between the unique
components of predictors (residualised on other predictors) and the
response variable. This measure of effect size is arguably much more
interpretable and useful than the traditional standardised coefficient, as
it is always estimated independently of other predictors and so can more
readily be compared both within and across models. Values range from zero
to +/-1 rather than +/- infinity (as in the case of betas) - putting them
on the same scale as the bivariate correlation between predictor and
response. In the case of GLMs, however, the measure is analogous but not
exactly equal to the semipartial correlation, so its values may not always
be bounded between +/-1 (such cases are likely rare). Crucially, for ordinary
linear models, the square of the semipartial correlation equals the
increase in R-squared when that variable is added last in the model -
directly linking the measure to model fit and 'variance explained'. See
here for additional arguments in favour of the use of semipartial correlations.
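The relationships described above can be verified for a simple two-predictor linear model. This is a hedged base R sketch with simulated data and illustrative names (full, beta1, vif1, sr1). As a simplifying assumption, the VIF is obtained here from an auxiliary regression of x1 on x2 rather than from the variance-covariance matrix.

```r
# Simulated data with correlated predictors (names are illustrative)
set.seed(3)
n <- 150
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)
y <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n)

full <- lm(y ~ x1 + x2)

# Traditional standardised coefficient ('beta') for x1
beta1 <- coef(full)[["x1"]] * sd(x1) / sd(y)

# VIF for x1: 1 / (1 - R-squared of x1 regressed on the other predictor)
vif1 <- 1 / (1 - summary(lm(x1 ~ x2))$r.squared)

# Adjust for multicollinearity: divide by the square root of the VIF
sr1 <- beta1 / sqrt(vif1)

# Check 1: correlation of y with x1 residualised on x2
sr1_ref <- cor(y, resid(lm(x1 ~ x2)))

# Check 2: squared value equals the R-squared gained by adding x1 last
r2_gain <- summary(full)$r.squared - summary(lm(y ~ x2))$r.squared

all.equal(sr1, sr1_ref)
all.equal(sr1^2, r2_gain)
```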
If refit.x = TRUE, the model will be re-fit with any (newly-)centred
continuous predictors. This will occur (and will normally be desired) when
cen.x and unique.x are TRUE and there are interaction
terms in the model, in order to calculate correct VIFs from the var-cov
matrix. However, re-fitting may not be necessary in some cases - for
example, where predictors have already been centred (and their values will
not subsequently be resampled during bootstrapping) - and disabling this
option may save time with larger models and/or bootstrap runs.
If r.squared = TRUE, R-squared values are also returned via the R2 function.
Finally, if weights are specified, the function calculates a
weighted average of the standardised coefficients across models (Burnham &
Anderson 2002).
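A weighted average of this kind might look as follows in base R. This sketch assumes Akaike weights as the model weights, which is only one possible choice, and a term (x1) that appears in every candidate model. How the function treats terms absent from some models is not shown here, and all object names are illustrative.

```r
# Simulated data and two candidate models (names are illustrative)
set.seed(4)
n <- 120
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 0.5 * d$x1 + 0.2 * d$x2 + rnorm(n)

mods <- list(m1 = lm(y ~ x1, data = d),
             m2 = lm(y ~ x1 + x2, data = d))

# Standardised coefficient for x1 in each model
beta_x1 <- sapply(mods, function(m) coef(m)[["x1"]] * sd(d$x1) / sd(d$y))

# Akaike weights (an assumed choice of weights for this example)
aic <- sapply(mods, AIC)
delta <- aic - min(aic)
w <- exp(-delta / 2) / sum(exp(-delta / 2))

# Weighted average of the standardised coefficient across models
beta_avg <- sum(w * beta_x1)
beta_avg
```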