Model selection by a forward / backward-stepping algorithm. The algorithm reduces the degrees of freedom of an existing 'lmvar' object. It searches for the subset of degrees of freedom that results in an optimal goodness-of-fit. This is the subset for which a user-specified function reaches its minimum.
# S3 method for lmvar_no_fit
fwbw(object, fun, fw = FALSE, counter = TRUE,
df_percentage = 0.05, control = list(), ...)
object
Object of class 'lmvar_no_fit' (hence it can also be of class 'lmvar').
fun
User-specified function which measures the goodness-of-fit. See 'Details'.
fw
Boolean. If TRUE, the search starts with a minimum number of degrees of freedom ('forward search'). If FALSE, the search starts with the full model ('backward search').
counter
Boolean. If TRUE and fw = TRUE, the algorithm also carries out backward steps (attempts to remove degrees of freedom) while searching for the optimal subset. If FALSE and fw = TRUE, the algorithm only carries out forward steps (attempts to insert degrees of freedom). The effect of counter is the opposite if fw = FALSE.
df_percentage
Percentage of degrees of freedom that the algorithm attempts to remove at a backward step, or insert at a forward step. Must be a number between 0 and 1.
control
List of control options. The following options can be set:
monitor
Boolean. If TRUE, information about the attempted removals and insertions is printed during the run. Default is FALSE.
plot
Boolean. If TRUE, a plot is shown at the end of the run. It shows how the value of fun decreases during the run. Default is FALSE.
...
Additional arguments, for compatibility with the fwbw generic.
A list with the following members:
object
An object of class 'lmvar' which contains the model for which fun is minimized.
fun
The minimum value of the user-specified function fun.
The function fwbw selects the subset of all the degrees of freedom present in object for which the user-specified function fun is minimized. This function is supposed to be a measure of the goodness-of-fit. Typical examples would be fun = AIC or fun = BIC. Another example is a fun which measures the prediction error, determined by cross-validation or otherwise.
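As a hedged sketch of such a prediction-error criterion, one could wrap the package's cross-validation routine cv.lmvar. The field MSE$mean in its return value is an assumption here, so consult ?cv.lmvar before relying on this:

```r
# Sketch of a prediction-error criterion for 'fun', based on k-fold
# cross-validation. The structure of the return value of cv.lmvar
# (the field 'MSE$mean') is an assumption; check ?cv.lmvar.
cv_mse <- function(fit){
  cv = cv.lmvar(fit, k = 5)  # 5-fold cross-validation of the fit
  cv$MSE$mean                # smaller mean squared error = better fit
}
# fwbw(fit, cv_mse)
```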
The function fwbw is intended for situations in which the number of degrees of freedom in object is so large that it is not feasible to go through all possible subsets systematically to find the smallest value of fun. Instead, the algorithm generates subsets by removing degrees of freedom from the current-best subset (a 'backward' step) and reinserting degrees of freedom that were previously removed (a 'forward' step). Whenever a backward or forward step results in a subset for which fun is smaller than for the current-best subset, the new subset becomes current-best.
The start set depends on the argument fw. If fw = TRUE, the algorithm starts with only two degrees of freedom: one for the expected values and one for the standard deviations. These are the intercept terms, if object contains them. If fw = FALSE (the default), the algorithm starts with all degrees of freedom present in object.
At a backward step, the algorithm removes degrees of freedom from the current-best subset. It removes at least 1 degree of freedom and at most df_percentage of the degrees in the current-best subset. The degrees that are removed are the ones with the largest p-values (p-values can be seen with the function summary.lmvar). If the removal results in a larger value of fun, the algorithm tries again by halving the number of degrees of freedom it removes.
At a forward step, the algorithm inserts degrees of freedom that are present in object but left out of the current-best subset. It inserts at least 1 degree of freedom and at most df_percentage of the degrees in the current-best subset. It inserts those degrees of freedom which are estimated to increase the likelihood most. If the insertion results in a larger value of fun, the algorithm tries again by halving the number of degrees of freedom it inserts.
If counter = FALSE, the algorithm is 'greedy': it will only carry out forward steps when fw = TRUE, or backward steps when fw = FALSE.
The algorithm stops when neither a backward nor a forward step results in a lower value of fun. It returns the current-best model and the minimum value of fun.
The function fun must be a measure of the goodness-of-fit. It must take one argument: an object of class 'lmvar'. Its return value must be a single number. A smaller (more negative) number must represent a better fit. During the run, a fit to the data is carried out for each new subset of degrees of freedom. The result of the fit is an object of class 'lmvar'. This object is passed on to fun to evaluate the goodness-of-fit. Typical examples for fun are AIC.lmvar and BIC.
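A minimal sketch of a valid user-defined fun, assuming only the generic logLik method for 'lmvar' objects and the package's dfree function (the penalty weight 3 is purely illustrative):

```r
# A user-defined criterion: takes one 'lmvar' object, returns a single
# number, smaller = better. A penalty of 3 per degree of freedom is
# illustrative only (AIC uses 2, BIC uses log(n)).
aic3 <- function(fit){
  -2 * as.numeric(logLik(fit)) + 3 * dfree(fit)
}
# fwbw(fit, aic3)
```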
When the control option monitor is equal to TRUE, information about the progress of the run is displayed. The following information is displayed:
Iteration
A counter whose first value is always 0, followed by 1. From then on, the counter is increased whenever the addition or removal of degrees of freedom results in a function value smaller than the smallest so far.
attempted removals/insertions
The number of degrees of freedom that the algorithm attempts to remove or insert.
function value
The value of the user-specified function fun after the removal or insertion of the degrees of freedom.
The last column shows the word insert when the attempt concerns the insertion of degrees of freedom. When nothing is shown, the algorithm attempted to remove degrees of freedom.
If object was created with intercept_mu = TRUE, the intercept term for the expected values is not removed by fwbw.lmvar. Likewise for intercept_sigma.
When a new subset of degrees of freedom is generated by either a backward or a forward step, the response vector in object is fitted to the new model. The fit is carried out by lmvar. The arguments used in the call to lmvar (other than X_mu and X_sigma) are the same as those used to create object, except that the control options mu_full_rank and sigma_full_rank are both set to TRUE. Setting them to TRUE is safe because the model matrices object$X_mu and object$X_sigma are guaranteed to be full-rank.
fwbw for the S3 generic method.
fwbw.lm for the corresponding function for an 'lm' object.
lmvar for the constructor of an 'lmvar' object.
lmvar_no_fit for the constructor of an 'lmvar_no_fit' object.
dfree for the number of degrees of freedom.
# NOT RUN {
# Generate model matrices
set.seed(1820)
n_rows = 1000
n_cols = 4
X_mu = matrix(sample(-9:9, n_rows * n_cols, replace = TRUE), nrow = n_rows, ncol = n_cols)
X_sigma = matrix(sample(-9:9, n_rows * n_cols, replace = TRUE), nrow = n_rows, ncol = n_cols)
column_names = sapply(1:n_cols, function(i_column){paste("column", i_column, sep = "_")})
colnames(X_mu) = column_names
colnames(X_sigma) = paste(column_names, "_s", sep = "")
# Generate betas
beta_mu = sample(c(-1,-0.5, 0.5, 1), n_cols + 1, replace = TRUE)
beta_sigma = sample(c(-1,-0.5, 0.5, 1), n_cols + 1, replace = TRUE)
# Generate response vector
mu = X_mu %*% beta_mu[-1] + beta_mu[1]
log_sigma = X_sigma %*% beta_sigma[-1] + beta_sigma[1]
y = rnorm( n_rows, mean = mu, sd = exp(log_sigma))
# Add columns for cross-terms to model matrices. They have no predictive power for the response y.
X_mu = model.matrix(~ . + 0 + column_1 * ., data = as.data.frame(X_mu))
X_sigma = model.matrix(~ . + 0 , data = as.data.frame(X_sigma))
c( colnames(X_mu), colnames(X_sigma))
# Create lmvar object
fit = lmvar(y, X_mu, X_sigma)
# Check whether backward- / forward step model selection with BIC as criterion manages
# to remove cross-terms
fwbw = fwbw(fit, BIC, control = list(monitor = TRUE))
names(coef(fwbw$object))
# The same with AIC as criterion
fwbw = fwbw(fit, AIC, control = list(monitor = TRUE))
names(coef(fwbw$object))
# Model selection starting with an intercept term only.
fwbw = fwbw(fit, BIC, fw = TRUE)
names(coef(fwbw$object))
# It also works on an object of class 'lmvar_no_fit'
no_fit = lmvar_no_fit(y, X_mu, X_sigma)
fwbw( no_fit, AIC, control = list(monitor = TRUE))
# }