famos: Automated Model Selection

Description

Given a vector containing all parameters of interest and a cost function, FAMoS searches for the subset model that best describes the given data.

Usage

famos(init.par, fit.fn, nr.of.data, homedir = getwd(),
  do.not.fit = NULL, method = "forward", init.model.type = "random",
  refit = FALSE, optim.runs = 5, information.criterion = "AICc",
  default.val = NULL, swap.parameters = NULL,
  critical.parameters = NULL, random.borders = 1,
  control.optim = list(maxit = 1000), parscale.pars = FALSE,
  con.tol = 0.01, save.performance = TRUE, future.off = FALSE,
  log.interval = 600, ...)

Arguments

init.par

A named vector containing the initial parameter values.

fit.fn

A cost function. It has to take the complete parameter vector as an input (the argument needs to be named parms) and must return twice the negative log-likelihood (-2LL, see Burnham and Anderson 2002). The binary vector, which indicates which parameters are currently fitted, can also be accessed by adding binary as a further input argument of the function.
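
As a minimal sketch of the shape such a function might take, the example below accepts the full named vector parms together with binary. The linear model, the data arguments x.vals and y.vals, the unit-variance Gaussian error model and the name cost_with_binary are illustrative assumptions and not part of the package. It is also assumed that binary holds one 0/1 entry per element of parms, in the same order.

```r
# Hypothetical cost function; model and data arguments are illustrative only.
cost_with_binary <- function(parms, binary, x.vals, y.vals) {
  # Assumption: 'binary' has one 0/1 entry per element of 'parms', same order.
  active <- names(parms)[binary == 1]
  pred <- parms[["a"]] * x.vals
  # Only evaluate the offset term when 'b' is part of the current model.
  if ("b" %in% active) pred <- pred + parms[["b"]]
  rss <- sum((y.vals - pred)^2)
  # -2LL for independent Gaussian errors with unit variance (assumed)
  length(y.vals) * log(2 * pi) + rss
}

x <- 1:5
y <- 2 * x + 1
cost_with_binary(c(a = 2, b = 1), binary = c(1, 1), x.vals = x, y.vals = y)
```

Additional data arguments such as x.vals and y.vals would be handed to the cost function through the ... argument of famos.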

nr.of.data

The number of data points used for fitting.

homedir

The directory in which the results should be saved.

do.not.fit

The names of the parameters that are not supposed to be fitted. Default is NULL.

method

The starting method of FAMoS. Options are "forward" (forward search), "backward" (backward elimination) and "swap" (only if critical.parameters or swap.parameters are supplied). The method is changed adaptively over the iterations of FAMoS. Defaults to "forward".

init.model.type

The starting model. Options are "global" (starts with the complete model) or "random" (creates a randomly sampled starting model). Alternatively, a specific model can be used by supplying the names of the parameters one wants to start with. Defaults to "random".

refit

If TRUE, previously tested models will be tested again. Defaults to FALSE.

optim.runs

The number of times that each model will be fitted by optim. Defaults to 5.

information.criterion

The information criterion on which the model selection will be based. Options are "AICc", "AIC" and "BIC". Defaults to "AICc".
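
For orientation, the three criteria can be written in terms of the -2LL value returned by fit.fn, the number of fitted parameters k and the number of data points n (nr.of.data). The sketch below uses the textbook definitions; the helper name info_criterion is hypothetical, and whether famos computes the values in exactly this way internally is an assumption.

```r
# Textbook definitions; 'info_criterion' is a hypothetical helper.
# neg2LL = -2 * log-likelihood, k = fitted parameters, n = data points
info_criterion <- function(neg2LL, k, n, type = c("AICc", "AIC", "BIC")) {
  type <- match.arg(type)
  aic <- neg2LL + 2 * k
  switch(type,
         AIC  = aic,
         # small-sample correction, requires n > k + 1
         AICc = aic + (2 * k * (k + 1)) / (n - k - 1),
         BIC  = neg2LL + k * log(n))
}

info_criterion(neg2LL = 10, k = 2, n = 7, type = "AICc")
```

The AICc correction term grows quickly when k approaches n, which is why it is usually preferred for small data sets.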

default.val

A named list containing the values that the non-fitted parameters should take. If NULL, all non-fitted parameters will be set to zero. Each default value can be given either as a numeric value or as the name of the parameter from which the value should be inherited (NOTE: in this case, the entry of the parameter that is inherited from has to contain a numeric value). Defaults to NULL.
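
The following sketch illustrates one way to read this argument. The helper resolve_defaults is hypothetical, and it assumes that an inherited value is taken from the source parameter's current value (its fitted value if it is part of the model, otherwise its numeric default); the package may resolve inheritance differently.

```r
inits <- c(p1 = 3, p2 = 4, p3 = -2)
# p3 inherits its value from p1; p1 and p2 have numeric defaults
defaults <- list(p1 = 0, p2 = 1, p3 = "p1")

# Hypothetical helper: fill in the values of all non-fitted parameters.
resolve_defaults <- function(par, fitted, default.val) {
  unfitted <- setdiff(names(par), fitted)
  # first pass: numeric defaults
  for (nm in unfitted) {
    if (is.numeric(default.val[[nm]])) par[[nm]] <- default.val[[nm]]
  }
  # second pass: values inherited from another parameter (assumed semantics)
  for (nm in unfitted) {
    if (is.character(default.val[[nm]])) par[[nm]] <- par[[default.val[[nm]]]]
  }
  par
}

# Only p2 is fitted: p1 takes its default and p3 inherits it
resolve_defaults(inits, fitted = "p2", default.val = defaults)
```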

swap.parameters

A list specifying which parameters are interchangeable. Each swap set is given as a vector containing the names of the respective parameters. Defaults to NULL.

critical.parameters

A list specifying sets of critical parameters. A critical set is a set of parameters of which at least one has to be present in each tested model. Defaults to NULL.
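
The constraint expressed by critical sets can be sketched as a simple membership check. The helper satisfies_critical and the parameter names are hypothetical and only illustrate the rule stated above; they are not functions of the package.

```r
# Two critical sets: at least one of p1/p2, and p4, must be in every model
crit <- list(c("p1", "p2"), c("p4"))

# Hypothetical helper: TRUE if every critical set has a member in the model.
satisfies_critical <- function(model, critical.parameters) {
  all(vapply(critical.parameters,
             function(set) any(set %in% model),
             logical(1)))
}

satisfies_critical(c("p1", "p4"), crit)  # both sets are represented
satisfies_critical(c("p1", "p3"), crit)  # p4 is missing
```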

random.borders

The ranges from which the random initial parameter values are sampled for every optim run after the first. Can be given either as a vector containing the relative deviations for all parameters or as a matrix containing the lower border values in its first column and the upper border values in its second column. By default, parameters are sampled uniformly via runif. Alternatively, functions such as rnorm, rchisq, etc. can be used if the additional arguments are passed along as well. Defaults to 1 (100% deviation of all parameters).
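
A rough sketch of the two numeric forms is given below. It assumes that a relative deviation d translates into the interval from par - d * |par| to par + d * |par|; this interpretation, as well as the helper name sample_start, is an assumption made for illustration and may differ from the package internals.

```r
# Hypothetical helper: draw random starting values for one optim run.
sample_start <- function(par, random.borders = 1) {
  if (is.matrix(random.borders)) {
    # matrix form: column 1 = lower borders, column 2 = upper borders
    lower <- random.borders[, 1]
    upper <- random.borders[, 2]
  } else {
    # relative deviation (assumed): 1 means +/- 100% of the current value
    lower <- par - random.borders * abs(par)
    upper <- par + random.borders * abs(par)
  }
  out <- runif(length(par), min = lower, max = upper)
  names(out) <- names(par)
  out
}

set.seed(1)
sample_start(c(p1 = 3, p2 = -2))                        # within [0, 6] and [-4, 0]
sample_start(c(p1 = 3, p2 = -2), cbind(c(0, 0), c(1, 1)))  # within [0, 1] each
```

Under this reading, a parameter whose current value is zero would always be sampled as zero in the relative form, so the matrix form may be the safer choice for such parameters.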

control.optim

Control parameters passed along to optim. For more details, see optim. Defaults to list(maxit = 1000).

parscale.pars

Logical. If TRUE, the parscale option will be used when fitting with optim. This can help to speed up the fitting procedure if the parameter values are on different scales. Defaults to FALSE.

con.tol

The relative convergence tolerance. famos will rerun optim until the relative improvement between the current and the last fit is less than con.tol. Default is set to 0.01, meaning the fitting will terminate if the improvement is less than 1% of the previous value.
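
The stopping rule can be sketched as a loop around optim. The function fit_until_converged, its max.rounds safeguard and the toy objective are illustrative assumptions; the actual refitting logic inside famos may differ in detail.

```r
# Hypothetical helper: rerun optim until the relative improvement is small.
fit_until_converged <- function(par, fn, con.tol = 0.01, max.rounds = 50, ...) {
  fit <- optim(par, fn, ...)
  for (i in seq_len(max.rounds)) {      # max.rounds is an assumed safeguard
    refit <- optim(fit$par, fn, ...)    # restart from the last optimum
    denom <- abs(fit$value)
    rel.improvement <- if (denom > 0) abs(fit$value - refit$value) / denom else 0
    fit <- refit
    if (rel.improvement < con.tol) break
  }
  fit
}

# Toy objective with its minimum value 1 at (1, 2)
toy <- function(p) sum((p - c(1, 2))^2) + 1
fit <- fit_until_converged(c(0, 0), toy, con.tol = 0.01)
fit$value
```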

save.performance

Logical. If TRUE, the performance of FAMoS will be evaluated in each iteration via famos.performance, which will save the corresponding plot into the folder "FAMoS-Results/Figures/" (starting from iteration 3) and simultaneously show it on screen. Defaults to TRUE.

future.off

Logical. If TRUE, FAMoS runs without the use of futures. Useful for debugging. Defaults to FALSE.

log.interval

The interval (in seconds) at which FAMoS reports its current status, i.e. which models are still running and how much time has passed. Defaults to 600 (= 10 minutes).

...

Other arguments that will be passed along to future, optim or the user-specified cost function fit.fn.

Value

A list containing the following elements:

IC: The value of the information criterion of the best model.
par: The values of the fitted parameter vector corresponding to the best model.
IC.type: The type of information criterion used.
binary: The binary information of the best model.
vector: Vector indicating which parameters were fitted in the best model.
total.models.tested: The total number of different models that were analysed. May include repeats.
mrun: The number of the current FAMoS run.
initial.model: The first model evaluated by the FAMoS run.

Details

In each iteration, FAMoS determines all neighbouring models of the current model under the current method and tests them. If one of the tested models performs better than the current model, the model is updated while the method is kept. Otherwise, the model is kept and the method is changed adaptively, depending on the previously used methods.
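
To make the notion of neighbouring models more concrete, the sketch below enumerates candidates for each method. It assumes that "forward" adds one absent parameter, "backward" removes one present parameter, and "swap" exchanges a present parameter for an absent one from the same swap set. The helper neighbours is hypothetical, ignores do.not.fit and critical.parameters, and is not the package's implementation.

```r
# Hypothetical helper: list neighbouring models as vectors of parameter names.
neighbours <- function(model, all.pars, method, swap.parameters = NULL) {
  absent <- setdiff(all.pars, model)
  switch(method,
    forward  = lapply(absent, function(p) c(model, p)),      # add one parameter
    backward = lapply(model, function(p) setdiff(model, p)), # drop one parameter
    swap = {
      out <- list()
      for (set in swap.parameters) {
        for (p.in in intersect(set, model)) {
          for (p.out in intersect(set, absent)) {
            # exchange p.in for p.out within the same swap set
            out[[length(out) + 1]] <- c(setdiff(model, p.in), p.out)
          }
        }
      }
      out
    })
}

all.pars <- paste0("p", 1:5)
current <- c("p1", "p3")
neighbours(current, all.pars, "forward")
neighbours(current, all.pars, "swap", swap.parameters = list(c("p1", "p5")))
```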

Examples

# NOT RUN {
future::plan(future::sequential)

#setting data
x.values <- 1:7
y.values <-  3^2 * x.values^2 - exp(2 * x.values)

#define initial conditions and corresponding test function
inits <- c(p1 = 3, p2 = 4, p3 = -2, p4 = 2, p5 = 0)

cost_function <- function(parms, x.vals, y.vals){
 #reject parameter sets outside the allowed range
 if(max(abs(parms)) > 5){
   return(NA)
 }
 with(as.list(c(parms)), {
   res <- 4*p1 + p2^2*x.vals^2 + p3*sin(x.vals) + p4*x.vals - exp(p5*x.vals)
   #return the sum of squared residuals as the cost
   sum((res - y.vals)^2)
 })
}

#set swap set
swaps <- list(c("p1", "p5"))

#perform model selection
res <- famos(init.par = inits,
            fit.fn = cost_function,
            nr.of.data = length(y.values),
            homedir = getwd(),
            method = "swap",
            swap.parameters = swaps,
            init.model.type = c("p1", "p3"),
            optim.runs = 1,
            x.vals = x.values,
            y.vals = y.values)
# }