safs_initial
Ancillary simulated annealing functions
Built-in functions related to simulated annealing
Usage
safs_initial(vars, prob = 0.2, ...)
safs_perturb(x, vars, number = floor(vars*.01) + 1)
safs_prob(old, new, iteration = 1)
caretSA
rfSA
treebagSA
Arguments
 vars
 the total number of possible predictor variables
 prob
 The probability that an individual predictor is included in the initial predictor set
 x
 the integer index vector for the current subset
 old, new
 fitness values associated with the current and new subset
 iteration
 the number of iterations overall or the number of iterations since restart (if improve is used in safsControl)
 number
 the number of predictor variables to perturb
 ...
 not currently used
Details
These functions are used with the functions argument of the safsControl function. More information on the details of these functions is at http://topepo.github.io/caret/SA.html.
The initial function is used to create the first predictor subset. With the default prob = 0.2, the function safs_initial randomly selects roughly 20% of the predictors. Note that, instead of a function, safs can also accept a vector of column numbers as the initial subset.
safs_perturb is an example of the operation that changes the subset configuration at the start of each new iteration. By default, it will change roughly 1% of the variables in the current subset.
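The perturbation step described above can be sketched in base R. This is only an illustration under stated assumptions, not the code behind safs_perturb: the helper name perturb_sketch is hypothetical, and the sketch assumes that perturbing a subset means flipping the in/out status of `number` randomly chosen predictors.

```r
# Hypothetical helper, not part of caret: flip the membership of
# `number` randomly chosen predictors in an integer-index subset.
perturb_sketch <- function(x, vars, number = 1) {
  # Convert the integer index vector to a 0/1 membership vector
  member <- rep(0L, vars)
  member[x] <- 1L
  # Pick the predictors whose status will change and flip them
  change <- sample.int(vars, size = number)
  member[change] <- 1L - member[change]
  # Convert back to the sorted integer-index encoding
  which(member == 1L)
}

set.seed(1)
current <- c(2L, 5L, 9L)
candidate <- perturb_sketch(current, vars = 10, number = 1)
candidate
```

With number = 1, the candidate subset differs from the current one by exactly one predictor, either added or removed.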
The prob function defines the acceptance probability at each iteration, given the old and new fitness (i.e. energy values). It assumes that smaller values are better. The default probability function computes the relative difference between the current and new fitness values and uses an exponential function to convert it to a probability:
prob = exp[(current - new)/current * iteration]
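As a rough illustration of the formula above, the calculation can be written out in base R. The helper name accept_prob_sketch is hypothetical and is not part of caret. The sketch assumes a positive current fitness value and caps the result at 1 so that an improved subset is always accepted; the cap is an assumption of this sketch rather than something stated above.

```r
# Hypothetical helper, not part of caret: acceptance probability
# for a candidate subset, assuming smaller fitness values are better.
accept_prob_sketch <- function(old, new, iteration = 1) {
  # Relative change in fitness; negative when the candidate is worse
  rel_diff <- (old - new) / old
  # Assumption of this sketch: cap at 1 so the result is a valid probability
  min(1, exp(rel_diff * iteration))
}

accept_prob_sketch(old = 0.8, new = 0.9, iteration = 1)   # exp(-0.125)
accept_prob_sketch(old = 0.8, new = 0.9, iteration = 10)  # exp(-1.25)
accept_prob_sketch(old = 0.8, new = 0.7, iteration = 1)   # 1 after capping
```

For a fixed loss in fitness, the probability shrinks as iteration grows, so worse subsets are accepted less often later in the search.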
Value

The return value depends on the function. Note that the SA code encodes each subset as a vector of the integer indices of the predictors included in it (which differs from the encoding used for GAs).
The objects caretSA, rfSA and treebagSA are example lists that can be used with the functions argument of safsControl.
In the case of caretSA, the ... structure of safs passes through to the model fitting routine. As a consequence, the train function can easily be accessed by passing important arguments belonging to train to safs. See the examples below. By default, using caretSA will use the resampled performance estimates produced by train as the internal estimate of fitness.
For rfSA and treebagSA, the randomForest and bagging functions are used directly (i.e. train is not used). Arguments to either of these functions can also be passed to them through the safs call (see examples below). For these two functions, the internal fitness is estimated using the out-of-bag estimates naturally produced by those functions. While faster, this limits the user to accuracy or Kappa (for classification) and RMSE and R-squared (for regression).
References
See Also
Examples
library(caret)
selected_vars <- safs_initial(vars = 10, prob = 0.2)
selected_vars
###
safs_perturb(selected_vars, vars = 10, number = 1)
###
safs_prob(old = .8, new = .9, iteration = 1)
safs_prob(old = .5, new = .6, iteration = 1)
grid <- expand.grid(old = c(4, 3.5),
                    new = c(4.5, 4, 3.5) + 1,
                    iter = 1:40)
grid <- subset(grid, old < new)
grid$prob <- apply(grid, 1,
                   function(x)
                     safs_prob(new = x["new"],
                               old = x["old"],
                               iteration = x["iter"]))
grid$Difference <- factor(grid$new - grid$old)
grid$Group <- factor(paste("Current Value", grid$old))
ggplot(grid, aes(x = iter, y = prob, color = Difference)) +
  geom_line() + facet_wrap(~Group) + theme_bw() +
  ylab("Probability") + xlab("Iteration")
## Not run:
# ###
# ## Hypothetical examples
# lda_sa <- safs(x = predictors,
#                y = classes,
#                safsControl = safsControl(functions = caretSA),
#                ## now pass arguments to `train`
#                method = "lda",
#                metric = "Accuracy",
#                trControl = trainControl(method = "cv", classProbs = TRUE))
#
# rf_sa <- safs(x = predictors,
#               y = classes,
#               safsControl = safsControl(functions = rfSA),
#               ## these are arguments to `randomForest`
#               ntree = 1000,
#               importance = TRUE)
# ## End(Not run)