# sbf

##### Selection By Filtering (SBF)

Model fitting after applying univariate filters

- Keywords
- models

##### Usage

`sbf(x, ...)`

## S3 method for class 'default':
sbf(x, y, sbfControl = sbfControl(), ...)

## S3 method for class 'formula':
sbf(form, data, ..., subset, na.action, contrasts = NULL)

## S3 method for class 'sbf':
predict(object, newdata = NULL, ...)

##### Arguments

- x
- a data frame containing training data where samples are in rows and features are in columns.
- y
- a numeric or factor vector containing the outcome for each sample.
- form
- a formula of the form `y ~ x1 + x2 + ...`
- data
- a data frame from which the variables specified in `formula` are preferentially taken.
- subset
- An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
- na.action
- A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is `na.omit`, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)
- contrasts
- a list of contrasts to be used for some or all of the factors appearing as variables in the model formula.
- sbfControl
- a list of values that define how this function acts. See `sbfControl`. (NOTE: If given, this argument must be named.)
- object
- an object of class `sbf`
- newdata
- a matrix or data frame of predictors. The object must have non-null column names
- ...
- for `sbf`: arguments passed to the classification or regression routine (such as `randomForest`). For `predict.sbf`: arguments cannot be passed to the prediction function.
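
As an illustration of the formula method and of the note that `sbfControl` must be named, here is a minimal sketch. It assumes the `BloodBrain` data used in the Examples section and the `lmSBF` functions mentioned under Details; the data frame `bbb` and the object name `lmFormFit` are hypothetical names chosen for this sketch.

```r
library(caret)

data(BloodBrain)

## Combine the predictors and the outcome into one data frame so that
## the formula method can find both (bbb is a hypothetical name)
bbb <- cbind(bbbDescr, logBBB = logBBB)

set.seed(1)

## sbfControl is passed by name, as the argument notes require
lmFormFit <- sbf(logBBB ~ ., data = bbb,
                 sbfControl = sbfControl(functions = lmSBF,
                                         method = "cv",
                                         verbose = FALSE))
lmFormFit
```

The seed is set only so that the cross-validation folds are reproducible.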

##### Details

This function can be used to get resampling estimates for models when simple, filter-based feature selection is applied to the training data.

For each iteration of resampling, the predictor variables are univariately filtered prior to modeling. Performance of this approach is estimated using resampling. The same filter and model are then applied to the entire training set and the final model (and final features) are saved.

The modeling and filtering techniques are specified in `sbfControl`. Example functions are given in `lmSBF`.
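
The resampling scheme described above can be sketched in base R. This is a conceptual illustration only, not caret's implementation: the simulated data, the correlation-test filter, the linear model, and the helper names `filter_vars` and `fit_filtered` are all assumptions made for this sketch.

```r
## Conceptual sketch only: base R, not caret's code
set.seed(1)
n <- 100
p <- 20
x <- as.data.frame(matrix(rnorm(n * p), nrow = n))  # columns V1..V20
y <- 2 * x$V1 - x$V2 + rnorm(n)                     # only V1 and V2 carry signal

## Hypothetical univariate filter: keep predictors whose correlation
## test against the outcome has a p-value below alpha
filter_vars <- function(x, y, alpha = 0.05) {
  pvals <- vapply(x, function(col) cor.test(col, y)$p.value, numeric(1))
  names(pvals)[pvals < alpha]
}

## Hypothetical model step: linear model on the retained predictors
fit_filtered <- function(x, y, vars) {
  lm(y ~ ., data = cbind(x[, vars, drop = FALSE], y = y))
}

## 5-fold cross-validation: the filter is re-run inside every fold,
## so the hold-out samples never influence which predictors are kept
folds <- sample(rep(1:5, length.out = n))
rmse <- numeric(5)
for (k in 1:5) {
  train <- folds != k
  vars <- filter_vars(x[train, ], y[train])
  fit <- fit_filtered(x[train, ], y[train], vars)
  pred <- predict(fit, newdata = x[!train, , drop = FALSE])
  rmse[k] <- sqrt(mean((y[!train] - pred)^2))
}
mean(rmse)  # resampled performance estimate

## The same filter and model are then applied to the entire training set
final_vars <- filter_vars(x, y)
final_fit <- fit_filtered(x, y, final_vars)
final_vars
```

Running the filter inside each fold, rather than once on the full data, is what keeps the performance estimate from being biased by the selection step.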

##### Value

- for `sbf`, an object of class `sbf` with elements:
  - pred: if `sbfControl$saveDetails` is `TRUE`, this is a list of predictions for the hold-out samples at each resampling iteration. Otherwise it is `NULL`.
  - variables: a list of variable names that survived the filter at each resampling iteration
  - results: a data frame of results aggregated over the resamples
  - fit: the final model fit with only the filtered variables
  - optVariables: the names of the variables that survived the filter using the training set
  - call: the function call
  - control: the control object
  - resample: if `sbfControl$returnResamp` is "all", a data frame of the resampled performance measures. Otherwise, `NULL`.
  - metrics: a character vector of names of the performance measures
  - dots: a list of optional arguments that were passed in
- for `predict.sbf`, a vector of predictions.
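
The elements listed above can be inspected directly on the fitted object. The following is a minimal sketch, assuming the `BloodBrain` data from the Examples section; `lmSBF` is used here only to keep the fit quick, and `lmFit` is a hypothetical object name.

```r
library(caret)

data(BloodBrain)
set.seed(1)

## Fit with the linear-model SBF functions (chosen for speed in this sketch)
lmFit <- sbf(bbbDescr, logBBB,
             sbfControl = sbfControl(functions = lmSBF,
                                     method = "cv",
                                     verbose = FALSE))

lmFit$optVariables  # predictors that survived the filter on the training set
lmFit$results       # performance aggregated over the resamples
class(lmFit$fit)    # class of the final model fit on the filtered predictors
```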

##### Examples

```
library(caret)

data(BloodBrain)

## Use a GAM as the filter, then fit a random forest model
RFwithGAM <- sbf(bbbDescr, logBBB,
                 sbfControl = sbfControl(functions = rfSBF,
                                         verbose = FALSE,
                                         method = "cv"))
RFwithGAM

## Predict the first ten training samples
predict(RFwithGAM, bbbDescr[1:10, ])
```

*Documentation reproduced from package caret, version 4.62, License: GPL-2*