Ensemble method for classification using the predictions of the NNS multivariate regression NNS.reg collected from uncorrelated feature combinations.
NNS.boost(
IVs.train,
DV.train,
IVs.test = NULL,
type = NULL,
representative.sample = FALSE,
depth = "max",
n.best = NULL,
learner.trials = 100,
epochs = NULL,
CV.size = 0.25,
balance = FALSE,
ts.test = NULL,
folds = 5,
threshold = NULL,
obj.fn = expression(sum((predicted - actual)^2)),
objective = "min",
extreme = FALSE,
feature.importance = TRUE,
status = TRUE,
ncores = NULL
)
IVs.train
a matrix or data frame of variables of numeric or factor data types.
DV.train
a numeric or factor vector with compatible dimensions to (IVs.train).
IVs.test
a matrix or data frame of variables of numeric or factor data types with compatible dimensions to (IVs.train). If NULL, will use (IVs.train) as default.
type
NULL (default). To perform a classification of discrete integer classes from a factor target variable (DV.train), set (type = "CLASS"); for a continuous (DV.train), leave (type = NULL).
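The example at the end of this page compares the returned results with as.numeric() applied to the factor target, which suggests that classes are handled as integer codes. As a small base R sketch of that mapping (the variable names dv and class_codes are illustrative and not part of NNS):

```r
# Illustrative only: how a factor target maps to integer class codes in base R.
# `dv` and `class_codes` are names chosen for this sketch.
dv <- iris[, 5]                 # factor with three species levels
class_codes <- as.numeric(dv)   # integer code of each observation's level

levels(dv)                      # level order defines the codes 1, 2, 3
table(dv, class_codes)          # cross-tabulation of labels against codes
```

The codes follow the order of levels(dv), so relabeling or reordering the factor levels changes which integer represents which class.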
representative.sample
logical; FALSE (default) If TRUE, reduces the observations of IVs.train to a set of representative observations per regressor.
depth
options: (integer, NULL, "max"); Specifies the order parameter in the NNS.reg routine, assigning a number of splits in the regressors. (depth = "max") (default) will be significantly faster but increases the variance of results, and is suggested for mixed continuous and discrete (unordered, ordered) data.
n.best
integer; NULL (default) Sets the number of nearest regression points to use in weighting for multivariate regression at sqrt(# of regressors). Analogous to k in a k Nearest Neighbors algorithm. If NULL, determines the optimal clusters via the NNS.stack procedure.
learner.trials
integer; 100 (default) Sets the number of trials used to obtain an accuracy threshold level.
epochs
integer; NULL (default) Total number of feature combinations to run. If NULL, 2*length(DV.train) is used.
CV.size
numeric [0, 1]; 0.25 (default) Sets the cross-validation size as a fraction of the training set. The default of 0.25 uses a 25 percent random sample of the training set.
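To make the 25 percent random sampling concrete, the following base R sketch draws a validation subset from a training set. It is not the package's internal code: the names cv_size and holdout_idx and the use of floor() for rounding are assumptions made for this illustration.

```r
# Illustrative only: a 25 percent random holdout from a training set.
# `cv_size`, `holdout_idx`, and floor() rounding are assumptions of this
# sketch, not the internals of NNS.boost.
set.seed(123)

ivs_train <- iris[1:140, 1:4]
dv_train  <- iris[1:140, 5]
cv_size   <- 0.25

n           <- nrow(ivs_train)
holdout_idx <- sample(n, size = floor(cv_size * n))

# Rows held out for validation, and the remainder used for fitting
cv_ivs  <- ivs_train[holdout_idx, ]
cv_dv   <- dv_train[holdout_idx]
fit_ivs <- ivs_train[-holdout_idx, ]
fit_dv  <- dv_train[-holdout_idx]

c(validation = nrow(cv_ivs), fitting = nrow(fit_ivs))
```

With 140 training rows this holds out 35 rows and leaves 105 for fitting; a larger value trades fitting data for a more stable validation estimate.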
balance
logical; FALSE (default) If TRUE, uses both up and down sampling from caret to balance the classes. Requires (type = "CLASS").
ts.test
integer; NULL (default) Sets the length of the test set for time-series data; typically 2*h, where h is the parameter value from NNS.ARMA, or double the known periods to forecast.
folds
integer; 5 (default) Sets the number of folds in the NNS.stack procedure for the optimal n.best parameter.
threshold
numeric; NULL (default) Sets the obj.fn threshold to keep feature combinations.
obj.fn
expression; expression( sum((predicted - actual)^2) ) (default) Sum of squared errors is the default objective function. Any expression() using the specific terms predicted and actual can be used. An accuracy measure is selected automatically when (type = "CLASS").
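As a minimal base R sketch of how an objective written in terms of predicted and actual evaluates, the block below scores a small made-up pair of vectors by hand. The vectors and the names sse_fn and mae_fn are illustrative; how NNS.boost evaluates the expression internally is not shown on this page.

```r
# Illustrative only: evaluating obj.fn-style expressions by hand.
# The data and the names `sse_fn` / `mae_fn` are made up for this sketch.
predicted <- c(1.0, 2.5, 2.0, 4.0)
actual    <- c(1.0, 2.0, 3.0, 4.0)

sse_fn <- expression(sum((predicted - actual)^2))    # the documented default
mae_fn <- expression(mean(abs(predicted - actual)))  # a custom alternative

# eval() looks up `predicted` and `actual` in the calling environment
sse <- eval(sse_fn)   # 0 + 0.25 + 1 + 0 = 1.25
mae <- eval(mae_fn)   # (0 + 0.5 + 1 + 0) / 4 = 0.375
```

An error measure such as either of these would be paired with objective = "min", whereas an accuracy-style expression such as expression(mean(predicted == actual)) would be paired with objective = "max".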
objective
options: ("min", "max") "min" (default) Select whether to minimize or maximize the objective function obj.fn.
extreme
logical; FALSE (default) If TRUE, uses the maximum (minimum) threshold obtained from the learner.trials, rather than the upper (lower) quintile level, for a maximization (minimization) objective.
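The difference between the extreme and quintile thresholds can be sketched in base R as follows. This is an illustration under stated assumptions: trial_scores stands in for hypothetical objective values from the learner trials, pick_threshold is a hypothetical helper, and reading the upper and lower quintile levels as the 0.8 and 0.2 quantiles is an interpretation rather than the package's confirmed computation.

```r
# Illustrative only: choosing a threshold from learner-trial scores.
# `trial_scores` and `pick_threshold` are hypothetical; the 0.8 / 0.2
# quantile levels are an assumed reading of "upper (lower) quintile".
set.seed(42)
trial_scores <- runif(100)   # stand-in for 100 obj.fn values

pick_threshold <- function(scores, objective = c("min", "max"), extreme = FALSE) {
  objective <- match.arg(objective)
  if (objective == "max") {
    if (extreme) max(scores) else unname(quantile(scores, 0.8))
  } else {
    if (extreme) min(scores) else unname(quantile(scores, 0.2))
  }
}

pick_threshold(trial_scores, "max")                  # upper quintile level
pick_threshold(trial_scores, "max", extreme = TRUE)  # best trial observed
pick_threshold(trial_scores, "min")                  # lower quintile level
pick_threshold(trial_scores, "min", extreme = TRUE)  # lowest trial observed
```

Under this reading, the extreme setting is the stricter bar, so fewer feature combinations would be expected to pass it.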
feature.importance
logical; TRUE (default) Plots the frequency of features used in the final estimate.
status
logical; TRUE (default) Prints status update messages in the console.
ncores
integer; NULL (default) Number of cores to use in the parallelized procedure. If NULL, the number of cores used equals the number of cores of the machine minus 1.
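The "cores of the machine minus 1" rule can be sketched with the base parallel package. The helper resolve_cores is hypothetical and only mirrors the documented behavior; the fallback for an undetectable core count and the floor of one core are assumptions added for robustness.

```r
# Illustrative only: resolving a NULL core count to (machine cores - 1).
# `resolve_cores` is a hypothetical helper, not an NNS function.
library(parallel)

resolve_cores <- function(ncores = NULL) {
  if (is.null(ncores)) {
    detected <- parallel::detectCores()
    if (is.na(detected)) detected <- 2L   # detectCores() may return NA
    ncores <- detected - 1L
  }
  max(1L, as.integer(ncores))             # never drop below one core
}

resolve_cores()    # machine-dependent
resolve_cores(4)   # an explicit request is passed through
```

Leaving one core free keeps the machine responsive during long runs; on shared systems an explicit smaller value may be preferable.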
Returns a vector of fitted values for the dependent variable test set, $results, and the final feature loadings, $feature.weights.
Viole, F. (2016) "Classification Using NNS Clustering Analysis" https://www.ssrn.com/abstract=2864711
# NOT RUN {
## Using 'iris' dataset where test set [IVs.test] is 'iris' rows 141:150.
library(NNS)

a <- NNS.boost(iris[1:140, 1:4], iris[1:140, 5],
               IVs.test = iris[141:150, 1:4],
               epochs = 100, learner.trials = 100,
               type = "CLASS")

## Test accuracy: share of predicted classes matching the integer class codes
mean(a$results == as.numeric(iris[141:150, 5]))
# }