NNS.boost: NNS Boost

Description

Ensemble method for classification using the predictions of the NNS multivariate regression NNS.reg collected from uncorrelated feature combinations.

Usage

NNS.boost(
  IVs.train,
  DV.train,
  IVs.test = NULL,
  type = NULL,
  inference = FALSE,
  depth = NULL,
  learner.trials = 100,
  epochs = NULL,
  CV.size = 0.25,
  balance = FALSE,
  ts.test = NULL,
  folds = 5,
  threshold = NULL,
  obj.fn = expression(sum((predicted - actual)^2)),
  objective = "min",
  extreme = FALSE,
  features.only = FALSE,
  feature.importance = TRUE,
  status = TRUE
)

Arguments

IVs.train

a matrix or data frame of variables of numeric or factor data types.

DV.train

a numeric or factor vector with compatible dimensions to (IVs.train).

IVs.test

a matrix or data frame of variables of numeric or factor data types with compatible dimensions to (IVs.train). If NULL, will use (IVs.train) as default.

type

NULL (default). To classify discrete integer classes from a factor target variable (DV.train) with a base category of 1, set (type = "CLASS"); for a continuous (DV.train), leave (type = NULL).
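As a small illustration of integer classes with a base category of 1, base R's as.integer() maps the levels of a factor to 1, 2, 3, ... in level order. This is only a sketch of the coding on the built-in iris data; how NNS.boost converts factors internally is not shown here and should not be inferred from it.

```r
# Factor target taken from the built-in 'iris' data
dv <- iris$Species
levels(dv)                       # the three species names, in level order

# Integer class codes; the first level becomes 1
class.codes <- as.integer(dv)

# Cross-tabulate labels against codes to confirm the mapping
table(dv, class.codes)
```

The same coding is what makes a comparison such as a$results == as.numeric(iris[141:150, 5]) meaningful in the example at the end of this page.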

inference

logical; FALSE (default) Set to TRUE for inferential tasks; inference = FALSE is faster for predictive tasks.

depth

options: (integer, NULL, "max"); (depth = NULL)(default) Specifies the order parameter in the NNS.reg routine, assigning a number of splits in the regressors, analogous to tree depth.

learner.trials

integer; 100 (default) Sets the number of trials to obtain an accuracy threshold level. If the number of all possible feature combinations is less than selected value, the minimum of the two values will be used.
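The cap on learner.trials can be sketched as follows. The count below assumes that "all possible feature combinations" means every non-empty subset of the columns of IVs.train; that interpretation, and the variable names n.features, n.combinations and trials.used, are assumptions made for illustration and are not taken from the NNS source.

```r
# Hypothetical illustration; not taken from the NNS source code.
n.features <- ncol(iris[, 1:4])

# Number of non-empty feature subsets: choose(4,1) + ... + choose(4,4) = 2^4 - 1
n.combinations <- sum(choose(n.features, 1:n.features))

learner.trials <- 100

# The smaller of the requested trials and the available combinations
trials.used <- min(learner.trials, n.combinations)
trials.used
```

Under that assumption, a four-feature data set such as iris would exhaust its combinations well before 100 trials.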

epochs

integer; NULL (default), in which case 2*length(DV.train) is used. Total number of feature combinations to run.

CV.size

numeric [0, 1]; (CV.size = .25) (default) Sets the cross-validation size. Defaults to 0.25 for a 25 percent random sampling of the training set.
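A 25 percent random sampling of the training set can be pictured with base R as below. This is a minimal sketch of the idea only: the index names holdout.idx and fit.idx are hypothetical, and the sampling scheme NNS.boost actually uses may differ.

```r
set.seed(123)                      # for a reproducible split

n.train <- 140                     # e.g. iris rows 1:140
CV.size <- 0.25

# Randomly hold out CV.size of the training rows for validation
holdout.idx <- sample(n.train, size = as.integer(CV.size * n.train))

# The remaining rows are used for fitting
fit.idx <- setdiff(seq_len(n.train), holdout.idx)

length(holdout.idx)
length(fit.idx)
```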

balance

logical; FALSE (default) Uses both up and down sampling from caret to balance the classes. type="CLASS" required.

ts.test

integer; NULL (default) Sets the length of the test set for time-series data; typically twice the h parameter value from NNS.ARMA, or double the known number of periods to forecast.

folds

integer; 5 (default) Sets the number of folds in the NNS.stack procedure for optimal n.best parameter.

threshold

numeric; NULL (default) Sets the obj.fn threshold to keep feature combinations.

obj.fn

expression; expression( sum((predicted - actual)^2) ) (default) Sum of squared errors is the default objective function. Any expression() using the specific terms predicted and actual can be used. Automatically selects an accuracy measure when (type = "CLASS").
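To show what an objective written with the terms predicted and actual computes, the sketch below evaluates two such expressions with base R's eval() on made-up vectors. The toy numbers and the name mae.fn are illustrative assumptions; how NNS.boost evaluates obj.fn internally is not documented here.

```r
# Toy vectors standing in for model output and observed values
predicted <- c(1.1, 1.9, 3.2)
actual    <- c(1.0, 2.0, 3.0)

# The default objective: sum of squared errors
obj.fn <- expression(sum((predicted - actual)^2))
sse <- eval(obj.fn)

# An alternative objective using the same two terms: mean absolute error
mae.fn <- expression(mean(abs(predicted - actual)))
mae <- eval(mae.fn)

c(sse = sse, mae = mae)
```

An error measure like either of these would be paired with objective = "min", while a measure where larger is better would be paired with objective = "max".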

objective

options: ("min", "max") "min" (default) Select whether to minimize or maximize the objective function obj.fn.

extreme

logical; FALSE (default) Uses the maximum (minimum) threshold obtained from the learner.trials, rather than the upper (lower) quintile level for maximization (minimization) objective.
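The difference between the quintile and extreme thresholds can be sketched with base R. The helper pick.threshold is hypothetical, and it assumes the upper and lower quintiles are the 80th and 20th percentiles as computed by quantile() with its default method; NNS.boost may compute them differently.

```r
# Hypothetical helper; not part of the NNS package.
pick.threshold <- function(scores, objective = c("min", "max"), extreme = FALSE) {
  objective <- match.arg(objective)
  if (objective == "max") {
    # Maximization: best score seen, or the upper quintile
    if (extreme) max(scores) else unname(quantile(scores, 0.8))
  } else {
    # Minimization: best (lowest) score seen, or the lower quintile
    if (extreme) min(scores) else unname(quantile(scores, 0.2))
  }
}

# Made-up objective values from a set of learner trials
scores <- 1:11

pick.threshold(scores, "max")                   # upper quintile
pick.threshold(scores, "max", extreme = TRUE)   # maximum
pick.threshold(scores, "min")                   # lower quintile
pick.threshold(scores, "min", extreme = TRUE)   # minimum
```

Setting extreme = TRUE therefore makes the cutoff stricter, so fewer feature combinations would be expected to pass it.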

features.only

logical; FALSE (default) Returns only the final feature loadings along with the final feature frequencies.

feature.importance

logical; TRUE (default) Plots the frequency of features used in the final estimate.

status

logical; TRUE (default) Prints status update message in console.

Value

Returns a vector of fitted values for the dependent variable test set $results, and the final feature loadings $feature.weights, along with final feature frequencies $feature.frequency.

References

Viole, F. (2016) "Classification Using NNS Clustering Analysis" https://www.ssrn.com/abstract=2864711

Examples

# NOT RUN {
 ## Using 'iris' dataset where test set [IVs.test] is 'iris' rows 141:150.
 a <- NNS.boost(iris[1:140, 1:4], iris[1:140, 5],
                IVs.test = iris[141:150, 1:4],
                epochs = 100, learner.trials = 100,
                type = "CLASS", depth = NULL)

 ## Test accuracy
 mean(a$results == as.numeric(iris[141:150, 5]))
# }