NNS.stack: NNS Stack

Description

Prediction model that uses the predictions of the NNS base models (NNS.reg) as features (i.e. meta-features) for the stacked model.

Usage

NNS.stack(
  IVs.train,
  DV.train,
  IVs.test = NULL,
  type = NULL,
  obj.fn = expression(sum((predicted - actual)^2)),
  objective = "min",
  dist = "L2",
  CV.size = NULL,
  balance = FALSE,
  ts.test = NULL,
  folds = 5,
  order = NULL,
  norm = NULL,
  method = c(1, 2),
  stack = TRUE,
  dim.red.method = "cor",
  status = TRUE,
  ncores = NULL
)

Arguments

IVs.train

a vector, matrix or data frame of variables of numeric or factor data types.

DV.train

a numeric or factor vector with compatible dimensions to (IVs.train).

IVs.test

a vector, matrix or data frame of variables of numeric or factor data types with compatible dimensions to (IVs.train). If NULL, will use (IVs.train) as default.

type

NULL (default). To classify discrete integer classes from a factor target variable (DV.train), set (type = "CLASS"); for a continuous (DV.train), leave (type = NULL). As with a logistic regression, this setting is not necessary for a target variable of two classes, e.g. [0, 1].

obj.fn

expression; expression(sum((predicted - actual)^2)) (default) Sum of squared errors is the default objective function. Any expression() using the specific terms predicted and actual can be used.
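
As a minimal sketch of how an obj.fn expression can be evaluated, the base R snippet below binds two illustrative vectors to the names predicted and actual. The vectors and the mean-absolute-error alternative are hypothetical, and the snippet does not show how NNS.stack evaluates the expression internally.

```r
# Illustrative vectors (hypothetical values, not NNS output)
predicted <- c(2.5, 0.0, 2.1, 7.8)
actual    <- c(3.0, -0.5, 2.0, 7.0)

# Default objective from the Usage section: sum of squared errors
sse_fn <- expression(sum((predicted - actual)^2))
sse <- eval(sse_fn, list(predicted = predicted, actual = actual))

# An alternative objective using the same two terms: mean absolute error
mae_fn <- expression(mean(abs(predicted - actual)))
mae <- eval(mae_fn, list(predicted = predicted, actual = actual))
```

An expression such as mae_fn would then be supplied as (obj.fn = mae_fn) together with (objective = "min").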

objective

options: ("min", "max") "min" (default) Select whether to minimize or maximize the objective function obj.fn.

dist

options: ("L1", "L2", "DTW", "FACTOR"); selects the distance calculation used. (dist = "L2") (default) selects the Euclidean distance; (dist = "L1") selects the Manhattan distance; (dist = "DTW") selects the dynamic time warping distance; (dist = "FACTOR") uses a frequency.
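
To make the difference between the "L1" and "L2" options concrete, the base R sketch below computes both distances between two hypothetical observations. It illustrates the distance definitions only and is not taken from the NNS source.

```r
# Two hypothetical observations with four numeric features each
a <- c(5.1, 3.5, 1.4, 0.2)
b <- c(6.2, 2.9, 4.3, 1.3)

# L2 (Euclidean): square root of the sum of squared differences
l2 <- sqrt(sum((a - b)^2))

# L1 (Manhattan): sum of absolute differences
l1 <- sum(abs(a - b))

# Base R's dist() gives the same values
m <- rbind(a, b)
l2_check <- as.numeric(dist(m, method = "euclidean"))
l1_check <- as.numeric(dist(m, method = "manhattan"))
```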

CV.size

numeric [0, 1]; NULL (default) Sets the cross-validation size if (IVs.test = NULL). Defaults to 0.25 for a 25 percent random sampling of the training set under (CV.size = NULL).
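
A 25 percent random holdout of the training set can be sketched in base R as follows. The variable names, the seed, and the rounding of the sample size are assumptions of this illustration; NNS.stack may sample differently internally.

```r
set.seed(123)  # for reproducibility of the illustration only

cv_size <- 0.25      # the documented default fraction
n <- nrow(iris)

# Randomly select roughly 25 percent of the training rows as a validation set
test_idx <- sample(n, size = round(cv_size * n))

cv_train <- iris[-test_idx, ]
cv_test  <- iris[test_idx, ]
```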

balance

logical; FALSE (default) Uses both up and down sampling from caret to balance the classes. Requires (type = "CLASS").

ts.test

integer; NULL (default) Sets the length of the test set for time-series data; typically twice the h parameter value from NNS.ARMA, or double the known periods to forecast.

folds

integer; folds = 5 (default) Select the number of cross-validation folds.

order

options: (integer, "max", NULL); NULL (default) Sets the order for NNS.reg, where (order = "max") is the k-nearest neighbors equivalent, which is suggested for mixed continuous and discrete (unordered, ordered) data.

norm

options: ("std", "NNS", NULL); NULL (default) Selects the norm parameter in NNS.reg.

method

numeric options: (1, 2); Select the NNS method to include in stack. (method = 1) selects NNS.reg; (method = 2) selects NNS.reg dimension reduction regression. Defaults to method = c(1, 2), which will reduce the dimension first, then find the optimal n.best.

stack

logical; TRUE (default) Uses dimension reduction output in n.best optimization, otherwise performs both analyses independently.

dim.red.method

options: ("cor", "NNS.dep", "NNS.caus", "all") method for determining synthetic X* coefficients. (dim.red.method = "cor") (default) uses standard linear correlation for weights. (dim.red.method = "NNS.dep") uses NNS.dep for nonlinear dependence weights, while (dim.red.method = "NNS.caus") uses NNS.caus for causal weights. (dim.red.method = "all") averages all methods for further feature engineering.
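
One plausible reading of the (dim.red.method = "cor") option is sketched below in base R: each regressor is weighted by its linear correlation with the target, and the weighted columns are combined into a single synthetic regressor. The integer coding of the factor target and the division by the number of regressors are assumptions of this sketch, not confirmed details of NNS.reg.

```r
X <- as.matrix(iris[, 1:4])
y <- as.numeric(iris[, 5])   # factor levels coded as integers 1, 2, 3

# Linear correlation of each regressor with the target, used as weights
w <- as.numeric(cor(X, y))
names(w) <- colnames(X)

# Synthetic regressor: weighted sum of the columns, averaged over the
# number of regressors (the scaling is an assumption of this sketch)
x_star <- as.numeric(X %*% w) / length(w)
```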

status

logical; TRUE (default) Prints status update message in console.

ncores

integer; the number of cores to be used in the parallelized subroutine NNS.reg. If NULL (default), the number of cores used is one less than the number of cores on the machine.
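
The documented default can be reproduced with the parallel package, as in the sketch below. The guard that keeps the result at one or more cores is an addition of this sketch and is not stated in the documentation.

```r
library(parallel)

# Number of cores detected on this machine (may be NA on some platforms)
available <- detectCores()

# Leave one core free, but never go below one
ncores <- if (is.na(available)) 1L else max(1L, available - 1L)
```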

Value

Returns a vector of fitted values for the dependent variable test set for all models.

"NNS.reg.n.best" returns the optimum "n.best" parameter for the NNS.reg multivariate regression.
"SSE.reg" returns the SSE for the NNS.reg multivariate regression.
"OBJfn.reg" returns the obj.fn for the NNS.reg regression.
"NNS.dim.red.threshold" returns the optimum "threshold" from the NNS.reg dimension reduction regression.
"OBJfn.dim.red" returns the obj.fn for the NNS.reg dimension reduction regression.
"reg" returns NNS.reg output.
"dim.red" returns NNS.reg dimension reduction regression output.
"stack" returns the output of the stacked model.

References

Viole, F. (2016) "Classification Using NNS Clustering Analysis" https://www.ssrn.com/abstract=2864711

Examples

# NOT RUN {
 ## Using 'iris' dataset where test set [IVs.test] is 'iris' rows 141:150.
 NNS.stack(iris[1:140, 1:4], iris[1:140, 5], IVs.test = iris[141:150, 1:4], type = "CLASS")

 ## Using 'iris' dataset to determine [n.best] and [threshold] with no test set.
 NNS.stack(iris[ , 1:4], iris[ , 5], type = "CLASS")

 ## Selecting NNS.reg and dimension reduction techniques.
 NNS.stack(iris[1:140, 1:4], iris[1:140, 5], iris[141:150, 1:4], method = c(1, 2), type = "CLASS")
# }
