Create an object of class big which is needed to perform the selection
procedure.
prepare_data(y, X, type = "linear", candidates = NULL, Xadd = NULL,
na = NULL, maxp = 1e+06, verbose = TRUE)
y: a numeric vector of the dependent (target) variable.
X: a numeric matrix or an object of class big.matrix. The
columns of X should contain the independent variables (predictors).
type: a string, the type of regression model you want to fit. Use
one of: "linear", "logistic", "poisson".
candidates: a numeric vector, the columns of X which will be used
in the selection procedure. The order is important. If NULL, every
column will be used.
Xadd: a numeric matrix, additional variables which will always be
included in the model selection procedure (they will not be removed in
any step). If NULL, Xadd will contain only a column of ones (the
intercept). If you specify Xadd, a column of ones will be added
automatically (the intercept cannot be excluded).
na: a logical. Are there any missing values in X? If
NULL, this will be checked (which can take some time if X is big,
so it is reasonable to set it).
maxp: a numeric. The matrix X will be split into parts with
maxp elements. This does not change the results, but it is necessary if
your computer does not have enough RAM. Set it to a lower value if you
still have problems.
verbose: a logical. Set to FALSE if you do not want to see any
information during the selection procedure.
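As a rough illustration of two of the arguments above, the following base R sketch shows what adding the intercept column to Xadd and splitting X into parts of at most maxp elements could look like. It does not reproduce the package's internal code: splitting by whole columns is an assumption, and the helpers add_intercept and split_columns are hypothetical names invented for this example.

```r
# Hypothetical helper: prepend a column of ones (the intercept) to Xadd.
add_intercept <- function(Xadd = NULL, n) {
  ones <- matrix(1, nrow = n, ncol = 1)
  if (is.null(Xadd)) ones else cbind(ones, Xadd)
}

# Hypothetical helper: group column indices so that each part of X
# holds at most maxp elements (assumes X is split by whole columns).
split_columns <- function(n, p, maxp = 1e6) {
  cols_per_part <- max(1, floor(maxp / n))
  split(seq_len(p), ceiling(seq_len(p) / cols_per_part))
}

n <- 5
Xadd <- matrix(rnorm(n * 2), ncol = 2)
dim(add_intercept(NULL, n))  # intercept column only
dim(add_intercept(Xadd, n))  # intercept plus the two extra columns

# With 5 rows and maxp = 10, each part can hold two columns.
parts <- split_columns(n = 5, p = 4, maxp = 10)
parts
```

Lowering maxp in this sketch simply produces more, smaller parts; the set of columns covered stays the same, which matches the statement that maxp does not change the results.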
An object of class big.
The function automatically removes observations which have missing
values in y. Type browseVignettes("bigstep") for more
details.
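The removal of observations with missing values in y can be pictured with a short base R sketch. This is only an illustration, not the package's own code; it assumes that the matching rows of X are dropped together with the missing entries of y, and the names keep, y_clean and X_clean are made up for the example.

```r
# Toy data with one missing response value (illustrative only).
y <- c(1.2, NA, 0.7, 3.1, 2.4)
X <- matrix(seq_len(20), ncol = 4)

# Keep only the observations (rows) for which y is observed.
keep <- !is.na(y)
y_clean <- y[keep]
X_clean <- X[keep, , drop = FALSE]

length(y_clean)  # number of remaining observations
dim(X_clean)     # rows of X kept in step with y
```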
library(bigstep)

# Simulate 5 observations of 4 predictors; y depends on the second column.
X <- matrix(rnorm(20), ncol = 4)
y <- X[, 2] + rnorm(5)
data <- prepare_data(y, X)
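A slightly fuller call, using more of the arguments from the usage line above, might look like the sketch below. The simulated data, the sample size and the chosen candidate columns are arbitrary assumptions for illustration, and the call is wrapped in a requireNamespace check so that the snippet is skipped when bigstep is not installed.

```r
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), ncol = 4)   # four candidate predictors
Xadd <- matrix(rnorm(n), ncol = 1)    # one variable that is always kept
y <- X[, 2] + Xadd[, 1] + rnorm(n)

if (requireNamespace("bigstep", quietly = TRUE)) {
  # Consider only columns 2, 1 and 3 of X, in that order, and state
  # up front that X has no missing values to skip the check.
  data <- bigstep::prepare_data(y, X, type = "linear",
                                candidates = c(2, 1, 3), Xadd = Xadd,
                                na = FALSE, verbose = FALSE)
  class(data)
}
```

Setting na explicitly is mostly useful for large X, where the automatic check for missing values can take some time.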