Create an object of class big, which is needed to perform the selection procedure.
prepare_data(y, X, type = "linear", candidates = NULL, Xadd = NULL,
na = NULL, maxp = 1e+06, verbose = TRUE)
y: a numeric vector containing the dependent (target) variable.
X: a numeric matrix or an object of class big.matrix. The columns of X should contain the independent variables (predictors).
type: a string giving the type of regression model you want to fit. Use one of "linear", "logistic", or "poisson".
candidates: a numeric vector specifying the columns of X that will be used in the selection procedure. The order is important. If NULL, every column will be used.
Xadd: a numeric matrix of additional variables that are always included in the model selection procedure (they are not removed at any step). If NULL, Xadd will contain only a column of ones (the intercept). If you specify Xadd, a column of ones is added automatically (the intercept cannot be excluded).
na: a logical. Are there any missing values in X? If NULL, this is checked automatically (which can take some time if X is big, so it is reasonable to set it yourself).
maxp: a numeric. The matrix X will be split into parts of maxp elements. This does not change the results, but it is necessary if your computer does not have enough RAM. Set it to a lower value if you still have problems.
verbose: a logical. Set to FALSE if you do not want to see any information during the selection procedure.
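As a rough sketch of how the na and verbose arguments might be set in practice: X can be checked for missing values once with base R's anyNA(), and the result passed to prepare_data() so that the function does not have to scan X itself. The variable names has_na and d are illustrative only and are not part of the package.

```r
library(bigstep)

set.seed(1)
X <- matrix(rnorm(200), ncol = 4)   # 50 observations, 4 predictors
y <- X[, 2] + rnorm(50)

# Check for missing values once, up front (base R).
has_na <- anyNA(X)

# Pass the known answer via `na` and suppress progress messages.
d <- prepare_data(y, X, na = has_na, verbose = FALSE)
```

For a very large X, doing this check once and reusing the result avoids repeating the scan on every call.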
An object of class big.
The function automatically removes observations that have missing values in y. Run browseVignettes("bigstep") for more details.
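To illustrate what removing observations with a missing y means, here is a minimal base R sketch of the equivalent manual step. It is not the package's internal code, and the names keep, y_clean and X_clean are hypothetical.

```r
set.seed(1)
X <- matrix(rnorm(20), ncol = 4)   # 5 observations, 4 predictors
y <- X[, 2] + rnorm(5)
y[3] <- NA                         # make one response missing

# Keep only the rows where the response is observed.
keep <- !is.na(y)
y_clean <- y[keep]
X_clean <- X[keep, , drop = FALSE]

length(y_clean)   # 4
nrow(X_clean)     # 4
```

The matching rows of X are dropped together with the missing entries of y, so the two objects stay aligned.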
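# Simulate a small data set: 5 observations, 4 predictors
X <- matrix(rnorm(20), ncol = 4)
# The response depends on the second column of X plus noise
y <- X[, 2] + rnorm(5)
# Build the object of class big used by the selection procedure
data <- prepare_data(y, X)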
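A further hedged sketch, combining the type, candidates and Xadd arguments described above. It assumes that candidates accepts column indices of X. The covariate age and the object name d are made up for illustration.

```r
library(bigstep)

set.seed(1)
n <- 100
X <- matrix(rnorm(n * 10), ncol = 10)      # 10 candidate predictors
age <- matrix(rnorm(n), ncol = 1)          # hypothetical covariate to always keep

# Binary response for a logistic model.
y <- as.numeric(rbinom(n, 1, plogis(X[, 2] + age[, 1])))

# Only columns 2, 5 and 7 of X are considered, in that order;
# `age` (plus the automatic intercept) stays in every model.
d <- prepare_data(y, X, type = "logistic",
                  candidates = c(2, 5, 7),
                  Xadd = age, verbose = FALSE)
class(d)
```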