Usage
polymars(responses, predictors, maxsize, gcv = 4, additive = FALSE,
startmodel, weights, no.interact, knots, knot.space = 3, ts.resp,
ts.pred, ts.weights, classify, factors, tolerance, verbose = FALSE)
Arguments
responses
vector of responses, or a matrix for multiple response regression.
In the case of a matrix each column corresponds to a response and each
row corresponds to an observation. Missing values are not allowed.
predictors
matrix of predictor variables for the regression. Each column corresponds to a
predictor and each row corresponds to an observation in the same order as
they appear in the response argument. Missing values are not allowed.
maxsize
the maximum number of basis functions that the model is allowed to grow to in
the stepwise addition procedure. Default is
$\min(6 n^{1/3}, n/4, 100)$, where n is the number of observations.
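As a quick check of the default, the formula can be evaluated directly in base R. This is only a sketch: `default_maxsize` is a hypothetical helper, not part of the package, and whether polymars rounds the result internally is not stated here.

```r
# Hypothetical helper: evaluates the documented default min(6 * n^(1/3), n/4, 100).
default_maxsize <- function(n) {
  min(6 * n^(1/3), n / 4, 100)
}

default_maxsize(40)     # small sample: n/4 is the smallest term
default_maxsize(1000)   # moderate sample: 6 * n^(1/3) is the smallest term
default_maxsize(10000)  # large sample: the cap of 100 applies
```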
gcv
parameter used to find the overall best model from a sequence of fitted models.
The residual sum of squares of a model is penalized by dividing by the square of
1 - (gcv x model size)/cases. A larger gcv value would tend to produce a smaller model.
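The penalty described for gcv can be illustrated with a few lines of base R. This is a minimal sketch of the stated formula only; `gcv_score` and its argument names are hypothetical and are not taken from the package source.

```r
# Hypothetical helper: penalized residual sum of squares as described above,
# rss / (1 - (gcv * model_size) / cases)^2.
gcv_score <- function(rss, model_size, cases, gcv = 4) {
  denom <- 1 - (gcv * model_size) / cases
  rss / denom^2
}

# Two made-up candidate models fitted to 100 cases:
small <- gcv_score(rss = 50, model_size = 5,  cases = 100)  # 50 / 0.8^2
large <- gcv_score(rss = 45, model_size = 10, cases = 100)  # 45 / 0.6^2

# The larger model has the lower raw RSS but the higher penalized score.
small < large
```

With these illustrative numbers the smaller model is preferred even though its raw residual sum of squares is higher, which is the trade-off the gcv parameter controls.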
additive
Should the fitted model be additive in the predictors?
startmodel
the first model that is to be fit by polymars. It is either an
object of the class polymars or a model dreamed up by the user. In that case,
it takes the form of a 4 x n matrix, where n is the numbe
weights
optional vector of observation weights; if supplied, the algorithm fits to minimize the
sum of the weights multiplied by the squared residuals. The length of
weights must be the same as the number of observations. The weights must
be nonnegative.
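The weighted criterion can be written out in base R as follows. This is a sketch of the quantity being minimized, not of the fitting algorithm; `weighted_rss` is a hypothetical helper and the data are made up.

```r
# Hypothetical helper: sum of the weights multiplied by the squared residuals.
weighted_rss <- function(y, fitted, w) {
  # Mirror the documented requirements on the weights.
  stopifnot(length(w) == length(y), all(w >= 0))
  sum(w * (y - fitted)^2)
}

y      <- c(1, 2, 4)
fitted <- c(1.5, 2, 3)
w      <- c(2, 1, 0.5)

weighted_rss(y, fitted, w)  # 2 * 0.25 + 1 * 0 + 0.5 * 1
```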
no.interact
an optional matrix used if certain predictor interactions are not allowed in the model.
It is given as a matrix of size 2 x m, with predictor indices as entries. The two
predictors of any row cannot have interaction terms with each other.
knots
defines how the function is to find potential knots for the spline basis
functions. This can be set to the maximum number of knots you would
like to be considered for each predictor.
Usually, to avoid the design matrix becoming singular the actual num
knot.space
an integer giving the minimum number of order statistics by which two knots
must be separated. Knots should not be too close together, to ensure numerical stability.
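To make "order statistics apart" concrete, the base R sketch below picks candidate knot locations from the sorted values of one predictor so that consecutive candidates are knot.space positions apart. It illustrates the spacing constraint only and assumes a simple every-k-th rule; the way polymars actually places its knots is not described here.

```r
# Made-up predictor values.
x <- c(0.3, 1.2, 1.9, 2.4, 3.1, 3.8, 4.6, 5.0, 5.7, 6.5)
knot.space <- 3

# Order statistics of the predictor.
x_sorted <- sort(x)

# Assumed rule for illustration: take every knot.space-th order statistic,
# so any two candidates are at least knot.space order statistics apart.
idx <- seq(from = knot.space, to = length(x_sorted), by = knot.space)
candidate_knots <- x_sorted[idx]

candidate_knots
```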
ts.resp
testset responses for model selection. Should have the same number of columns
as the training set response. A testset can be used for the model selection.
Depending on the value of classify, either the model with the smallest testset
residual sum of sq
ts.pred
testset predictors. Should have the same number of columns
as the training set predictors.
ts.weights
testset observation weights. A vector of length equal to the number of cases
of the testset. All weights must be non-negative.
classify
when the response is discrete (categorical), polymars can be used for
classification. In particular, when classify = TRUE, a discrete response
with K levels is replaced by K indicator variables as response. Model
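The replacement of a K-level response by K indicator variables can be sketched in base R. This shows the encoding idea only, with made-up data and hypothetical variable names; it is not the package's internal code.

```r
# Made-up discrete response with K = 3 levels.
resp <- c("a", "c", "b", "a", "c")
levels_k <- sort(unique(resp))

# One 0/1 indicator column per level; each row has exactly one 1.
indicators <- sapply(levels_k, function(l) as.integer(resp == l))

indicators
```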
factors
used to indicate that certain variables in the predictor set are categorical
variables. Specified as a vector containing the appropriate predictor
indices (column numbers of categorical variables in predictors matrix). Factors
can also be set when t
tolerance
for each possible candidate to be added/deleted the resulting residual sums
of squares of the model, with/without this candidate, must be calculated.
The inversion of the "X-transpose by X" matrix, X being the design matrix,
is done by an updatin
verbose
when set to TRUE, the function will print out a line for each addition or deletion
stage. For example, " + 8 : 5 3.25 2 NA" means adding interaction basis function
of predictor 5 with knot at 3.25 and predictor 2 (linear),
to make a mod