partykit (version 1.0-0)

mob_control: Control Parameters for Model-Based Partitioning

Description

Various parameters that control aspects of the fitting algorithm for recursively partitioned mob models.

Usage

mob_control(alpha = 0.05, bonferroni = TRUE, minsize = NULL, maxdepth = Inf,
  mtry = Inf, trim = 0.1, breakties = FALSE, parm = NULL, dfsplit = TRUE, prune = NULL,
  restart = TRUE, verbose = FALSE, caseweights = TRUE, ytype = "vector", xtype = "matrix",
  terminal = "object", inner = terminal, model = TRUE, numsplit = "left",
  catsplit = "binary", vcov = "opg", ordinal = "chisq", nrep = 10000,
  minsplit = minsize, minbucket = minsize, applyfun = NULL, cores = NULL)

Arguments

alpha
numeric significance level. A node is split when the (possibly Bonferroni-corrected) $p$ value for any parameter stability test in that node falls below alpha (and the stopping criteria minsize and maxdepth
bonferroni
logical. Should $p$ values be Bonferroni corrected?
minsize, minsplit, minbucket
integer. The minimum number of observations in a node. If NULL, the default is to use 10 times the number of parameters to be estimated (divided by the number of responses per observation if that is greater than 1). minsize<
maxdepth
integer. The maximum depth of the tree.
mtry
integer. The number of partitioning variables randomly sampled as candidates in each node for forest-style algorithms. If mtry is greater than the number of partitioning variables, no random selection is performed. (Thus, by defau
trim
numeric. This specifies the trimming in the parameter instability test for the numerical variables. If smaller than 1, it is interpreted as the fraction relative to the current node size.
breakties
logical. Should ties in numeric variables be broken randomly for computing the associated parameter instability test?
parm
numeric or character. Number or name of model parameters included in the parameter instability tests (by default all parameters are included).
dfsplit
logical or numeric. as.integer(dfsplit) is the degrees of freedom per selected split employed when computing information criteria etc.
prune
character, numeric, or function for specifying post-pruning rule. If prune is NULL (the default), no post-pruning is performed. For likelihood-based mob() trees, prune can be set to "AI
restart
logical. When determining the optimal split point in a numerical variable: Should model estimation be restarted with NULL starting values for each split? The default is TRUE. If FALSE, then the parameter
verbose
logical. Should information about the fitting process of mob (such as test statistics, $p$ values, selected splitting variables and split points) be printed to the screen?
caseweights
logical. Should weights be interpreted as case weights? If TRUE, the number of observations is sum(weights), otherwise it is sum(weights > 0).
ytype, xtype
character. Specification of how mob should preprocess y and x variables. Possible choices are: "vector" (for y only), i.e., only one variable; "matrix", i.e., the mod
terminal, inner
character. Specification of which additional information ("estfun", "object", or both) should be stored in each node. If NULL, no additional information is stored.
model
logical. Should the full model frame be stored in the resulting object?
numsplit
character indicating how splits for numeric variables should be justified. Because any splitpoint in the interval between the last observation from the left child segment and the first observation from the right child segment leads to the same
catsplit
character indicating how (unordered) categorical variables should be split. By default the best "binary" split is searched (by minimizing the objective function). Alternatively, if set to "multiway", the node is si
vcov
character indicating which type of covariance matrix estimator should be employed in the parameter instability tests. The default is the outer product of gradients ("opg"). Alternatively, vcov = "info" employs the inf
ordinal
character indicating which type of parameter instability test should be employed for ordinal partitioning variables (i.e., ordered factors). This can be "chisq", "max", or "L2". If "chisq" th
nrep
numeric. Number of replications in the simulation of $p$ values for the ordinal "L2" statistic (if used).
applyfun
an optional lapply-style function with arguments function(X, FUN, ...). It is used for refitting the model across potential sample splits. The default is to use the basic lappl
cores
numeric. If set to an integer, the applyfun is set to mclapply with the desired number of cores.
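
Two of the arguments above (caseweights and trim) reduce to simple arithmetic, which the following base R sketch illustrates. The weight vector, the node size of 250, and all variable names are made up for illustration. How mob rounds the trimming count internally is not stated here, so the sketch leaves it unrounded.

```r
# Hypothetical weights for six rows of data (illustration only)
weights <- c(2, 0, 1, 3, 0, 1)

# caseweights = TRUE: the number of observations is sum(weights)
n_case <- sum(weights)

# caseweights = FALSE: the number of observations is sum(weights > 0)
n_positive <- sum(weights > 0)

# trim < 1 is interpreted as a fraction of the current node size
trim <- 0.1
node_size <- 250               # assumed node size, not taken from any real fit
trim_obs <- trim * node_size   # unrounded; internal rounding is not documented here

print(c(n_case = n_case, n_positive = n_positive, trim_obs = trim_obs))
```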

Value

  • A list of class mob_control containing the control parameters.
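
As a minimal sketch, a control object can be created and its elements read back like those of any list. The argument values are arbitrary choices for illustration, and the sketch assumes the partykit package is installed; the guard simply skips the example otherwise.

```r
# Only run when partykit is available (assumption: it is installed)
if (requireNamespace("partykit", quietly = TRUE)) {
  # Stricter significance level and a shallow tree; other arguments keep defaults
  ctrl <- partykit::mob_control(alpha = 0.01, maxdepth = 3, verbose = FALSE)

  # The control parameters are stored as list elements
  print(ctrl$alpha)
  print(ctrl$maxdepth)
  print(ctrl$bonferroni)
}
```

The resulting object is typically passed on through the control argument of mob(); see mob for details.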

Details

See mob for more details and references.

For post-pruning, prune can be set to a function(objfun, df, nobs) that returns TRUE to signal that the current node can be pruned and FALSE otherwise. All supplied arguments are of length two: objfun is the sum of objective function values in the current node and its child nodes, respectively. df is the degrees of freedom in the current node and its child nodes, respectively. nobs is a vector with the number of observations in the current node and the total number of observations in the dataset, respectively.
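
The function interface described above can be sketched as follows. The rule itself (prune when splitting reduces the objective function by less than 5 percent), the name prune_small_gain, and the threshold are hypothetical and only illustrate the argument layout; this is not a rule shipped with the package.

```r
# Hypothetical pruning rule: prune when the split yields too little improvement.
# objfun[1] = objective in the current node, objfun[2] = sum over its child nodes
# df and nobs are accepted to match the documented signature but unused here.
prune_small_gain <- function(objfun, df, nobs) {
  min_rel_gain <- 0.05                      # assumed threshold, illustration only
  gain <- objfun[1L] - objfun[2L]           # reduction achieved by splitting
  rel_gain <- gain / abs(objfun[1L])
  rel_gain < min_rel_gain                   # TRUE means: this node can be pruned
}

# Tiny improvement (100 -> 99): node can be pruned
print(prune_small_gain(objfun = c(100, 99), df = c(3, 7), nobs = c(50, 200)))

# Large improvement (100 -> 80): keep the split
print(prune_small_gain(objfun = c(100, 80), df = c(3, 7), nobs = c(50, 200)))
```

Such a function would be supplied as mob_control(prune = prune_small_gain).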

If the objective function employed in the mob() call is the negative log-likelihood, then a suitable function is set up on the fly by comparing (2 * objfun + penalty * df) in the current node and its child nodes. The penalty can then be set via a numeric or character value for prune: AIC is used if prune = "AIC" or prune = 2 and BIC if prune = "BIC" or prune = log(n).
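
The comparison of 2 * objfun + penalty * df can be illustrated with a small base R sketch. The helper ic_pruner and the numbers are hypothetical. The sketch assumes that a node is pruned when its own criterion is smaller than that of its child nodes, and it takes n = 200 for the BIC penalty; it is not the package's internal code.

```r
# Hypothetical helper: build a prune function from a numeric penalty
ic_pruner <- function(penalty) {
  function(objfun, df, nobs) {
    # Information criterion for the current node [1] and its child nodes [2]
    ic <- 2 * objfun + penalty * df
    ic[1L] < ic[2L]   # assumed rule: prune if the unsplit node scores better
  }
}

objfun <- c(120, 110)   # made-up negative log-likelihoods: node, children
df     <- c(3, 7)       # made-up degrees of freedom: node, children
nobs   <- c(200, 200)   # made-up observation counts

aic_prune <- ic_pruner(2)          # penalty 2 corresponds to AIC
bic_prune <- ic_pruner(log(200))   # penalty log(n) corresponds to BIC, n assumed 200

print(aic_prune(objfun, df, nobs))  # AIC: 246 vs 234, so the split is kept
print(bic_prune(objfun, df, nobs))  # BIC: about 255.9 vs 257.1, so the node is pruned
```

With these numbers the heavier BIC penalty removes a split that AIC would keep, which is the usual trade-off between the two criteria.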

See Also

mob