- formula
A formula that can be passed to the model argument to define the classification algorithm.
- model
A binary classification model supplied by the user. Must take arguments formula and data.
- data
Optional. A rectangular data.frame object giving the full data from which samples are to be drawn. If left unspecified, gendata() is called to produce synthetic data with an appropriate structure.
- dim
Required if data is unspecified. Gives the horizontal dimension of the data (number of predictor variables) to be generated.
- maxn
Required if data is unspecified. Gives the vertical dimension of the data (number of observations) to be generated.
- upperlimit
Optional. A positive integer giving the maximum sample size to be simulated, if data was supplied.
- nsample
A positive integer giving the number of samples to be generated for each value of $n$. Larger values give more accurate results.
- steps
A positive integer giving the interval of values of $n$ for which simulations should be conducted. Larger values give more accurate results.
- eta
A real number between 0 and 1 giving the probability of misclassification error in the training data.
- delta
A real number between 0 and 1 giving the targeted maximum probability of observing an out-of-sample (OOS) error rate higher than epsilon.
- epsilon
A real number between 0 and 1 giving the targeted maximum out-of-sample (OOS) error rate.
- predictfn
An optional user-defined function giving a custom predict method. If also using a user-defined model, the model should output an object of class "svrclass" to avoid errors.
- power
A logical indicating whether experimental power based on the predictions should also be reported.
- effect_size
If power is TRUE, a real number indicating the scaled effect size the user would like to be able to detect.
- powersims
If power is TRUE, an integer indicating the number of simulations to be conducted at each step to calculate power.
- alpha
If power is TRUE, a real number between 0 and 1 indicating the probability of Type I error to be used for hypothesis testing. Default is 0.05.
- parallel
A logical indicating whether to use parallel processing.
- coreoffset
If parallel is TRUE, a positive integer indicating the number of threads to be kept free. Should not be larger than the number of CPU cores.
- packages
A list of packages that need to be loaded in order to run model.
- method
An optional string stating the distribution from which data is to be generated. Default is i.i.d. uniform sampling. Can also take a function outputting a vector of probabilities if the user wishes to specify a custom distribution.
- p
If method is 'Class Imbalance', gives the degree of weight placed on the positive class.
- minn
Optional argument to set a different minimum n than the dimension of the algorithm. Useful with e.g. regularized regression models such as elastic net.
- x
Optional argument for methods that take separate predictor and outcome data. Specifies a matrix-like object containing predictors. Note that if used, the x and y objects are bound together columnwise; this must be handled in the user-supplied helper function.
- y
Optional argument for methods that take separate predictor and outcome data. Specifies a vector-like object containing outcome values. Note that if used, the x and y objects are bound together columnwise; this must be handled in the user-supplied helper function.
- ...
Additional arguments that need to be passed to model.
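
As a rough illustration of how the model, predictfn, epsilon, delta and nsample arguments relate, the following base-R sketch defines a user-supplied model and a matching predict function, then repeats a train/test split to see how often the OOS error rate exceeds epsilon. The names my_model, my_predict and oos_error, the predict function's signature, and the synthetic data are all assumptions made for this example; it does not call the package itself and is not its implementation.

```r
set.seed(1)

# Synthetic data standing in for `data`: a binary outcome y and two predictors.
n_total <- 500
dat <- data.frame(x1 = runif(n_total), x2 = runif(n_total))
dat$y <- rbinom(n_total, 1, plogis(4 * (dat$x1 + dat$x2 - 1)))

# Hypothetical `model`: takes `formula` and `data`, returns a fit tagged "svrclass".
my_model <- function(formula, data, ...) {
  fit <- glm(formula, data = data, family = binomial(link = "logit"), ...)
  class(fit) <- c("svrclass", class(fit))
  fit
}

# Hypothetical `predictfn`: assumed signature (fitted model, new data) -> 0/1 labels.
my_predict <- function(m, newdata) {
  prob <- stats::predict.glm(m, newdata = newdata, type = "response")
  as.integer(prob > 0.5)
}

# One simulation at sample size n: train on n rows, score on the held-out rows.
oos_error <- function(n) {
  idx <- sample(nrow(dat), n)
  fit <- my_model(y ~ x1 + x2, data = dat[idx, ])
  test <- dat[-idx, ]
  mean(my_predict(fit, test) != test$y)
}

epsilon <- 0.05   # targeted maximum OOS error rate
delta   <- 0.05   # targeted maximum probability of exceeding epsilon
nsample <- 30     # number of samples drawn at this value of n

errors       <- replicate(nsample, oos_error(200))
share_above  <- mean(errors > epsilon)   # empirical share of samples above epsilon
meets_target <- share_above <= delta
```

Here share_above is an empirical stand-in for the probability that delta bounds. Because the synthetic labels are noisy, this toy setup is not expected to meet a 5% error target; the point is only the shape of the calculation.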