BIOMOD_ModelingOptions: Configure the modeling options for each selected model

Description

Parametrize and/or tune the options of biomod2's single models.

Usage

BIOMOD_ModelingOptions(GLM = NULL,
                         GBM = NULL,
                         GAM = NULL,
                         CTA = NULL,
                         ANN = NULL,
                         SRE = NULL,
                         FDA = NULL,
                         MARS = NULL,
                         RF = NULL,
                         MAXENT.Phillips = NULL)

Value

A "BIOMOD.Model.Options" object to be passed to BIOMOD_Modeling.

Arguments

GLM: list, GLM options
GBM: list, GBM options
GAM: list, GAM options
CTA: list, CTA options
ANN: list, ANN options
SRE: list, SRE options
FDA: list, FDA options
MARS: list, MARS options
RF: list, RF options
MAXENT.Phillips: list, MAXENT.Phillips options

GLM (<code>glm</code>)

myFormula : a typical formula object (see example). If not NULL, the type and interaction.level arguments are ignored. You can either:
- generate the GLM formula automatically with the type and interaction.level arguments:
  - type (default 'quadratic') : formula given to the model ('simple', 'quadratic' or 'polynomial').
  - interaction.level (default 0) : integer giving the level of interaction considered between variables. Note that interactions quickly increase the number of effective variables used in the GLM.
- or supply a specific formula through myFormula.
test (default 'AIC') : information criterion used for the stepwise selection procedure: 'AIC' for the Akaike Information Criterion or 'BIC' for the Bayesian Information Criterion. 'none' is also supported, in which case only the full model is considered (no stepwise selection); this can lead to convergence issues and strange results.
family (default binomial(link = 'logit')) : a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function (see family for details). BIOMOD only runs on presence-absence data so far, hence the binomial family by default.
control : a list of parameters for controlling the fitting process. For glm.fit this is passed to glm.control.
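
To get a feel for how quickly interactions enlarge a model, the sketch below counts the terms that base R's formula expansion produces for main effects plus interactions up to a given order. It is an illustration only: the variable names and the count_terms helper are hypothetical, and the way biomod2 translates interaction.level into a formula internally is not shown here.

```r
# Hypothetical predictor names, used only for illustration.
vars <- c("bio3", "bio4", "bio7", "bio11", "bio12")

# Count the terms of a formula with main effects plus all interactions
# up to the given order, e.g. y ~ (a + b + c)^2. Assumes order >= 1.
count_terms <- function(vars, order) {
  rhs <- paste0("(", paste(vars, collapse = " + "), ")^", order)
  f <- as.formula(paste("y ~", rhs))
  length(attr(terms(f), "term.labels"))
}

n_main   <- count_terms(vars, 1)  # main effects only
n_pair   <- count_terms(vars, 2)  # main effects + pairwise interactions
n_triple <- count_terms(vars, 3)  # ... + three-way interactions

print(c(main = n_main, pairwise = n_pair, threeway = n_triple))
```

With five predictors the term count already grows several-fold once pairwise and three-way interactions are allowed, which is why a low interaction.level is usually a sensible starting point.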

GBM (<code>gbm</code>)

Please refer to the gbm help file for the meaning of these options.

distribution (default 'bernoulli')
n.trees (default 2500)
interaction.depth (default 7)
n.minobsinnode (default 5)
shrinkage (default 0.001)
bag.fraction (default 0.5)
train.fraction (default 1)
cv.folds (default 3)
keep.data (default FALSE)
verbose (default FALSE)
perf.method (default 'cv')
n.cores (default 1)

GAM (<code>gam::gam</code> or <code>mgcv::gam</code>)

algo : either "GAM_gam" (default), "GAM_mgcv" or "BAM_mgcv", defining the GAM function used (see gam in the gam package, and gam and bam in the mgcv package, respectively)
myFormula : a typical formula object (see example). If not NULL, the type and interaction.level arguments are ignored. You can either:
- generate the GAM formula automatically with the type and interaction.level arguments:
  - type : the smoother used to generate the formula. Only "s_smoother" is available at this time.
  - interaction.level : integer giving the level of interaction considered between variables. Note that interactions quickly increase the number of effective variables used in the GAM. Interactions are not considered if you chose the "GAM_gam" algo.
- or supply a specific formula through myFormula.
k (default -1 or 4) : parameter of the smooth terms in the formula given to gam (see s in the gam package or s in the mgcv package)
family (default binomial(link = 'logit')) : a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function (see family for details). BIOMOD only runs on presence-absence data so far, hence the binomial family by default.
control : see gam.control in the gam package or gam.control in the mgcv package
Some extra "GAM_mgcv"-specific options (ignored if algo = "GAM_gam"):
- method (default 'GCV.Cp')
- optimizer (default c('outer','newton'))
- select (default FALSE)
- knots (default NULL)
- paramPen (default NULL)

CTA (<code>rpart</code>)

Please refer to the rpart help file for the meaning of the following options.

method (default 'class')
parms (default 'default') : if 'default', the default rpart parms values are kept
cost (default NULL)
control: see rpart.control

NOTE: for method and parms, you can give a 'real' value as described in the rpart help file or 'default' that implies default rpart values.

ANN (<code>nnet</code>)

NbCV (default 5) : number of cross-validation runs used to find the best size and decay parameters
size (default NULL) : number of units in the hidden layer. If NULL, the size parameter is optimised by cross-validation based on model AUC (NbCV cross-validation runs; the tested sizes are c(2, 4, 6, 8)). You can also specify a vector of sizes to test; the one giving the best model AUC is then selected.
decay (default NULL) : parameter for weight decay. If NULL, the decay parameter is optimised by cross-validation based on model AUC (NbCV cross-validation runs; the tested decay values are c(0.001, 0.01, 0.05, 0.1)). You can also specify a vector of decay values to test; the one giving the best model AUC is then selected.
rang (default 0.1) : Initial random weights on [-rang, rang]
maxit (default 200): maximum number of iterations.
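
As a rough illustration of the tuning cost, the sketch below enumerates the default candidate values quoted above and counts the fits involved. It assumes that every size is crossed with every decay value, which this page does not state explicitly, and the variable names are hypothetical.

```r
# Candidate values quoted in the size and decay descriptions above.
sizes  <- c(2, 4, 6, 8)
decays <- c(0.001, 0.01, 0.05, 0.1)
NbCV   <- 5

# Assumption: each size is combined with each decay value.
grid <- expand.grid(size = sizes, decay = decays)
n_candidates <- nrow(grid)
n_fits <- n_candidates * NbCV  # one fit per candidate and per CV run

# Supplying shorter vectors narrows the search.
ann_opts <- list(NbCV = 5,
                 size = c(2, 4),
                 decay = c(0.01, 0.1),
                 rang = 0.1,
                 maxit = 200)
n_fits_restricted <- length(ann_opts$size) * length(ann_opts$decay) * ann_opts$NbCV

print(c(default = n_fits, restricted = n_fits_restricted))

# ann_opts could then be passed on as:
# myBiomodOptions <- BIOMOD_ModelingOptions(ANN = ann_opts)
```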

SRE (<code>sre</code>)

quant (default 0.025) : quantile of 'extreme environmental variable' values removed when selecting species envelopes
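
The role of quant can be pictured with a one-variable envelope. The sketch below is a simplified illustration in base R, not biomod2's sre implementation: the simulated data, the in_envelope helper and the symmetric trimming of both tails are assumptions made for the example.

```r
set.seed(42)

# Hypothetical values of one environmental variable at presence points.
temp_at_presences <- rnorm(200, mean = 12, sd = 3)

# Trim a fraction quant from each tail (assumed symmetric).
quant  <- 0.025
bounds <- quantile(temp_at_presences,
                   probs = c(quant, 1 - quant),
                   names = FALSE)

# A site falls inside the envelope if its value lies within the bounds.
in_envelope <- function(x, bounds) x >= bounds[1] & x <= bounds[2]

new_sites <- c(-5, 12, 30)
inside <- in_envelope(new_sites, bounds)
print(data.frame(value = new_sites, inside = inside))
```

A larger quant trims more of the extreme values and so gives a narrower envelope.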

FDA (<code>fda</code>)

Please refer to the fda help file for the meaning of these options.

method (default 'mars')
add_args (default NULL) : additional arguments to method, given as a list of parameters (corresponds to the ... argument of the fda function)

MARS (<code>earth</code>)

Please refer to the earth help file for the meaning of these options.

myFormula : a typical formula object (see example). If not NULL, the type and interaction.level arguments are ignored. You can either:
- generate the MARS formula automatically with the type and interaction.level arguments:
  - type (default 'simple') : formula given to the model ('simple', 'quadratic' or 'polynomial').
  - interaction.level (default 0) : integer giving the level of interaction considered between variables. Note that interactions quickly increase the number of effective variables used in the model.
- or supply a specific formula through myFormula.
nk (default NULL) : an optional integer specifying the maximum number of model terms. If NULL, the default value of the mars function is used, i.e. max(21, 2 * nb_expl_var + 1)
penalty (default 2)
thresh (default 0.001)
nprune (default NULL)
pmethod (default "backward")

RF (<code>randomForest</code>)

do.classif (default TRUE) : if TRUE, a classification random forest is computed; otherwise a regression random forest is fitted
ntree (default 500)
mtry (default 'default')
nodesize (default 5)
maxnodes (default NULL)

NOTE: for mtry, you can give a 'real' value as described in randomForest help file or 'default' that implies default randomForest values

[MAXENT.Phillips](https://biodiversityinformatics.amnh.org/open_source/maxent/) :

path_to_maxent.jar : character, the path to the maxent.jar file (the working directory by default)
memory_allocated : integer (default 512), the amount of memory (in MB) reserved for Java to run MAXENT.Phillips. Should be 64, 128, 256, 512, 1024, 2048... or NULL to use the default Java memory limit.
background_data_dir : character, path to a directory where explanatory variables are stored as ASCII files (raster format). If specified, MAXENT will generate its own background data from the explanatory variable rasters (as is usually done in MAXENT studies). If not set, MAXENT will use the same pseudo-absences as the other models (generated within biomod2 at the formatting step) as background data.
maximumbackground : integer, the maximum number of background data points to sample. This parameter is used only if background_data_dir has been set to a non-default value.
maximumiterations : integer (default 200), maximum number of iterations
visible : logical (default FALSE), make the Maxent user interface visible
linear : logical (default TRUE), allow linear features to be used
quadratic : logical (default TRUE), allow quadratic features to be used
product : logical (default TRUE), allow product features to be used
threshold : logical (default TRUE), allow threshold features to be used
hinge : logical (default TRUE), allow hinge features to be used
lq2lqptthreshold : integer (default 80), number of samples at which product and threshold features start being used
l2lqthreshold : integer (default 10), number of samples at which quadratic features start being used
hingethreshold : integer (default 15), number of samples at which hinge features start being used
beta_threshold : numeric (default -1.0), regularization parameter to be applied to all threshold features; negative value enables automatic setting
beta_categorical : numeric (default -1.0), regularization parameter to be applied to all categorical features; negative value enables automatic setting
beta_lqp : numeric (default -1.0), regularization parameter to be applied to all linear, quadratic and product features; negative value enables automatic setting
beta_hinge : numeric (default -1.0), regularization parameter to be applied to all hinge features; negative value enables automatic setting
betamultiplier : numeric (default 1), multiply all automatic regularization parameters by this number. A higher number gives a more spread-out distribution.
defaultprevalence : numeric (default 0.5), default prevalence of the species: probability of presence at ordinary occurrence points
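
The options above can be combined into a single list. The sketch below is only an illustration: the chosen values (more memory, more iterations, linear and quadratic features only, stronger regularization) are arbitrary assumptions, not recommendations, and the final call is left commented out because it requires biomod2 and a maxent.jar file.

```r
# Hypothetical settings; every value here is an arbitrary example.
maxent_opts <- list(
  path_to_maxent.jar = getwd(),  # maxent.jar location (working directory by default)
  memory_allocated   = 1024,     # memory reserved for Java
  maximumiterations  = 500,
  linear    = TRUE,              # keep linear features
  quadratic = TRUE,              # keep quadratic features
  product   = FALSE,             # switch off the remaining feature types
  threshold = FALSE,
  hinge     = FALSE,
  betamultiplier = 2             # scale automatic regularization
)

str(maxent_opts)

# The list would then be passed on as:
# myBiomodOptions <- BIOMOD_ModelingOptions(MAXENT.Phillips = maxent_opts)
```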

Author

Damien Georges, Wilfried Thuiller

Details

The aim of this function is to allow advanced users to change some default parameters of the BIOMOD inner models. Options can be set for each modeling technique.

Each argument must be given as a list object.

The best way to use this function is to print the default model options (Print_Default_ModelingOptions) or to create a default "BIOMOD.Model.Options" object and print it in your console. Then copy the output, change only the required parameters, and paste it as the function arguments (see example).

The modifiable parameters are listed in detail in the sections for each model. They correspond to the usual parameters that can be set for each modeling technique (e.g. ?glm).
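
The "start from the defaults, change only what you need" workflow can also be sketched with plain lists. The example below retypes the GBM defaults listed on this page and overrides two of them with base R's modifyList; the variable names and the new values are assumptions for illustration, and the final call is commented out because it requires biomod2.

```r
# GBM defaults as listed in the GBM section of this page.
gbm_defaults <- list(distribution = 'bernoulli',
                     n.trees = 2500,
                     interaction.depth = 7,
                     n.minobsinnode = 5,
                     shrinkage = 0.001,
                     bag.fraction = 0.5,
                     train.fraction = 1,
                     cv.folds = 3,
                     keep.data = FALSE,
                     verbose = FALSE,
                     perf.method = 'cv',
                     n.cores = 1)

# Override only the parameters of interest (arbitrary example values).
gbm_opts <- modifyList(gbm_defaults, list(n.trees = 5000, shrinkage = 0.01))

# List which entries differ from the defaults.
changed <- names(gbm_opts)[!mapply(identical, gbm_opts, gbm_defaults)]
print(changed)

# The list would then be passed on as:
# myBiomodOptions <- BIOMOD_ModelingOptions(GBM = gbm_opts)
```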

Examples


  ## default BIOMOD.model.option object
  myBiomodOptions <- BIOMOD_ModelingOptions()

  ## print the object
  myBiomodOptions

  ## you can copy part of the printed output, change it and customize your options
  ## here we want to compute a quadratic GLM and select the best model with the 'BIC' criterion
  myBiomodOptions <- BIOMOD_ModelingOptions(
    GLM = list( type = 'quadratic',
                interaction.level = 0,
                myFormula = NULL,
                test = 'BIC',
                family = 'binomial',
                control = glm.control(epsilon = 1e-08,
                                      maxit = 1000,
                                      trace = FALSE) ))

  ## check that the changes were applied
  myBiomodOptions

  ## you may prefer to provide your own GLM formula
  myBiomodOptions <- BIOMOD_ModelingOptions(
    GLM = list( myFormula = formula("Sp277 ~ bio3 +
                    log(bio10) + poly(bio16,2) + bio19 + bio3:bio19")))

  ## check that the changes were applied
  myBiomodOptions

  ## you can also print the default parameters directly and then follow the same process
  Print_Default_ModelingOptions()
