BIOMOD_Modeling: Run a range of species distribution models

Description

This function calibrates and evaluates a range of species distribution modeling techniques for a given species. Calibration is done on the whole sample or on a random subset of it. The predictive power of the different models is estimated using a range of evaluation metrics.

Usage

BIOMOD_Modeling( data, 
                 models = c('GLM','GBM','GAM','CTA','ANN',
                            'SRE','FDA','MARS','RF','MAXENT'), 
                 models.options = NULL, 
                 NbRunEval=1, 
                 DataSplit=100, 
                 Yweights=NULL,
                 Prevalence=NULL,
                 VarImport=0, 
                 models.eval.meth = c('KAPPA','TSS','ROC'), 
                 SaveObj = TRUE,
                 rescal.all.models = FALSE,
                 do.full.models = TRUE,
                 modeling.id = as.character(format(Sys.time(), '%s')),
                 ...)

Arguments

data

BIOMOD.formated.data object returned by BIOMOD_FormatingData

models

vector of model names chosen among 'GLM', 'GBM', 'GAM', 'CTA', 'ANN', 'SRE', 'FDA', 'MARS', 'RF' and 'MAXENT'

models.options

BIOMOD.models.options object returned by BIOMOD_ModelingOptions

NbRunEval

Number of evaluation runs

DataSplit

Percentage of the data used to calibrate the models; the remaining part is used for testing

Yweights

weights of the response points

Prevalence

either NULL (default) or a 0-1 numeric used to build 'weighted response weights'

VarImport

Number of permutations used to estimate variable importance

models.eval.meth

vector of names of evaluation metrics chosen among 'KAPPA', 'TSS', 'ROC', 'FAR', 'SR', 'ACCURACY', 'BIAS', 'POD', 'CSI' and 'ETS'

SaveObj

whether to keep all results and outputs on the hard drive (NOTE: strongly recommended)

rescal.all.models

if TRUE, all model predictions will be scaled with a binomial GLM

do.full.models

if TRUE, models calibrated and evaluated on the whole dataset are also built

modeling.id

character, the ID (= name) of the modeling procedure. By default, a string derived from the current time, format(Sys.time(), '%s') (see Usage).

...

further arguments:

DataSplitTable: a matrix, data.frame or a 3D array filled with TRUE/FALSE to specify which part of data must be used for models calibration (TRUE) and f
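
The idea of a TRUE/FALSE split table can be sketched in base R. This is only an illustration under stated assumptions: rows are taken to be observations and columns to be evaluation runs, FALSE is assumed to mark the rows held out of calibration, and the sizes and column names (RUN1, RUN2, ...) are arbitrary choices, not something this page specifies.

```r
# Hypothetical sizes, chosen for illustration only
set.seed(42)
n_obs   <- 50  # number of observations in the response
n_runs  <- 3   # number of runs (plays the role of NbRunEval)
pct_cal <- 80  # percentage kept for calibration (plays the role of DataSplit)

n_cal <- round(n_obs * pct_cal / 100)

# One column per run: TRUE = used for calibration, FALSE = held out (assumed)
split_table <- sapply(seq_len(n_runs), function(i) {
  in_cal <- rep(FALSE, n_obs)
  in_cal[sample.int(n_obs, n_cal)] <- TRUE
  in_cal
})
colnames(split_table) <- paste0("RUN", seq_len(n_runs))  # arbitrary names

dim(split_table)      # observations x runs
colSums(split_table)  # calibration rows per run
```

Each column is drawn independently, so a given observation can be used for calibration in one run and held out in another.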

Value

A BIOMOD.models.out object. See "BIOMOD.models.out" for details. Additional objects are stored outside R, on the hard drive, to save memory. The function creates them in two directories at the root of your R working directory. The "models" directory contains each calibrated model for each repetition and pseudo-absence run. A hidden folder, .DATA_BIOMOD, contains files (predictions, a copy of the original dataset, the pseudo-absences chosen...) used by other functions such as BIOMOD_Projection or BIOMOD_EnsembleModeling. The models are currently stored as objects that can only be read in R. To load them back (the same holds for all objects stored on the hard disk), use the load function (see examples section below).
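
As a minimal sketch of how load restores an object saved on disk, the base R example below saves a small stand-in model to a temporary file and reads it back. The glm fit, the mtcars dataset and the file name are placeholders for illustration; the actual file names written by BIOMOD_Modeling are not given on this page.

```r
# Fit a small stand-in model on a built-in dataset (not a biomod2 model)
fit <- glm(am ~ wt, data = mtcars, family = binomial)

# Save it to a temporary file, then remove it from the workspace
model_file <- file.path(tempdir(), "example_model")
save(fit, file = model_file)
rm(fit)

# load() restores the object under its original name
# and returns that name as a character vector
restored <- load(model_file)
restored
class(get(restored))
```

Because load returns the names of the restored objects, get(restored) is a convenient way to retrieve a model when its object name is not known in advance.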

Details

models.eval.meth

The available evaluation metrics are:

ROC : Relative Operating Characteristic
KAPPA : Cohen's Kappa (Heidke skill score)
TSS : True skill statistic (Hanssen and Kuipers discriminant, Peirce's skill score)
FAR : False alarm ratio
SR : Success ratio
ACCURACY : Accuracy (fraction correct)
BIAS : Bias score (frequency bias)
POD : Probability of detection (hit rate)
CSI : Critical success index (threat score)
ETS : Equitable threat score (Gilbert skill score)

Reference for these metrics: http://www.cawcr.gov.au/projects/verification/#Methods_for_dichotomous_forecasts

data

If you have decided to add pseudo-absences to your original dataset (see BIOMOD_FormatingData), NbPseudoAbsences * NbRunEval + 1 models will be created.

models

The set of models to be calibrated on the data. 10 modeling techniques are currently available:

GLM : Generalized Linear Model (glm)
GAM : Generalized Additive Model (gam, gam or bam, see BIOMOD_ModelingOptions for details on algorithm selection)
GBM : Generalized Boosting Model, usually called Boosted Regression Trees (gbm)
CTA : Classification Tree Analysis (rpart)
ANN : Artificial Neural Network (nnet)
SRE : Surface Range Envelop, usually called BIOCLIM
FDA : Flexible Discriminant Analysis (fda)
MARS : Multivariate Adaptive Regression Splines (mars)
RF : Random Forest (randomForest)
MAXENT : Maximum Entropy (http://www.cs.princeton.edu/~schapire/maxent/)
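
The threshold-based metrics accepted by models.eval.meth can all be derived from a 2x2 confusion matrix. The sketch below uses the standard textbook definitions of these scores with made-up counts. It is not taken from the biomod2 source, which may differ in details such as how the classification threshold is chosen. ROC is omitted because it needs continuous predictions rather than a single confusion matrix.

```r
# Hypothetical confusion matrix counts (illustration only)
hits         <- 40  # predicted presence, observed presence
false_alarms <- 10  # predicted presence, observed absence
misses       <- 5   # predicted absence,  observed presence
correct_neg  <- 45  # predicted absence,  observed absence
n <- hits + false_alarms + misses + correct_neg

pod      <- hits / (hits + misses)                 # POD: probability of detection
far      <- false_alarms / (hits + false_alarms)   # FAR: false alarm ratio
sr       <- hits / (hits + false_alarms)           # SR: success ratio (1 - FAR)
accuracy <- (hits + correct_neg) / n               # ACCURACY: fraction correct
bias     <- (hits + false_alarms) / (hits + misses)  # BIAS: frequency bias
csi      <- hits / (hits + misses + false_alarms)    # CSI: threat score

# TSS: hit rate minus false alarm rate (sensitivity + specificity - 1)
pofd <- false_alarms / (false_alarms + correct_neg)
tss  <- pod - pofd

# KAPPA: agreement beyond what is expected by chance
expected   <- ((hits + misses) * (hits + false_alarms) +
               (correct_neg + misses) * (correct_neg + false_alarms)) / n^2
kappa_stat <- (accuracy - expected) / (1 - expected)

# ETS: threat score corrected for hits expected by chance
hits_random <- (hits + misses) * (hits + false_alarms) / n
ets <- (hits - hits_random) / (hits + misses + false_alarms - hits_random)

round(c(KAPPA = kappa_stat, TSS = tss, FAR = far, SR = sr,
        ACCURACY = accuracy, BIAS = bias, POD = pod,
        CSI = csi, ETS = ets), 3)
```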

Examples

# load biomod2, and raster for stack()
library(biomod2)
library(raster)

# species occurrences
DataSpecies <- read.csv(system.file("external/species/mammals_table.csv",
                                    package="biomod2"))
head(DataSpecies)

# the name of studied species
myRespName <- 'GuloGulo'

# the presence/absences data for our species 
myResp <- as.numeric(DataSpecies[,myRespName])

# the XY coordinates of species data
myRespXY <- DataSpecies[,c("X_WGS84","Y_WGS84")]


# Environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
myExpl = stack( system.file( "external/bioclim/current/bio3.grd", 
                             package="biomod2"),
                system.file( "external/bioclim/current/bio4.grd", 
                             package="biomod2"), 
                system.file( "external/bioclim/current/bio7.grd", 
                             package="biomod2"),  
                system.file( "external/bioclim/current/bio11.grd", 
                             package="biomod2"), 
                system.file( "external/bioclim/current/bio12.grd", 
                             package="biomod2"))

# 1. Formatting Data
myBiomodData <- BIOMOD_FormatingData(resp.var = myResp,
                                     expl.var = myExpl,
                                     resp.xy = myRespXY,
                                     resp.name = myRespName)
                                                                     
# 2. Defining Models Options using default options.
myBiomodOption <- BIOMOD_ModelingOptions()

# 3. Running the models

myBiomodModelOut <- BIOMOD_Modeling( myBiomodData, 
                                       models = c('SRE','RF'), 
                                       models.options = myBiomodOption, 
                                       NbRunEval=2, 
                                       DataSplit=80, 
                                       VarImport=0, 
                                       models.eval.meth = c('ROC'),
                                       do.full.models=FALSE,
                                       modeling.id="test")
                                       
## print a summary of the modeling run
myBiomodModelOut