biomod2 (version 3.1-25)

BIOMOD_EnsembleModeling: Create and evaluate an ensemble set of models and predictions

Description

BIOMOD_EnsembleModeling combines models built with BIOMOD_Modeling and makes ensemble predictions from them. The ensemble predictions can also be evaluated against the original data given to BIOMOD_Modeling. biomod2 offers a range of options to build ensemble models and predictions and to assess modeling uncertainty. The resulting ensemble models can then be used to project distributions over space and time, like classical biomod2 models.

Usage

BIOMOD_EnsembleModeling( modeling.output,
                         chosen.models = 'all',
                         em.by = 'all',
                         eval.metric = 'all',
                         eval.metric.quality.threshold = NULL,
                         prob.mean = TRUE,
                         prob.cv = FALSE,
                         prob.ci = FALSE,
                         prob.ci.alpha = 0.05,
                         prob.median = FALSE,
                         committee.averaging = FALSE,
                         prob.mean.weight = FALSE,
                         prob.mean.weight.decay = 'proportional',
                         VarImport = 0)

Arguments

modeling.output
the object returned by BIOMOD_Modeling (see Examples)
chosen.models
a character vector (either 'all' or a sub-selection of model names) that defines the models kept for building the ensemble models (might be useful for removing some non-preferred models)
em.by
Character. Flag defining how the models will be combined to build the ensemble models. Available values are 'PA_dataset+repet', 'PA_dataset+algo', 'PA_dataset', 'algo' and 'all' (the default shown in Usage)
eval.metric
Vector of evaluation metric names. If 'all', the same evaluation metrics as those of modeling.output will be selected automatically
eval.metric.quality.threshold
If not NULL, the minimum scores below which models will be excluded from ensemble-model building.
prob.mean
Logical. Estimate the mean probabilities across predictions
prob.cv
Logical. Estimate the coefficient of variation across predictions
prob.ci
Logical. Estimate the confidence interval around the prob.mean
prob.ci.alpha
Numeric. Significance level for estimating the confidence interval. Default = 0.05
prob.median
Logical. Estimate the median of probabilities
committee.averaging
Logical. Estimate the committee averaging across predictions
prob.mean.weight
Logical. Estimate the weighted sum of probabilities
prob.mean.weight.decay
Define the relative importance of the weights. A high value will strongly discriminate the 'good' models from the 'bad' ones (see the details section). If the value of this parameter is set to 'proportional' (default), then the attributed weights are prop
VarImport
Number of permutations used to estimate variable importance
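
The em.by argument controls how single models are grouped before being combined, and therefore how many ensemble models are built. The sketch below is a base R illustration only, not biomod2 code: the model table and the mapping from each em.by value to grouping columns are hypothetical, chosen to mirror the five values listed above.

```r
# Hypothetical single models: 2 pseudo-absence datasets x 2 repetitions x 3 algorithms.
models <- expand.grid(PA_dataset = c("PA1", "PA2"),
                      repet      = c("RUN1", "RUN2"),
                      algo       = c("SRE", "CTA", "RF"),
                      stringsAsFactors = FALSE)

# Columns assumed to define one ensemble group for each em.by value.
group_cols <- list("PA_dataset+repet" = c("PA_dataset", "repet"),
                   "PA_dataset+algo"  = c("PA_dataset", "algo"),
                   "PA_dataset"       = "PA_dataset",
                   "algo"             = "algo",
                   "all"              = character(0))

# Count the distinct groups, i.e. the ensembles built per metric and per algorithm.
n_ensembles <- sapply(group_cols, function(cols) {
  if (length(cols) == 0) return(1L)   # 'all': one total consensus group
  nrow(unique(models[, cols, drop = FALSE]))
})
print(n_ensembles)
```

With this toy table, 'all' yields a single group while 'PA_dataset+algo' yields one group per dataset and algorithm pair, which is why the choice of em.by drives the size of the output.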

Value

  • A "BIOMOD.EnsembleModeling.out" object. This object will later be given to BIOMOD_EnsembleForecasting if you want to make projections of these ensemble models.

    You can access the evaluation scores with the get_evaluations function and the names of the built models with the get_built_models function (see Examples).

Details

  1. Models sub-selection (chosen.models)

     Useful to exclude some of the models built in the previous step (modeling.output). The vector of model names can be obtained by applying get_built_models to your modeling.output object, which makes model selection easier. The default value ('all') keeps all available models.

  2. Models assembly rules (em.by)

     Please refer to the EnsembleModelingAssembly vignette (../doc/EnsembleModelingAssembly.pdf), which is dedicated to this parameter. Five different ways to combine models can be considered. You can build ensemble models by:

       • Dataset used for model building (pseudo-absences dataset and repetitions done): 'PA_dataset+repet'
       • Dataset used and statistical models: 'PA_dataset+algo'
       • Pseudo-absences selection dataset: 'PA_dataset'
       • Statistical models: 'algo'
       • A total consensus model: 'all'

     The value chosen for this parameter controls the number of ensemble models built. If no evaluation data was given at the BIOMOD_FormatingData step, the evaluation of some ensemble models may be somewhat unfair, because the data used to evaluate the ensemble models can differ from the data used to evaluate the BIOMOD_Modeling models (in particular, some data used to calibrate the 'basal models' can be re-used to evaluate the ensemble models). Keep this in mind (see the EnsembleModelingAssembly vignette for extra details).

  3. Evaluation metrics (eval.metric)

     Evaluation metrics are used:

       • to make the binary transformation needed for the committee averaging computation
       • to weight the models in the probability weighted mean model
       • to test (and/or evaluate) the forecasting ability of your ensemble models (at this step, each ensemble model is evaluated according to each evaluation metric)

  4. Quality threshold (eval.metric.quality.threshold)

     Models scoring below this minimum are excluded from ensemble-model building (see Arguments).

  5. Ensemble-models algorithms

       1. Mean of probabilities (prob.mean)
       2. Coefficient of variation of probabilities (prob.cv)
       3. Confidence interval (prob.ci & prob.ci.alpha)

          $$I_c = \left[ \bar{x} - \frac{t_\alpha \, sd}{\sqrt{n}} ;\; \bar{x} + \frac{t_\alpha \, sd}{\sqrt{n}} \right]$$

            • The lower bound (there is less than a 100*prob.ci.alpha/2 % chance of getting probabilities lower than the given ones)
            • The upper bound (there is less than a 100*prob.ci.alpha/2 % chance of getting probabilities higher than the given ones)

       4. Median of probabilities (prob.median)
       5. Models committee averaging (committee.averaging)
       6. Weighted mean of probabilities (prob.mean.weight & prob.mean.weight.decay)
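
The ensemble-model algorithms named on this page (mean, coefficient of variation, confidence interval, median, committee averaging and weighted mean) can be illustrated with a small base R sketch. It is not biomod2's internal implementation. The prediction matrix, the TSS scores, the single 0.5 binarisation cutoff and the reading of 'proportional' as "weights proportional to the evaluation scores" are all assumptions made for illustration.

```r
# Hypothetical inputs: predicted probabilities (rows = sites, columns = models)
# and one TSS score per model. None of these numbers come from biomod2.
preds <- cbind(SRE = c(0.2, 0.9, 0.6),
               CTA = c(0.3, 0.8, 0.7),
               RF  = c(0.1, 0.7, 0.4),
               GLM = c(0.6, 0.4, 0.5))
tss   <- c(SRE = 0.75, CTA = 0.80, RF = 0.90, GLM = 0.55)

# Quality threshold: drop models scoring below 0.7 (only GLM here).
keep  <- tss >= 0.7
preds <- preds[, keep, drop = FALSE]
tss   <- tss[keep]
n     <- ncol(preds)

# Mean, median and coefficient of variation across the kept models.
p_mean   <- rowMeans(preds)
p_median <- apply(preds, 1, median)
p_sd     <- apply(preds, 1, sd)
p_cv     <- p_sd / p_mean

# Confidence interval around the mean, following the I_c formula;
# t_alpha is taken as the two-sided Student quantile (an assumption).
alpha   <- 0.05
t_alpha <- qt(1 - alpha / 2, df = n - 1)
ci_low  <- p_mean - t_alpha * p_sd / sqrt(n)
ci_up   <- p_mean + t_alpha * p_sd / sqrt(n)

# Committee averaging: binarise each model, then average the votes.
# One shared 0.5 cutoff is used for brevity (an assumption).
votes <- preds >= 0.5
p_ca  <- rowMeans(votes)

# Weighted mean with weights proportional to the scores (assumed meaning).
w       <- tss / sum(tss)
p_wmean <- as.vector(preds %*% w)

print(round(cbind(p_mean, p_median, p_cv, ci_low, ci_up, p_ca, p_wmean), 3))
```

Committee averaging returns the share of models voting "presence" at each site, so it is a measure of agreement among models rather than a probability of occurrence in the same sense as the mean.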

See Also

BIOMOD_Modeling, BIOMOD_Projection, BIOMOD_EnsembleForecasting

Examples

# load biomod2 (raster provides stack(), used below)
library(biomod2)
library(raster)

# species occurrences
DataSpecies <- read.csv(system.file("external/species/mammals_table.csv",
                                    package="biomod2"), row.names = 1)
head(DataSpecies)

# the name of studied species
myRespName <- 'GuloGulo'

# the presence/absences data for our species 
myResp <- as.numeric(DataSpecies[,myRespName])

# the XY coordinates of species data
myRespXY <- DataSpecies[,c("X_WGS84","Y_WGS84")]


# Environmental variables extracted from BIOCLIM (bio_3, bio_4, bio_7, bio_11 & bio_12)
myExpl <- stack(system.file("external/bioclim/current/bio3.grd",
                            package = "biomod2"),
                system.file("external/bioclim/current/bio4.grd",
                            package = "biomod2"),
                system.file("external/bioclim/current/bio7.grd",
                            package = "biomod2"),
                system.file("external/bioclim/current/bio11.grd",
                            package = "biomod2"),
                system.file("external/bioclim/current/bio12.grd",
                            package = "biomod2"))

# 1. Formatting Data
myBiomodData <- BIOMOD_FormatingData(resp.var = myResp,
                                     expl.var = myExpl,
                                     resp.xy = myRespXY,
                                     resp.name = myRespName)
       
# 2. Defining Models Options using default options.
myBiomodOption <- BIOMOD_ModelingOptions()

# 3. Building the single models
myBiomodModelOut <- BIOMOD_Modeling(myBiomodData,
                                    models = c('SRE','CTA','RF'),
                                    models.options = myBiomodOption,
                                    NbRunEval = 1,
                                    DataSplit = 80,
                                    Yweights = NULL,
                                    VarImport = 3,
                                    models.eval.meth = c('TSS'),
                                    SaveObj = TRUE,
                                    rescal.all.models = FALSE,
                                    do.full.models = FALSE)
                                       
# 4. Building the ensemble models
myBiomodEM <- BIOMOD_EnsembleModeling(modeling.output = myBiomodModelOut,
                                      chosen.models = 'all',
                                      em.by = 'all',
                                      eval.metric = c('TSS'),
                                      eval.metric.quality.threshold = c(0.7),
                                      prob.mean = TRUE,
                                      prob.cv = FALSE,
                                      prob.ci = FALSE,
                                      prob.ci.alpha = 0.05,
                                      prob.median = FALSE,
                                      committee.averaging = FALSE,
                                      prob.mean.weight = TRUE,
                                      prob.mean.weight.decay = 'proportional')
                                       
# print summary
myBiomodEM

# get evaluation scores
get_evaluations(myBiomodEM)

# get the names of the built ensemble models
get_built_models(myBiomodEM)
