MaxentVariableSelection (version 1.0-1)

VariableSelection: Selecting the best set of relevant environmental variables along with the optimal regularization multiplier for Maxent Niche Modeling

Description

This is the core function of the package. It reduces a set of environmental variables in a stepwise fashion to avoid overfitting the model to the occurrence records. This can be done for a range of regularization multipliers. The best-performing model, based on AICc values (Akaike, 1974) or AUC.Test values (Fielding and Bell, 1997), then identifies the most important uncorrelated environmental variables along with the optimal regularization multiplier.
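
As a reminder of what the AICc criterion measures, the following is a minimal sketch of the standard small-sample corrected AIC in base R. The helper name `aicc` and its arguments are hypothetical and are not part of this package. How the package derives the log-likelihood and the parameter count from a Maxent model is not described on this page, so treating `k` as the number of model parameters and `n` as the number of occurrence records is an assumption.

```r
# Small-sample corrected Akaike Information Criterion (AICc).
# loglik: log-likelihood of the fitted model
# k:      number of model parameters (assumed meaning)
# n:      number of samples, e.g. occurrence records (assumed meaning)
aicc <- function(loglik, k, n) {
  # The correction term is undefined when n - k - 1 is not positive
  if (n - k - 1 <= 0) {
    return(NA_real_)
  }
  aic <- 2 * k - 2 * loglik
  aic + (2 * k * (k + 1)) / (n - k - 1)
}

# Lower AICc values indicate a better trade-off between fit and complexity
aicc(loglik = -100, k = 5, n = 50)
```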

Usage

VariableSelection(maxent, outdir, gridfolder, occurrencelocations,
backgroundlocations, additionalargs, contributionthreshold,
correlationthreshold, betamultiplier)

Arguments

maxent
String specifying the filepath to the maxent.jar file (download from here: https://www.cs.princeton.edu/~schapire/maxent/). The package was tested with maxent.jar version 3.3.3k.
outdir
String specifying the path to the output directory to which all the result files will be written. Please don't put important files in this folder as all files but the output files of the VariableSelection function will be deleted from this fold
gridfolder
String specifying the path to the directory that holds all the ASCII grids (in ESRI's .asc format) of environmental variables. All variables must have the same extent and resolution.
occurrencelocations
String specifying the filepath to the csv file with occurrence records. Please find the exact specifications of the SWD file format in the details section below.
backgroundlocations
String specifying the filepath to the csv file with background/pseudoabsence data. Please find the exact specifications of the SWD file format in the details section below.
additionalargs
String specifying additional maxent arguments. See the Details section below.
betamultiplier
Vector of regularization multipliers (beta; positive numerical values). The smaller this value, the more closely the projected distribution fits the training data set. Overfitted models are poorly transferable to novel environments
correlationthreshold
Numerical value (between 0 and 1) that sets the threshold of Pearson's correlation coefficient above which environmental variables are regarded to be correlated (based on values at all background locations). Of the correlated variables, on
contributionthreshold
Numerical value (between 0 and 100) that sets the threshold of model contributions below which environmental variables are excluded from the Maxent model. Model contributions reflect the importance of environmental variables in limiting th
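
The idea behind correlationthreshold can be illustrated with a small base R sketch. It uses simulated values standing in for environmental variables at background locations and flags variable pairs whose Pearson correlation exceeds the threshold. The data, the variable names, and the use of the absolute correlation are assumptions made for illustration; this is not the package's internal implementation.

```r
set.seed(1)

# Simulated values of three hypothetical variables at 200 background locations
bg <- data.frame(temp = rnorm(200))
bg$temp_max <- bg$temp + rnorm(200, sd = 0.1)  # nearly a copy of temp
bg$salinity <- rnorm(200)                      # independent of the others

correlationthreshold <- 0.9

# Pairwise Pearson correlation coefficients between all variables
cors <- cor(bg, method = "pearson")

# Keep each pair once (upper triangle) and flag those above the threshold.
# Comparing absolute values is an assumption of this sketch.
high <- which(abs(cors) > correlationthreshold & upper.tri(cors),
              arr.ind = TRUE)

correlated_pairs <- data.frame(
  var1 = rownames(cors)[high[, "row"]],
  var2 = colnames(cors)[high[, "col"]],
  r = cors[high],
  stringsAsFactors = FALSE
)

print(correlated_pairs)
```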

Value

The following result files are saved in the directory specified with the outdir argument.

  • ModelPerformance.txt: A table listing the performance indicators of all created Maxent models

  • ModelSelectionAICc_MarkedMaxAUCTest.png
  • ModelSelectionAICc_MarkedMinAICc.png
  • ModelSelectionAUCTest_MarkedMaxAUCTest.png
  • ModelSelectionAUCTest_MarkedMinAICc.png
  • ModelWithMaxAUCTest.txt
  • ModelWithMinAICc.txt
  • VariableSelectionProcess.txt
  • VariableSelectionMaxAUCTest.txt
  • VariableSelectionMinAICc.txt
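
Once a performance table such as ModelPerformance.txt has been read into a data frame, the best models can be picked out as sketched below. The column names (`Model`, `betamultiplier`, `AICc`, `AUC.Test`) and all numbers are made up for illustration; check the header of the actual file before relying on them.

```r
# Toy stand-in for a model performance table (values are invented)
perf <- data.frame(
  Model = 1:4,
  betamultiplier = c(2, 2, 3, 3),
  AICc = c(1520.3, 1498.7, 1505.1, 1510.9),
  AUC.Test = c(0.81, 0.84, 0.86, 0.83)
)

# Model with the lowest AICc
best_aicc <- perf[which.min(perf$AICc), ]

# Model with the highest AUC on the test data
best_auc <- perf[which.max(perf$AUC.Test), ]

print(best_aicc)
print(best_auc)
```

The two criteria need not agree, which is presumably why separate result files are written for the minimum-AICc and the maximum-AUC.Test model.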

VariableSelectionProcess.txt

  • Test: Either 'Contributions' or 'Correlation'. Indicates whether the numbers for each environmental variable refer to model contribution coefficients or to correlation coefficients.
  • Model: The unique model number (the same model number as in ModelPerformance.txt).
  • betamultiplier: The regularization multiplier used to compile the respective model.
  • X: 'X' stands for the name of an environmental variable. The Test row above indicates whether the values in this row refer to the model contribution of this environmental variable or to its coefficient of correlation with another environmental variable. The variable to which it is compared is recognizable by a correlation coefficient of 1. If this environmental variable was excluded from the model, the value in this row is 'NA', which stands for 'Not Available'.

Warning

Depending on the number of environmental variables and the range of betamultipliers you want to test, variable selection can take several hours, so you might want to run the analysis overnight.

Details

For further details on the model selection process and the variable settings, please have a look at the vignette that comes with this package.

References

Akaike H (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control 19(6):716-723.

Fielding AH and Bell JF (1997) A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24(1):38-49.

Examples

# Please find a workflow tutorial in the vignette of this package. It
# will guide you through the settings and usage of the
# 'VariableSelection' function, the core function of this package.
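
For orientation, a call might be set up as in the following sketch. All file paths are hypothetical placeholders, the threshold and multiplier values are arbitrary illustrations rather than recommendations, and the content of `additionalargs` is an assumption about which Maxent command-line flags you may want to pass. The call is guarded so that the snippet does nothing unless the package is installed and the maxent.jar file exists at the given path.

```r
# Hypothetical paths; replace them with your own
maxent <- "path/to/maxent.jar"
outdir <- "path/to/output_directory"
gridfolder <- "path/to/asc_grids"
occurrencelocations <- "path/to/occurrences.csv"
backgroundlocations <- "path/to/background.csv"

# Assumed example of extra Maxent flags (here: restrict the feature types)
additionalargs <- "nolinear noquadratic noproduct nothreshold noautofeature"

# Illustrative settings, not recommendations
contributionthreshold <- 5           # between 0 and 100
correlationthreshold <- 0.9          # between 0 and 1
betamultiplier <- seq(2, 6, by = 0.5)  # positive regularization multipliers

# Run only if the package and the Maxent jar file are actually available
if (requireNamespace("MaxentVariableSelection", quietly = TRUE) &&
    file.exists(maxent)) {
  MaxentVariableSelection::VariableSelection(
    maxent = maxent,
    outdir = outdir,
    gridfolder = gridfolder,
    occurrencelocations = occurrencelocations,
    backgroundlocations = backgroundlocations,
    additionalargs = additionalargs,
    contributionthreshold = contributionthreshold,
    correlationthreshold = correlationthreshold,
    betamultiplier = betamultiplier
  )
}
```

Remember that outdir is cleaned by the function, so point it at a directory that holds nothing you want to keep.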
