⚠️ There's a newer version (2024.10.01) of this package.

gbm.auto

Automatically runs numerous processes from R packages ‘gbm’ and ‘dismo’ and script ‘gbm.utils.R’ which contains Elith et al.’s functions: roc, calibration, and gbm.predict.grids, as well as running my packages gbm.bfcheck, gbm.basemap, gbm.map, gbm.rsb, gbm.cons, gbm.valuemap, and gbm.loop.

Especially on Linux systems, it is recommended to type in a terminal:

sudo apt install libgeos-dev
sudo apt install libproj-dev
sudo apt install libgdal-dev

then manually install rgeos and rgdal in R/RStudio.

Also see each script’s Details section in the manual pages, as these frequently contain tips or common bugfixes.

I strongly recommend that you download papers 1 to 5 (or just the doctoral thesis) from http://www.simondedman.com, with emphasis on P4 (the guide) and P1 (statistical background). Elith et al. 2008 (https://www.doi.org/10.1111/j.1365-2656.2008.01390.x) is also strongly recommended. It is also imperative that you read the R help file for each function before you use it. In RStudio: open the Packages tab, scroll to gbm.auto, click its name, then click the function to see its man (manual) page. Read the whole thing. Function man pages can also be accessed from the console by typing

?function

Just because you CAN try every conceivable combination of tc, lr and bf all at once doesn’t mean you should. Try a range of lr in shrinking orders of magnitude from 0.1 to 0.000001 and find the best, THEN try tc = c(2, n.expvars) and find the best, THEN try bf = c(0.5, 0.75, 0.9), followed by values in between if either 0.75 or 0.9 outperforms 0.5.
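The staged search described above can be sketched in R as follows. This is only an illustration of how the candidate vectors might be built and how many parameter combinations each approach implies. The column indices in expvar and the resvar value are hypothetical, and the commented-out call assumes the argument names lr, tc, bf, grids, samples, expvar and resvar; check ?gbm.auto before relying on them.

```r
# Hypothetical column indices of the explanatory variables in 'samples'.
expvar <- c(4:8, 10)

# Stage 1: learning rates in shrinking orders of magnitude, 0.1 to 0.000001.
lr_candidates <- 10^(-(1:6))

# Stage 2: tree complexity of 2 and of the number of explanatory variables.
tc_candidates <- c(2, length(expvar))

# Stage 3: bag fractions; try values in between only if 0.75 or 0.9 beats 0.5.
bf_candidates <- c(0.5, 0.75, 0.9)

# Parameter combinations tried when everything is crossed at once ...
n_all_at_once <- length(lr_candidates) * length(tc_candidates) *
  length(bf_candidates)
# ... versus tuning one parameter per stage and carrying the best forward.
n_staged <- length(lr_candidates) + length(tc_candidates) +
  length(bf_candidates)

# Each stage would then be one gbm.auto run, for example (not run here;
# resvar = 11 is a hypothetical response column):
# gbm.auto(grids = grids, samples = samples, expvar = expvar, resvar = 11,
#          lr = lr_candidates)
```

Tuning in stages keeps the number of combinations roughly additive rather than multiplicative, which matters because every combination means fitting full boosted regression tree models.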


gbm.auto

Automated Boosted Regression Tree modelling and mapping suite

Automates delta log-normal boosted regression tree abundance prediction. Loops through all permutations of the parameters provided (learning rate, tree complexity, bag fraction), chooses the best, then simplifies it. Generates line, dot and bar plots, and outputs these along with the predictions and a report of all variables used, statistics for tests, variable interactions, predictors used and dropped, etc. If selected, generates predicted abundance maps and Unrepresentativeness surfaces.


gbm.bfcheck

Calculates minimum Bag Fraction size for gbm.auto

Provides minimum bag fractions for gbm.auto, preventing failures when the bag fraction is too small for the number of rows in samples.


gbm.basemap

Creates Basemaps for Gbm.auto mapping from your data range

Downloads, unzips, crops and saves NOAA’s global coastline shapefiles to a user-set bounding box. Use the result for ‘shape’ in gbm.map. If downloading in RStudio, uncheck “Use secure download method for HTTP” in Tools > Global Options > Packages.


gbm.map

Maps of predicted abundance from Boosted Regression Tree modelling

Generates maps from the outputs of gbm.step then gbm.predict.grids, handled automatically within gbm.auto but can be run alone, and generates representativeness surfaces from the output of gbm.rsb.


gbm.rsb

Representativeness Surface Builder

Loops through explanatory variables comparing their histogram in ‘samples’ to their histogram in ‘grids’ to see how well the explanatory variable range in samples represents the range being predicted to in grids. Assigns a representativeness score per variable per site in grids, and takes the average score per site if there’s more than 1 expvar. Saves this to a CSV; it’s plotted by gbm.map if called in gbm.auto. This shows you which areas have the most and least representative coverage by samples, therefore where you can have the most/least confidence in the predictions from gbm.predict.grids. Can be called directly, and choosing a subset of expvars allows one to see their individual / collective representativeness.
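The histogram comparison can be illustrated with a minimal R sketch. This is not gbm.rsb's actual code: the data are simulated, the helper score_one is hypothetical, and the scoring rule (share of grid sites in a bin minus share of samples in that bin, so that higher values mean poorer coverage) is an assumption chosen to show the idea.

```r
set.seed(1)
# Simulated data: samples cover a narrower range than the prediction grids.
samples <- data.frame(depth = runif(200, 10, 60), temp = runif(200, 8, 14))
grids   <- data.frame(depth = runif(500, 0, 100), temp = runif(500, 6, 16))

# Hypothetical helper: score one explanatory variable for every grid site.
score_one <- function(samples_x, grids_x) {
  # Shared breaks so both histograms use identical bins.
  breaks <- pretty(range(c(samples_x, grids_x)), n = 10)
  p_samples <- hist(samples_x, breaks = breaks, plot = FALSE)$counts /
    length(samples_x)
  p_grids <- hist(grids_x, breaks = breaks, plot = FALSE)$counts /
    length(grids_x)
  # Positive where grids hold a larger share of sites than samples do.
  bin_diff <- p_grids - p_samples
  # Look up each grid site's bin (same closed/open rules as hist's default).
  site_bin <- cut(grids_x, breaks = breaks, include.lowest = TRUE,
                  labels = FALSE)
  bin_diff[site_bin]
}

# One score per variable per site, then the average across variables.
scores <- data.frame(depth = score_one(samples$depth, grids$depth),
                     temp  = score_one(samples$temp, grids$temp))
site_score <- rowMeans(scores)
```

Passing only some columns to the scoring step would give the individual or collective score for that subset of variables.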


gbm.cons

Conservation Area Mapping

Runs gbm.auto for multiple subsets of the same overall dataset and scales the combined results, leading to maps which highlight areas of high conservation importance for multiple species in the same study area e.g. using juvenile and adult female subsets to locate candidate nursery grounds and spawning areas respectively.


gbm.valuemap

Decision Support Tool that generates (Marine) Protected Area options using species predicted abundance maps

Scales response variable data, maps a user-defined explanatory variable to be avoided (e.g. fishing effort), and combines them into a map showing areas to preferentially close. Bpa, the precautionary biomass required to protect the spawning stock, is used to calculate MPA size. The MPA is then grown to add subsequent species, starting from the most conservationally at-risk species, resulting in one MPA map per species and a multicolour MPA map of all. In the map legend, all maps list the percentage of the avoid-variable’s total that is overlapped by the MPA.


gbm.loop

Calculate Coefficient Of Variation surfaces for gbm.auto predictions

Processes a user-specified number of loops through the same gbm.auto parameter combinations and calculates the Coefficient Of Variation in the predicted abundance scores for each site aka cell. This can be mapped to spatially demonstrate the output variance range.
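As a rough illustration of the per-cell calculation, the coefficient of variation is the standard deviation of a cell's predictions across loops divided by their mean. The sketch below uses simulated predictions rather than gbm.loop output, and the object names are hypothetical.

```r
set.seed(42)
n_cells <- 5   # sites, aka cells
n_loops <- 10  # repeated runs of the same parameter combination

# Simulated predicted abundances: one row per cell, one column per loop.
preds <- matrix(rgamma(n_cells * n_loops, shape = 20, rate = 10),
                nrow = n_cells, ncol = n_loops)

# Coefficient of variation per cell: sd across loops divided by the mean.
cell_mean <- rowMeans(preds)
cell_sd   <- apply(preds, 1, sd)
cv        <- cell_sd / cell_mean
```

Cells with a high cv are those where repeated runs disagree most, so mapping cv shows where the predictions are least stable.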


gbm.factorplot

ggplot-based update to PDP for factorial/categorical/character variables, allows changing order of categorical variables, and changing angle of x-axis labels to avoid them being cut off.


lmplot

Linear plot of two variables.


gbm.lmplots

Loops through lmplots for all expvars (x) against the same resvar (y).


roc & calibration

Internal functions authored by Elith & Leathwick, used by gbm.auto.R


gbm.step.sd

Local copy of dismo’s gbm.step, with added functions to generate model evaluation metrics such as root mean squared error and amount of deviance explained relative to null.
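The two metrics named above can be sketched on toy numbers. This is an illustration only, not the code in gbm.step.sd: it assumes a Gaussian response, where deviance reduces to the residual sum of squares, and the observed and predicted values are invented.

```r
# Invented observed and predicted values for five sites.
obs  <- c(0.5, 1.2, 2.0, 3.1, 4.4)
pred <- c(0.7, 1.0, 2.3, 2.9, 4.0)

# Root mean squared error of the predictions.
rmse <- sqrt(mean((obs - pred)^2))

# Gaussian deviance of the null model (intercept only, i.e. the mean).
null_dev <- sum((obs - mean(obs))^2)
# Gaussian deviance of the fitted model.
resid_dev <- sum((obs - pred)^2)

# Proportion of the null deviance explained by the model.
dev_explained <- (null_dev - resid_dev) / null_dev
```

For other families (e.g. Bernoulli for the binary half of a delta model) the deviance formula differs, but the ratio against the null model is formed the same way.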


Installation

You can install the released version of gbm.auto from CRAN with:

install.packages("gbm.auto")

And the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("SimonDedman/gbm.auto")

Example

(See each function’s help file for specific examples, and the documents listed above)


ToDo List

See the GitHub issues section: https://github.com/SimonDedman/gbm.auto/issues. Feel free to contribute!

Monthly Downloads

244

Version

2023.06.13

License

MIT + file LICENSE

Maintainer

Simon Dedman

Last Published

June 14th, 2023

Functions in gbm.auto (2023.06.13)

gbm.cons

Conservation Area Mapping

gbm.basemap

Creates Basemaps for Gbm.auto mapping from your data range

AllPreds_E

Data: Predicted abundances of 4 ray species generated using gbm.auto

gbm.subset

Subset gbm.auto input datasets to 2 groups using the partial deviance plots

gbm.map

Maps of predicted abundance from Boosted Regression Tree modelling

gbm.loop

Calculate Coefficient Of Variation surfaces for gbm.auto predictions

lmplot

Plot linear model for two variables with R2 & P printed and saved

gbm.valuemap

Decision Support Tool that generates (Marine) Protected Area options using species predicted abundance maps

gbm.step.sd

Function to assess optimal no of boosting trees using k-fold cross validation

gbm.lmplots

Plot linear models for all expvar against the resvar

gbm.rsb

Representativeness Surface Builder

roc

roc

grids

Data: Explanatory variables for rays in the Irish Sea

samples

Data: Numbers of 4 ray species caught in 2137 Irish Sea trawls, 1994 to 2014

breaks.grid

Defines breakpoints for draw.grid and legend.grid; mapplots fork

gbm.auto

Automated Boosted Regression Tree modelling and mapping suite

Juveniles

Data: Explanatory and response variables for 4 juvenile rays in the Irish Sea

gbm.bfcheck

Calculates minimum Bag Fraction size for gbm.auto

calibration

calibration

AllScaledData

Data: Scaled abundance data for 2 subsets of 4 rays in the Irish Sea, by gbm.cons

Adult_Females

Data: Numbers of 4 adult female rays caught in 2137 Irish Sea trawls, 1994 to 2014