
CAST (version 0.4.0)

aoa: Area of Applicability

Description

This function estimates the Area of Applicability Index (AOAI) and the derived Area of Applicability (AOA) of spatial prediction models. It does so based on the distance, in the predictor variable space, between new data (i.e. a RasterStack of the spatial predictors used in the model) and the data used for model training. Ideally, predictors are weighted by the internal variable importance of the machine learning algorithm used for model training.

Usage

aoa(train, predictors, weight = NA, model = NA, variables = "all",
  clstr = NULL, cl = NULL)

Arguments

train

A data.frame containing the data used for model training.

predictors

A RasterStack, RasterBrick or data.frame containing the data the model was meant to make predictions for.

weight

A data.frame containing weights for each variable. Only required if no model is given.

model

A train object created with caret, from which the weights are extracted (based on variable importance).

variables

Character vector of predictor variables. If "all", all variables of the train dataset are used. See varImp(model) for the variables known to the model.

clstr

Numeric or character. Spatial cluster affiliation for each data point. Should be used if replicates are present.

cl

Cluster object created with parallel::makeCluster. Used to run the computation in parallel.
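As a rough illustration of the weight and cl arguments, the sketch below builds a weight table and a small cluster. The one-row layout with one column per predictor name is an assumption about what aoa expects, and the weight values are made up; the aoa call itself is left as a comment because it needs the objects prepared in the Examples.

```r
library(parallel)

# Hypothetical weights (assumed layout: one row, one column per predictor).
weight <- data.frame(DEM = 2, Easting = 1, Northing = 1)

# Cluster object to hand to the cl argument; 2 workers is an arbitrary choice.
cl <- makeCluster(2)

# With CAST loaded and trainDat/studyArea prepared as in the Examples:
# AOA <- aoa(trainDat, studyArea, weight = weight,
#            variables = names(weight), cl = cl)

# Always release the workers when done.
stopCluster(cl)
```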

Value

A RasterStack or data.frame with the AOAI and AOA

Details

The Area of Applicability Index (AOAI) and the corresponding Area of Applicability (AOA) are calculated. Interpretation of results: a location that is very similar to the training data has a low distance in the predictor variable space (AOAI towards 0), while a location that differs strongly from the training data has a low AOAI. The AOAI is returned as the inverse distance scaled by the average of the mean distances between training data points, so the greater the distance in this predictor space, the lower the AOAI. To obtain the AOA, a threshold is applied to the AOAI; it is based on the mean + sd of the minimum distances between training data points. See Meyer et al. (submitted) for the full documentation of the methodology.
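The calculation described above can be approximated in base R. This is a minimal sketch and not the CAST implementation: the function name aoa_sketch is hypothetical, predictors are assumed to be standardised with the training mean and sd, distances are assumed to be Euclidean, and "inverse distance" is read here as the negative scaled distance so that similar locations sit at 0 and dissimilar ones get lower values.

```r
# Illustrative sketch only; names and scaling choices are assumptions.
aoa_sketch <- function(train, newdata, weight = NULL) {
  train <- as.matrix(train)
  newdata <- as.matrix(newdata)

  # Standardise both data sets with the training mean and sd.
  mu <- colMeans(train)
  sdev <- apply(train, 2, sd)
  train_s <- sweep(sweep(train, 2, mu, "-"), 2, sdev, "/")
  new_s <- sweep(sweep(newdata, 2, mu, "-"), 2, sdev, "/")

  # Optionally stretch each predictor axis by its weight.
  if (!is.null(weight)) {
    train_s <- sweep(train_s, 2, weight, "*")
    new_s <- sweep(new_s, 2, weight, "*")
  }

  # Pairwise distances among training points (self-distances ignored).
  d_train <- as.matrix(dist(train_s))
  diag(d_train) <- NA
  mean_dist <- mean(d_train, na.rm = TRUE)

  # Threshold from mean + sd of nearest-neighbour distances in training data.
  min_train <- apply(d_train, 1, min, na.rm = TRUE)
  threshold <- -(mean(min_train) + sd(min_train)) / mean_dist

  # Distance of each new point to its nearest training point.
  min_new <- apply(new_s, 1, function(x) {
    min(sqrt(colSums((t(train_s) - x)^2)))
  })

  aoai <- -min_new / mean_dist
  data.frame(AOAI = aoai, AOA = as.integer(aoai >= threshold))
}

# Usage: one point copied from the training data, one far outside it.
set.seed(1)
train <- data.frame(a = rnorm(50), b = rnorm(50))
new <- rbind(train[1, ], data.frame(a = 10, b = 10))
res <- aoa_sketch(train, new)
print(res)
```

In this toy case the copied point has zero distance to the training data and falls inside the AOA, while the distant point gets a much lower index and falls outside it.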

References

Meyer, H., Pebesma, E. (submitted): Predicting into unknown space? Estimating the area of applicability of spatial prediction models.

Examples

# NOT RUN {
library(sf)
library(raster)
library(caret)
library(viridis)
library(latticeExtra)

# prepare sample data:
dat <- get(load(system.file("extdata","Cookfarm.RData",package="CAST")))
dat <- aggregate(dat[,c("VW","Easting","Northing")],by=list(as.character(dat$SOURCEID)),mean)
pts <- st_as_sf(dat,coords=c("Easting","Northing"))
pts$ID <- 1:nrow(pts)
studyArea <- stack(system.file("extdata","predictors_2012-03-25.grd",package="CAST"))[[1:8]]
trainDat <- extract(studyArea,pts,df=TRUE)
trainDat <- merge(trainDat,pts,by.x="ID",by.y="ID")

# visualize data spatially:
spplot(scale(studyArea))
plot(studyArea$DEM)
plot(pts[,1],add=TRUE,col="black")

# first calculate the AOAI based on a set of variables with equal weights:
variables <- c("DEM","Easting","Northing")
AOA <- aoa(trainDat,studyArea,variables=variables)
spplot(AOA$AOAI, col.regions=viridis(100),main="Applicability Index")
spplot(AOA$AOA,col.regions=c("grey","transparent"),main="Area of Applicability")

# or weight variables based on variable importance from a trained model:
set.seed(100)
model <- train(trainDat[,which(names(trainDat)%in%variables)],
  trainDat$VW,method="rf",importance=TRUE,tuneLength=1)
print(model) #note that this is a quite poor prediction model
prediction <- predict(studyArea,model)
plot(varImp(model,scale=FALSE))
#
AOA <- aoa(trainDat,studyArea,model=model,variables=variables)
spplot(AOA$AOAI, col.regions=viridis(100),main="Applicability Index")
#plot predictions for the AOA only:
spplot(prediction, col.regions=viridis(100),main="prediction for the AOA")+
spplot(AOA$AOA,col.regions=c("grey","transparent"))
# }
