sar_threshold: Fit threshold SAR models

Description

Fit up to six piecewise (threshold) regression models to SAR data.

Usage

sar_threshold(data, mod = "All", interval = NULL, nisl = NULL,
  non_th_models = TRUE, logAxes = "area", con = 1, logT = log, parallel =
  FALSE, cores = NULL)

Value

A list of class "threshold" and "sars" with five elements. The first element contains the different model fits (lm objects). The second element contains the names of the fitted models, the third contains the threshold values, the fourth element the dataset (i.e. a dataframe with area and richness values), and the fifth contains details of any axes log-transformations undertaken. summary.sars provides a more user-friendly ouput (including a model summary table) and

plot.threshold plots the model fits.

Arguments

data: A dataset in the form of a dataframe with at least two columns: the first with island/site areas, and the second with the species richness of each island/site.
mod: A vector of model names: an individual model, a set of models, or all models. Can be any of 'All' (fit all models), 'ContOne' (continuous one-threshold), 'ZslopeOne' (left-horizontal one-threshold), 'DiscOne' (discontinuous one-threshold), 'ContTwo' (continuous two-threshold), 'ZslopeTwo' (left-horizontal two-threshold), or 'DiscTwo' (discontinuous two-threshold).
interval: The amount to increment the threshold value by in the iterative model fitting process (not applicable for the discontinuous models). The default for non-transformed area reverts to 1, while for log-transformed area it is 0.01. However, these values may not be suitable depending on the range of area values in a dataset, and thus users are advised to manually set this argument.
nisl: Set the minimum number of islands to be contained within each of the two segments (in the case of one-threshold models), or the first and last segments (in the case of two-threshold models). It needs to be less than than half of the total number of islands in the dataset. Default = NULL.
non_th_models: Logical argument (default = TRUE) of whether two non-threshold models (i.e. a simple linear regression: y ~ x; and an intercept only model: y ~ 1) should also be fitted.
logAxes: What log-transformation (if any) should be applied to the area and richness values. Should be one of "none" (no transformation), "area" (only area is log-transformed; default) or "both" (both area and richness log-transformed).
con: The constant to add to the species richness values in cases where one of the islands has zero species.
logT: The log-transformation to apply to the area and richness values. Can be any of log(default), log2 or log10.
parallel: Logical argument for whether parallel processing should be used. Only applicable when the continuous two-threshold and left-horizontal two-threshold models are being fitted.
cores: Number of cores to use. Only applicable when parallel = TRUE.

Author

Francois Rigal and Thomas J. Matthews

Details

This function is described in more detail in the accompanying paper (Matthews & Rigal, 2020).

Fitting the continuous and left-horizontal piecewise models (particularly the two-threshold models) can be time consuming if the range in area is large and/or the interval argument is small. For the two-threshold continuous slope and left-horizontal models, the use of parallel processing (using the parallel argument) is recommended. The number of cores (cores) must be provided.

Note that the interval argument is not used to fit discontinuous models, as, in these cases, the breakpoint must be at a datapoint.

There has been considerable debate regarding the number of parameters that are included in different piecewise models. Here (and thus in our calculation of AIC, AICc, BIC etc) we consider ContOne to have five parameters, ZslopeOne - 4, DiscOne - 6, ContTwo - 7, ZslopeTwo - 6, DiscTwo - 9. The standard linear model and the intercept model are considered to have 3 and 2 parameters, respectively. The raw lm model fits are provided in the output, which are also of class thresholdInt; this class has a method for the generic function logLik which adjusts the degrees of freedom associated with the log likelihood to match the number of parameters listed above.

The raw lm model fits can also be used to explore classic diagnostic plots for linear regression analysis in R using the function plot or other diagnostic tests such outlierTest, leveragePlots or influencePlot, available in the car package. This is advised as currently there are no model validation checks undertaken automatically, unlike elsewhere in the sars package.

Confidence intervals around the breakpoints in the one-threshold continuous and left- horizontal models can be calculated using the threshold_ci function. The intercepts and slopes of the different segments in the fitted breakpoint models can be calculated using the get_coef function.

Rarely, multiple breakpoint values can return the same minimum rss (for a given model fit). In these cases, we just randomly choose and return one and also produce a warning. If this occurs it is worth checking the data and model fits carefully.

The nisl argument can be useful to avoid situations where a segment contains only one island, for example. However, setting strict criteria on the number of data points to be included in segments could be seen as "forcing" the fit of the model, and arguably if a model fit is not interpretable, it is simply that the model does not provide a good representation of the data. Thus, it should not be used without careful thought.

References

Lomolino, M.V. & Weiser, M.D. (2001) Towards a more general species-area relationship: diversity on all islands, great and small. Journal of Biogeography, 28, 431-445.

Gao, D., Cao, Z., Xu, P. & Perry, G. (2019) On piecewise models and species-area patterns. Ecology and Evolution, 9, 8351-8361.

Matthews, T.J. et al. (2020) Unravelling the small-island effect through phylogenetic community ecology. Journal of Biogeography.

Matthews, T.J. & Rigal, F. (2021) Thresholds and the species–area relationship: a set of functions for fitting, evaluating and plotting a range of commonly used piecewise models. Frontiers of Biogeography, 13, e49404.

Examples

Run this code

data(aegean2)
a2 <- aegean2[1:168,]
fitT <- sar_threshold(data = a2, mod = c("ContOne", "DiscOne"), 
interval = 0.1, non_th_models = TRUE, logAxes = "area", logT = log10)
summary(fitT)
plot(fitT)
#diagnostic plots for the ContOne model
par(mfrow=c(2, 2))
plot(fitT[[1]][[1]])
#Plot multiple model fits in the same plot, with different colour for each 
#model fit
plot(fitT, multPlot = FALSE, lcol = c("yellow", "red", "green", "purple"))
#Making a legend. First extract the model order:
fitT[[2]]
#Then use the legend function - note you may need to play around with the 
#legend location depending on your data.
legend("topleft", legend=c("ContOne", "DiscOne","Linear", "Intercept"), fill =
c("yellow", "red", "green", "purple"))

Run the code above in your browser using DataLab