
growthTrendR (version 0.2.2)

bam_spatial: Spatial growth model for large datasets or broad geographical coverage

Description

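To address the computational limitations of GAMMs on large datasets, this function offers a hybrid solution that builds on the efficiency of the mgcv::bam() function while still accounting for the random effects, temporal autocorrelation and spatial effects described under Details.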

Usage

bam_spatial(data, resp_scale = "resp_gamma", m.candidates)

Value

A list containing the fitted model, fitting statistics, ptable, stable and a prediction table.

Arguments

data

Data set containing all columns needed to fit the model.

resp_scale

Character. Specifies how the response variable is treated in the model.

  • "resp_gaussian": Uses the response on its original scale, assuming a Gaussian distribution with an identity link (no transformation applied).

  • "resp_log": Log-transforms the response before modelling. The transformed response is then assumed to follow a Gaussian distribution with an identity link.

  • "resp_gamma": Keeps the response on its original scale, fitted under a Gamma distribution with a log link. Suitable for strictly positive and right-skewed data.
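The three options can be pictured as three ways of calling mgcv::bam() directly. The following is a minimal sketch on simulated data, not the package's internal code; the data frame `d` and its columns `age` and `bai` are hypothetical names introduced for illustration.

```r
library(mgcv)

set.seed(1)
n <- 500
# Simulated, strictly positive and right-skewed response (illustration only)
d <- data.frame(age = runif(n, 1, 100))
d$bai <- rgamma(n, shape = 5, rate = 5 / exp(1 + 0.01 * d$age))

# Analogue of "resp_gaussian": raw response, Gaussian family, identity link
m_gaussian <- bam(bai ~ s(age), data = d, family = gaussian())

# Analogue of "resp_log": log-transformed response, Gaussian family, identity link
m_log <- bam(log(bai) ~ s(age), data = d, family = gaussian())

# Analogue of "resp_gamma": raw response, Gamma family, log link
m_gamma <- bam(bai ~ s(age), data = d, family = Gamma(link = "log"))

# Predictions from m_log are on the log scale;
# for m_gamma, type = "response" returns the original scale
pred_gamma <- predict(m_gamma, type = "response")
```

Note that the log-transformed fit and the Gamma fit are not directly comparable by likelihood-based criteria, because their responses are on different scales.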

m.candidates

Candidate model equations, supplied as character strings (see Examples).

Details

This function accounts for within-site variability and temporal autocorrelation by including series identity as a random effect and a first-order autoregressive (AR1) correlation structure, respectively. Among-site variability and spatial effects are captured by including site identity as a random effect. If significant spatial autocorrelation persists, the model is automatically refitted with an added smooth term for latitude and longitude using the thin plate ("tp") basis. “Normalized” residuals are provided for further analysis.
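The model structure described above can be approximated with plain mgcv::bam() calls. This is a hedged sketch on simulated data, not the package's implementation: the column names (`site`, `series`, `lat`, `lon`, `ar_start`), the lag-1 heuristic used to pick `rho`, and the unconditional spatial refit are all assumptions made for illustration. The test that bam_spatial() uses to detect residual spatial autocorrelation is not reproduced here.

```r
library(mgcv)

set.seed(42)
# Simulated data: 20 sites, 3 series per site, 30 years per series
sites <- data.frame(site = factor(1:20),
                    lat = runif(20, 45, 55),
                    lon = runif(20, -80, -60))
d <- merge(expand.grid(site = sites$site, tree = 1:3, year = 1:30), sites)
d$series <- interaction(d$site, d$tree, drop = TRUE)

# AR1 in bam() follows row order, so sort by series and year
d <- d[order(d$series, d$year), ]
d$ar_start <- d$year == 1   # TRUE at the first observation of each series

# AR1 errors generated independently for each series
e <- unlist(lapply(seq_len(nlevels(d$series)),
                   function(i) as.numeric(arima.sim(list(ar = 0.5), n = 30))))
d$y <- 0.05 * d$lat + e

# Step 1: fit without AR1 and take the lag-1 residual autocorrelation as rho
# (a common heuristic, assumed here)
m_init <- bam(y ~ s(year) + s(site, bs = "re") + s(series, bs = "re"), data = d)
rho <- acf(resid(m_init), plot = FALSE)$acf[2]

# Step 2: refit with the AR1 error structure
m_ar1 <- bam(y ~ s(year) + s(site, bs = "re") + s(series, bs = "re"),
             data = d, rho = rho, AR.start = d$ar_start)

# AR1-standardized residuals stored by bam() when rho is non-zero
res_norm <- m_ar1$std.rsd

# Step 3: spatial refit with a thin plate smooth of longitude and latitude
# (k is kept below the number of unique site locations)
m_sp <- bam(y ~ s(year) + s(lon, lat, bs = "tp", k = 10) +
              s(site, bs = "re") + s(series, bs = "re"),
            data = d, rho = rho, AR.start = d$ar_start)
```

In mgcv, the `rho` argument of bam() applies to Gaussian models with an identity link unless `discrete = TRUE` is used, so the sketch sticks to the Gaussian case.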

This function supports parallel computation for large-scale, geographically distributed datasets.

If users specify multiple candidate models through the m.candidates argument, the function fits each candidate using the maximum likelihood (ML) method and compares their corrected Akaike Information Criterion (AICc) values to determine the best-fitting model. The selected model is then refitted using the restricted maximum likelihood (REML) method, and the results of that fit are returned.

If users specify only one candidate model through the m.candidates argument, the model is fitted directly with the REML method.
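The selection procedure can be sketched with mgcv::bam() alone. This is an illustrative approximation on simulated data, not the code used inside bam_spatial(): the helper `aicc()` is hypothetical, and its use of the effective degrees of freedom reported by logLik() as the parameter count is an assumption.

```r
library(mgcv)

set.seed(7)
n <- 400
# Simulated data (illustration only)
d <- data.frame(age = runif(n, 1, 100), ffd = runif(n, 100, 200))
d$y <- sin(d$age / 15) + 0.01 * d$ffd + rnorm(n, sd = 0.3)

candidates <- c("y ~ s(age) + s(ffd)",
                "y ~ s(age) + ffd",
                "y ~ s(age)")

# Hypothetical helper: small-sample correction added to AIC,
# AICc = AIC + 2k(k + 1) / (n - k - 1)
aicc <- function(m) {
  k <- attr(logLik(m), "df")   # effective number of parameters
  n_obs <- nrow(m$model)
  AIC(m) + 2 * k * (k + 1) / (n_obs - k - 1)
}

# Fit every candidate with ML so that models with different
# fixed effects are comparable
fits_ml <- lapply(candidates,
                  function(f) bam(as.formula(f), data = d, method = "ML"))
scores <- vapply(fits_ml, aicc, numeric(1))

# Refit the candidate with the lowest AICc using REML
best <- candidates[which.min(scores)]
m_best <- bam(as.formula(best), data = d, method = "REML")
```

ML is used for the comparison because REML criteria are not comparable across models whose fixed-effect structures differ. REML is then used for the final fit.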

Examples

# \donttest{
library(growthTrendR)

# Load the processed sample data shipped with the package
dt.samples_trt <- readRDS(system.file("extdata", "dt.samples_trt.rds", package = "growthTrendR"))
# Load the climate data
dt.clim <- data.table::fread(system.file("extdata", "dt.clim.csv", package = "growthTrendR"))
# Combine samples and climate data into the modelling data set
dt.samples_clim <- prepare_samples_clim(dt.samples_trt, dt.clim)
# Keep observations with ageC > 1
dt.m <- dt.samples_clim[ageC > 1]
# Fit bam_spatial with three candidate models and a log-transformed response
m.sp <- bam_spatial(
  data = dt.m, resp_scale = "resp_log",
  m.candidates = c(
    "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + s(FFD)",
    "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC) + FFD",
    "bai_cm2 ~ log(ba_cm2_t_1) + s(ageC)"
  )
)
# }

