spaMM_boot: Parametric bootstrap

Description

This function simulates samples from a fit object inheriting from class "HLfit", as produced by spaMM's fitting functions, and applies a given function to each simulated sample. Parallelization is supported (see Details). A typical use of the parametric bootstrap is to fit one model to samples simulated under another model (see Examples). spaMM_boot provides more control over what is computed on each bootstrap replicate than the bootstrap procedures built into the functions for likelihood ratio tests.

Usage

spaMM_boot(object, simuland, nsim, nb_cores = NULL, resp_testfn=NULL, 
           control.foreach=list(), ...)

Arguments

object

The fit object to simulate from.

simuland

The function to apply to each simulated sample. See Details for requirements of this function.

nsim

Number of samples to simulate and analyze.

nb_cores

Number of cores to use for parallel computation. The default is spaMM.getOption("nb_cores"), and 1 if the latter is NULL. nb_cores=1 prevents the use of parallelisation procedures.

resp_testfn

Passed to simulate.HLfit; NULL, or a function that tests a condition which simulated samples should satisfy. This function takes a response vector as argument and returns a boolean (TRUE indicating that the sample satisfies the condition).

control.foreach

list of control arguments for foreach. These include in particular .combine (with default value "rbind"), and .errorhandling (with default value "remove", but "pass" is quite useful for debugging).

...

Further arguments passed to the simuland function.
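
As a minimal sketch of the resp_testfn and control.foreach arguments described above, the following base R code defines a condition function and a control list. The names not_constant and debug_ctrl are hypothetical and chosen only for illustration; the condition itself (rejecting constant responses) is an assumed use case, not one prescribed by spaMM.

```r
# Hypothetical condition: keep only simulated samples in which the response
# is not constant (e.g. a binary response that shows both outcomes).
not_constant <- function(y) {
  length(unique(y)) > 1L   # TRUE means the sample satisfies the condition
}

not_constant(c(0, 1, 1, 0))  # TRUE
not_constant(c(1, 1, 1, 1))  # FALSE

# Control list that keeps the default combining function but passes errors
# through instead of removing the failed replicates (useful for debugging).
debug_ctrl <- list(.combine = "rbind", .errorhandling = "pass")
```

Such objects would then be supplied as resp_testfn = not_constant and control.foreach = debug_ctrl in the spaMM_boot call.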

Value

A list with two elements:

bootreps, the nsim return values in the format returned either by apply or parallel::parApply, or by foreach::`%dopar%` as controlled by control.foreach$.combine. If simuland returns a vector, spaMM_boot should effectively rbind the results by default, returning an nsim-row matrix in all cases. From spaMM 2.5.6, if simuland returns a 1-row data frame, spaMM_boot rbinds the results into an nsim-row data frame in all cases. In other cases the results may not be consistent among parallel backends, and may change in later versions, so users should stick to one of these two cases as much as possible.
RNGstate, the state of .Random.seed at the beginning of the simulation.
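
The two recommended return formats can be illustrated in base R, without spaMM. This is only a sketch of the row-binding behaviour described above, not spaMM's internal code; the lists vec_results and df_results are hypothetical stand-ins for the values a simuland function would return on each replicate.

```r
nsim <- 5L

# Stand-ins for per-replicate return values of a simuland function
vec_results <- lapply(seq_len(nsim),
                      function(i) c(intercept = i, slope = 2 * i))
df_results  <- lapply(seq_len(nsim),
                      function(i) data.frame(intercept = i, slope = 2 * i))

m  <- do.call(rbind, vec_results)  # numeric matrix, one row per replicate
df <- do.call(rbind, df_results)   # data frame, one row per replicate

dim(m)   # 5 2
dim(df)  # 5 2
```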

Details

The simuland function must take a vector of response values as its first argument, and must have a ... argument. spaMM_boot calls simulate.HLfit on the fit object and applies simuland to each column of the matrix returned by this call.
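
The contract stated above can be mimicked in base R. In this sketch, which is not spaMM's actual implementation, the matrix sims is a hypothetical stand-in for the output of simulate.HLfit (one simulated response per column), and my_simuland and its scale argument are illustrative names.

```r
set.seed(123)
n <- 20L     # number of response values per sample
nsim <- 4L   # number of simulated samples

# Stand-in for the matrix of simulated responses: one column per sample
sims <- matrix(rnorm(n * nsim), nrow = n, ncol = nsim)

# A simuland-style function: response vector first, plus a '...' argument
my_simuland <- function(y, scale = 1, ...) {
  c(mean = mean(y), sd = sd(y)) * scale
}

# Apply the function to each column, then arrange one replicate per row
bootreps <- t(apply(sims, 2, my_simuland, scale = 10))
dim(bootreps)  # 4 2
```

Extra arguments such as scale reach the function through the call, in the same way that further arguments given to spaMM_boot are passed on to simuland.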

This function handles parallel backends with different features. pbapply::pbapply has a very simple interface (essentially equivalent to apply) and provides progress bars, but (currently: version 1.3.4) does not have efficient load-balancing. doSNOW also provides a progress bar and allows more efficient load-balancing, but it requires foreach, whose handling of '...' arguments is tortuous. foreach will be used if doSNOW is loaded; then, some of the '...' arguments may need to be quoted (see Examples). foreach also handles errors differently from pbapply (which will simply stop if fitting a model to a bootstrap replicate fails): see the foreach documentation.

Examples

# NOT RUN {
if (spaMM.getOption("example_maxtime")>10) {
 data("blackcap")
 
 # Generate fits of null and full models:
 lrt <- fixedLRT(null.formula=migStatus ~ 1 + Matern(1|latitude+longitude),
       formula=migStatus ~ means + Matern(1|latitude+longitude), 
       HLmethod='ML',data=blackcap)

 # The 'simuland' argument: 
 myfun <- function(y, what=NULL, lrt, ...) { 
    data <- lrt$fullfit$data
    data$migStatus <- y ## replaces original response (! more complicated for binomial fits)
    full_call <- getCall(lrt$fullfit) ## call for full fit
    full_call$data <- data
    res <- eval(full_call) ## fits the full model on the simulated response
    if (!is.null(what)) res <- eval(what) ## post-process the fit
    return(res) ## the fit, or anything produced by evaluating 'what'
  }
  # where the 'what' argument (not required) of myfun() allows one to control 
  # what the function returns without redefining the function.
  
  # Call myfun() with no 'what' argument: returns a list of fits 
  spaMM_boot(lrt$nullfit, simuland = myfun, nsim=1, lrt=lrt)[["bootreps"]] 
  
  # Return only a model coefficient for each fit: 
  spaMM_boot(lrt$nullfit, simuland = myfun, nsim=7,
               what=quote(fixef(res)[2L]), lrt=lrt)[["bootreps"]]       
}
# }