modelsearch: Develop Best-fit Vital Rate Estimation Models For Matrix Development

Description

modelsearch() returns both a table of vital rate estimating models and a best-fit model for each major vital rate estimated. The final output can be used as input in other functions within this package.

Usage

modelsearch(
  data,
  historical = TRUE,
  approach = "lme4",
  suite = "size",
  vitalrates = c("surv", "size", "fec"),
  juvestimate = NA,
  juvsize = FALSE,
  bestfit = "AICc&k",
  sizedist = "gaussian",
  fecdist = "gaussian",
  fectime = 2,
  censor = NA,
  indiv = "individ",
  patch = NA,
  year = "year2",
  surv = c("alive3", "alive2", "alive1"),
  obs = c("obsstatus3", "obsstatus2", "obsstatus1"),
  size = c("sizea3", "sizea2", "sizea1"),
  repst = c("repstatus3", "repstatus2", "repstatus1"),
  fec = c("feca3", "feca2", "feca1"),
  stage = c("stage3", "stage2", "stage1"),
  age = NA,
  year.as.random = TRUE,
  patch.as.random = TRUE,
  show.model.tables = TRUE,
  quiet = FALSE
)

Arguments

data

The vertical dataset to be used for analysis. This dataset should be of class hfvdata, but can also be a data frame formatted similarly to the output format provided by functions verticalize3() or historicalize3(), as long as all needed variables are properly designated.

historical

A logical variable denoting whether to assess the effects of state in time t-1 in addition to state in time t. Defaults to TRUE.

approach

The statistical approach to be taken for model building. The default is lme4, which fits mixed models using package lme4. Other options include glm, which uses lm, glm, and related functions in base R.

suite

Describes the global model for each vital rate estimation, and takes one of the following values: full, which includes main effects and all two-way interactions of size and reproductive status; main, which includes only the main effects of size and reproductive status; size, which includes only size (plus the interaction between sizes in times t and t-1 in historical models); rep, which includes only reproductive status (plus the interaction between statuses in times t and t-1 in historical models); and cons, in which all vital rates are estimated only as y-intercepts. If approach = "glm" and year.as.random = FALSE, then year is also included as a fixed effect and, in the case of full, in two-way interactions. Defaults to size.
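
As a rough illustration of the suite options, the sketch below writes out the fixed-effect structure each value implies for an ahistorical survival model. It assumes the default variable names (alive3, sizea2, sizea1, repstatus2) and is not the exact set of global models that modelsearch() builds internally; random terms for individual, year, and patch are left out.

```r
# Illustrative only: default lefko3 variable names, survival as the response.
# These are not the exact global models that modelsearch() builds internally.
suite_formulas <- list(
  full = alive3 ~ sizea2 * repstatus2,  # main effects plus two-way interaction
  main = alive3 ~ sizea2 + repstatus2,  # main effects only
  size = alive3 ~ sizea2,               # size only
  rep  = alive3 ~ repstatus2,           # reproductive status only
  cons = alive3 ~ 1                     # intercept only
)

# With historical = TRUE and suite = "size", size in time t-1 and its
# interaction with size in time t would also enter the global model.
size_historical <- alive3 ~ sizea2 * sizea1

# List the fixed-effect terms implied by each formula
lapply(suite_formulas, function(f) attr(terms(f), "term.labels"))
attr(terms(size_historical), "term.labels")
```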

vitalrates

A vector describing which vital rates will be estimated via linear modeling, with the following options: surv, survival probability; obs, observation probability; size, overall size; repst, probability of reproducing; and fec, amount of reproduction (overall fecundity). Defaults to c("surv", "size", "fec").

juvestimate

An optional variable denoting the stage name of the juvenile stage in the vertical dataset. If not NA, and stage is also given (see below), then vital rates listed in vitalrates other than fec will also be estimated from the juvenile stage to all adult stages. Defaults to NA, in which case juvenile vital rates are not estimated.

juvsize

A logical variable denoting whether size should be used as a term in models involving transition from the juvenile stage. Defaults to FALSE, and is only used if juvestimate does not equal NA.

bestfit

A variable indicating the model selection criterion for the choice of best-fit model. The default is AICc&k, which chooses the best-fit model as the model with the lowest AICc or, if not the same model, then the model that has the lowest degrees of freedom among models with \(\Delta AICc <= 2.0\). Alternatively, AICc may be chosen, in which case the best-fit model is simply the model with the lowest AICc value.
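
The two criteria can be compared on a small, made-up model table. The helpers pick_aicc() and pick_aicc_k() below are hypothetical and are not part of lefko3; they only restate the rule described above. Breaking ties in degrees of freedom by the lower AICc is an assumption of this sketch.

```r
# Hypothetical helpers illustrating the selection rules; not lefko3 functions.
# 'tab' needs columns AICc and df (number of estimated parameters).
pick_aicc <- function(tab) which.min(tab$AICc)

pick_aicc_k <- function(tab) {
  delta <- tab$AICc - min(tab$AICc)       # delta AICc relative to the best model
  cand <- which(delta <= 2.0)             # models within 2 AICc units
  cand <- cand[order(tab$AICc[cand])]     # assumed tie-break: lower AICc first
  cand[which.min(tab$df[cand])]           # fewest parameters among candidates
}

# Made-up model table for illustration
tab <- data.frame(
  model = c("size + repst", "size", "intercept only"),
  df    = c(5, 4, 3),
  AICc  = c(100.0, 101.5, 103.2),
  stringsAsFactors = FALSE
)

tab$model[pick_aicc(tab)]    # lowest AICc overall
tab$model[pick_aicc_k(tab)]  # simplest model with delta AICc <= 2
```

Here the two rules disagree: the lowest-AICc model has 5 parameters, while a 4-parameter model sits within 2 AICc units of it and is preferred under AICc&k.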

sizedist

The probability distribution used to model size. Options include gaussian for the Normal distribution (default), poisson for the Poisson distribution, and negbin for the negative binomial distribution.

fecdist

The probability distribution used to model fecundity. Options include gaussian for the Normal distribution (default), poisson for the Poisson distribution, and negbin for the negative binomial distribution.
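
The Poisson and negative binomial distributions describe counts, so the response should hold non-negative integers, which is presumably why the example at the end of this page rounds the fecundity columns before modeling. The sketch below shows the same step on simulated data with base R's glm(); the data frame toy and its values are invented for illustration, and modelsearch() itself is not called.

```r
set.seed(42)

# Simulated stand-in for a vertical dataset: size in time t and a
# fecundity measure that is not quite integer-valued
toy <- data.frame(sizea2 = runif(100, 1, 9))
toy$feca2 <- rpois(100, lambda = exp(0.2 * toy$sizea2)) + runif(100, 0, 0.4)

# Round to whole numbers before fitting a count distribution
toy$feca2 <- round(toy$feca2)

# Poisson regression of fecundity on size
fec_fit <- glm(feca2 ~ sizea2, data = toy, family = poisson)
coef(fec_fit)
```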

fectime

A variable indicating which year of fecundity to use as the response term in fecundity models. Options include 2, which refers to time t, and 3, which refers to time t+1. Defaults to 2.

censor

A vector denoting the names of censoring variables in the dataset, in order from time t+1, followed by time t, and lastly followed by time t-1. Defaults to NA.

indiv

A variable indicating the variable name coding individual identity. Defaults to individ.

patch

A variable indicating the variable name coding for patch, where patches are defined as permanent subgroups within the study population. Defaults to NA.

year

A variable indicating the name of the variable coding for observation time in time t. Defaults to year2.

surv

A vector indicating the variable names coding for status as alive or dead in times t+1, t, and t-1, respectively. Defaults to c("alive3", "alive2", "alive1").

obs

A vector indicating the variable names coding for observation status in times t+1, t, and t-1, respectively. Defaults to c("obsstatus3", "obsstatus2", "obsstatus1").

size

A vector indicating the variable names coding for size in times t+1, t, and t-1, respectively. Defaults to c("sizea3", "sizea2", "sizea1").

repst

A vector indicating the variable names coding for reproductive status in times t+1, t, and t-1, respectively. Defaults to c("repstatus3", "repstatus2", "repstatus1").

fec

A vector indicating the variable names coding for fecundity in times t+1, t, and t-1, respectively. Defaults to c("feca3", "feca2", "feca1").

stage

A vector indicating the variable names coding for stage in times t+1, t, and t-1, respectively. Defaults to c("stage3", "stage2", "stage1").

age

Designates the name of the variable corresponding to age in the vertical dataset. Defaults to NA, in which case age is not included in linear models. Should only be used if building age x stage matrices.

year.as.random

If set to TRUE and approach = "lme4", then year is included as a random factor. If set to FALSE, then year is included as a fixed factor. All other combinations of logical value and approach lead to year not being included in modeling. Defaults to TRUE.

patch.as.random

If set to TRUE and approach = "lme4", then patch is included as a random factor. If set to FALSE and approach = "glm", then patch is included as a fixed factor. All other combinations of logical value and approach lead to patch not being included in modeling. Defaults to TRUE.
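
The rule for patch can be restated as a small decision function. patch_term() below is a hypothetical helper, not part of lefko3, and the random-intercept syntax (1 | patchid) is an assumption about how the random factor would be written in an lme4 formula.

```r
# Hypothetical helper, not part of lefko3: restates the rule above as code.
patch_term <- function(approach, patch.as.random, patch = "patchid") {
  if (is.na(patch)) return(NULL)                 # no patch variable supplied
  if (approach == "lme4" && patch.as.random) {
    return(paste0("(1 | ", patch, ")"))          # random factor (assumed intercept form)
  }
  if (approach == "glm" && !patch.as.random) {
    return(patch)                                # fixed factor
  }
  NULL                                           # all other combinations: patch left out
}

patch_term("lme4", TRUE)
patch_term("glm", FALSE)
patch_term("glm", TRUE)
```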

show.model.tables

If set to TRUE, then includes full modeling tables in the output. Defaults to TRUE.

quiet

If set to TRUE, then model building and selection will proceed without warnings and diagnostic messages being issued. Note that this will not affect warnings and messages generated as models themselves are tested. Defaults to FALSE.

Value

This function yields an object of class lefkoMod, which is a list. The first 9 elements are the best-fit models for survival, observation status, size, reproductive status, fecundity, juvenile survival, juvenile observation, juvenile size, and juvenile transition to reproduction, respectively. These are followed by 9 elements holding the model tables for the same vital rates, in the same order, then a single character element denoting the criterion used for model selection, and finally a quality control data frame:

survival_model

Best-fit model of the binomial probability of survival from time t to time t+1. Defaults to 1.

observation_model

Best-fit model of the binomial probability of observation in time t+1 given survival to that time. Defaults to 1.

size_model

Best-fit model of size in time t+1 given survival to and observation in that time. Defaults to 1.

repstatus_model

Best-fit model of the binomial probability of reproduction in time t+1, given survival to and observation in that time. Defaults to 1.

fecundity_model

Best-fit model of fecundity in time t+1 given survival to, and observation and reproduction in that time. Defaults to 1.

juv_survival_model

Best-fit model of the binomial probability of survival from time t to time t+1 of an immature individual. Defaults to 1.

juv_observation_model

Best-fit model of the binomial probability of observation in time t+1 given survival to that time of an immature individual. Defaults to 1.

juv_size_model

Best-fit model of size in time t+1 given survival to and observation in that time of an immature individual. Defaults to 1.

juv_reproduction_model

Best-fit model of the binomial probability of reproduction in time t+1, given survival to and observation in that time, of an individual that was immature in time t. Technically, this is not a model of reproduction probability for immature individuals. Rather, it gives the reproduction probability of individuals that are mature in time t+1 but were immature in time t. Defaults to 1.

survival_table

Full dredge model table of survival probability.

observation_table

Full dredge model table of observation probability.

size_table

Full dredge model table of size.

repstatus_table

Full dredge model table of reproduction probability.

fecundity_table

Full dredge model table of fecundity.

juv_survival_table

Full dredge model table of immature survival probability.

juv_observation_table

Full dredge model table of immature observation probability.

juv_size_table

Full dredge model table of immature size.

juv_reproduction_table

Full dredge model table of immature reproduction probability.

criterion

Character variable denoting the criterion used to determine the best-fit model.

The final element is a quality control data frame with three variables: 1) name of the vital rate, 2) number of individuals used to model that vital rate, and 3) number of individual transitions used to model that vital rate.
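
Because the output is a list, its elements can be extracted by name with $. The sketch below uses a stand-in list rather than a real modelsearch() result: it reuses three of the element names documented above, fits one model to simulated data, and leaves another at its default of 1. The helper was_estimated() is hypothetical and not part of lefko3.

```r
# Stand-in for a modelsearch() result: a plain list that reuses three of the
# documented element names. A real lefkoMod object has more elements.
set.seed(1)
toy <- data.frame(sizea2 = rnorm(60))
toy$alive3 <- rbinom(60, 1, plogis(0.5 + 0.8 * toy$sizea2))

mods <- list(
  survival_model    = glm(alive3 ~ sizea2, data = toy, family = binomial),
  observation_model = 1,        # not estimated, so left at its default
  criterion         = "AICc&k"
)

# Hypothetical helper: TRUE when an element holds a fitted model
# rather than the default value of 1
was_estimated <- function(x) !(is.numeric(x) && length(x) == 1 && x == 1)

was_estimated(mods$survival_model)
was_estimated(mods$observation_model)

# Fitted models can be passed to the usual extractor functions
coef(mods$survival_model)
mods$criterion
```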

Notes

The mechanics governing model building are fairly robust to errors and exceptions. The function attempts to build global models, and simplifies models automatically should model building fail. Model selection proceeds via the dredge function in package MuMIn, and defaults to the global model should that fail.

This function is verbose, so that any errors and warnings raised during model building, model analysis, and model selection can be found and dealt with. Errors during global model analysis can be interpreted using the base R documentation for functions lm and glm, which are used for models without random terms, and the documentation for packages lme4 and glmmTMB, which are used for mixed models (see glmer and glmmTMB, respectively). Package MuMIn is used for model dredging (see dredge), and errors and warnings during dredging can be interpreted using the documentation for that package. The quiet = TRUE option can be used to silence dredge warnings. However, automated model selection can act as a black box, so great care should be taken to ensure that the models run make biological sense, and that model quality is prioritized.

Care must be taken to build models that test the impacts of state in time t-1 for historical models, and that do not test these impacts for ahistorical models. Ahistorical matrix modeling particularly will yield biased transition estimates if historical terms from models are ignored. This can be dealt with at the start of modeling by setting historical = FALSE for the ahistorical case, and historical = TRUE for the historical case.

Model building and selection may fail if NAs exist within variables used in modeling. If NAs represent 0 entries, then change all NAs to 0, as with the NAas0 = TRUE option in function verticalize3().
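
For a dataset that has already been verticalized, the same replacement can be done in base R. The sketch below uses a small invented data frame, vert, with the default size and fecundity column names; it is only appropriate when NA really does mean zero for the variables in question.

```r
# Invented example data; column names follow the package defaults
vert <- data.frame(
  individ = c("a", "b", "c"),
  sizea2  = c(2.1, NA, 3.4),
  feca2   = c(NA, 0, 5),
  stringsAsFactors = FALSE
)

# Variables that will enter the models
model_vars <- c("sizea2", "feca2")

# Count missing values per variable before deciding how to treat them
colSums(is.na(vert[model_vars]))

# Replace NA with 0 in those variables only
for (v in model_vars) {
  vert[[v]][is.na(vert[[v]])] <- 0
}

vert
```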

Examples

# NOT RUN {
data(lathyrus)

sizevector <- c(0, 4.6, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6, 7, 8, 9)
stagevector <- c("Sd", "Sdl", "Dorm", "Sz1nr", "Sz2nr", "Sz3nr", "Sz4nr", "Sz5nr",
                 "Sz6nr", "Sz7nr", "Sz8nr", "Sz9nr", "Sz1r", "Sz2r", "Sz3r", "Sz4r",
                 "Sz5r", "Sz6r", "Sz7r", "Sz8r", "Sz9r")
repvector <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
obsvector <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
immvector <- c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
indataset <- c(0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 4.6, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5,
            0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5)

lathframeln <- sf_create(sizes = sizevector, stagenames = stagevector, repstatus = repvector,
                         obsstatus = obsvector, matstatus = matvector, immstatus = immvector,
                         indataset = indataset, binhalfwidth = binvec, propstatus = propvector)

lathvertln <- verticalize3(lathyrus, noyears = 4, firstyear = 1988, patchidcol = "SUBPLOT",
                           individcol = "GENET", blocksize = 9, juvcol = "Seedling1988",
                           size1col = "lnVol88", repstr1col = "FCODE88",
                           fec1col = "Intactseed88", dead1col = "Dead1988",
                           nonobs1col = "Dormant1988", stageassign = lathframeln,
                           stagesize = "sizea", censorcol = "Missing1988",
                           censorkeep = NA, NAas0 = TRUE, censor = TRUE)

lathvertln$feca2 <- round(lathvertln$feca2)
lathvertln$feca1 <- round(lathvertln$feca1)
lathvertln$feca3 <- round(lathvertln$feca3)

lathmodelsln2 <- modelsearch(lathvertln, historical = FALSE, approach = "lme4", suite = "main",
                             vitalrates = c("surv", "obs", "size", "repst", "fec"), 
                             juvestimate = "Sdl", bestfit = "AICc&k", sizedist = "gaussian", 
                             fecdist = "poisson", indiv = "individ", patch = "patchid", 
                             year = "year2", year.as.random = TRUE, patch.as.random = TRUE,
                             show.model.tables = TRUE)

lathmodelsln2
# }