historicalize3: Create Historical Vertical Data Frame From Ahistorical Vertical Data Frame

Description

historicalize3() returns a vertically formatted demographic data frame organized to create historical projection matrices, given a vertically but ahistorically formatted data frame. This data frame is in standard lefko3 format and can be used in all functions in the package.

Usage

historicalize3(
  data,
  popidcol = 0,
  patchidcol = 0,
  individcol,
  year2col = 0,
  year3col = 0,
  xcol = 0,
  ycol = 0,
  sizea2col = 0,
  sizea3col = 0,
  sizeb2col = 0,
  sizeb3col = 0,
  sizec2col = 0,
  sizec3col = 0,
  repstra2col = 0,
  repstra3col = 0,
  repstrb2col = 0,
  repstrb3col = 0,
  feca2col = 0,
  feca3col = 0,
  fecb2col = 0,
  fecb3col = 0,
  indcova2col = 0,
  indcova3col = 0,
  indcovb2col = 0,
  indcovb3col = 0,
  indcovc2col = 0,
  indcovc3col = 0,
  alive2col = 0,
  alive3col = 0,
  dead2col = 0,
  dead3col = 0,
  obs2col = 0,
  obs3col = 0,
  nonobs2col = 0,
  nonobs3col = 0,
  repstrrel = 1,
  fecrel = 1,
  stage2col = 0,
  stage3col = 0,
  juv2col = 0,
  juv3col = 0,
  stageassign = NA,
  stagesize = NA,
  censor = FALSE,
  censorcol = 0,
  censorkeep = 0,
  spacing = NA,
  NAas0 = FALSE,
  NRasRep = FALSE,
  reduce = TRUE
)

Arguments

data

The ahistorical vertical data frame.

popidcol

A variable name or column number corresponding to the identity of the population for each individual.

patchidcol

A variable name or column number corresponding to the identity of the patch for each individual, if patches have been designated within populations.

individcol

A variable name or column number corresponding to the unique identity of each individual.

year2col

A variable name or column number corresponding to occasion t (year or time).

year3col

A variable name or column number corresponding to occasion t+1 (year or time).

xcol

A variable name or column number corresponding to the x coordinate of each individual in Cartesian space.

ycol

A variable name or column number corresponding to the y coordinate of each individual in Cartesian space.

sizea2col

A variable name or column number corresponding to the primary size entry in occasion t.

sizea3col

A variable name or column number corresponding to the primary size entry in occasion t+1.

sizeb2col

A variable name or column number corresponding to the secondary size entry in occasion t.

sizeb3col

A variable name or column number corresponding to the secondary size entry in occasion t+1.

sizec2col

A variable name or column number corresponding to the tertiary size entry in occasion t.

sizec3col

A variable name or column number corresponding to the tertiary size entry in occasion t+1.

repstra2col

A variable name or column number corresponding to the production of reproductive structures, such as flowers, in occasion t. This can be binomial or count data, and is used in analysis of the probability of reproduction.

repstra3col

A variable name or column number corresponding to the production of reproductive structures, such as flowers, in occasion t+1. This can be binomial or count data, and is used in analysis of the probability of reproduction.

repstrb2col

A second variable name or column number corresponding to the production of reproductive structures, such as flowers, in occasion t. This can be binomial or count data.

repstrb3col

A second variable name or column number corresponding to the production of reproductive structures, such as flowers, in occasion t+1. This can be binomial or count data.

feca2col

A variable name or column number corresponding to fecundity in occasion t. This may represent egg counts, fruit counts, seed production, etc.

feca3col

A variable name or column number corresponding to fecundity in occasion t+1. This may represent egg counts, fruit counts, seed production, etc.

fecb2col

A second variable name or column number corresponding to fecundity in occasion t. This may represent egg counts, fruit counts, seed production, etc.

fecb3col

A second variable name or column number corresponding to fecundity in occasion t+1. This may represent egg counts, fruit counts, seed production, etc.

indcova2col

A variable name or column number corresponding to an individual covariate to be used in analysis, in occasion t.

indcova3col

A variable name or column number corresponding to an individual covariate to be used in analysis, in occasion t+1.

indcovb2col

A second variable name or column number corresponding to an individual covariate to be used in analysis, in occasion t.

indcovb3col

A second variable name or column number corresponding to an individual covariate to be used in analysis, in occasion t+1.

indcovc2col

A third variable name or column number corresponding to an individual covariate to be used in analysis, in occasion t.

indcovc3col

A third variable name or column number corresponding to an individual covariate to be used in analysis, in occasion t+1.

alive2col

A variable name or column number that provides information on whether an individual is alive in occasion t. If used, living status must be designated as binomial (living = 1, dead = 0).

alive3col

A variable name or column number that provides information on whether an individual is alive in occasion t+1. If used, living status must be designated as binomial (living = 1, dead = 0).

dead2col

A variable name or column number that provides information on whether an individual is dead in occasion t. If used, dead status must be designated as binomial (dead = 1, living = 0).

dead3col

A variable name or column number that provides information on whether an individual is dead in occasion t+1. If used, dead status must be designated as binomial (dead = 1, living = 0).

obs2col

A variable name or column number providing information on whether an individual is in an observable stage in occasion t. If used, observation status must be designated as binomial (observed = 1, not observed = 0).

obs3col

A variable name or column number providing information on whether an individual is in an observable stage in occasion t+1. If used, observation status must be designated as binomial (observed = 1, not observed = 0).

nonobs2col

A variable name or column number providing information on whether an individual is in an unobservable stage in occasion t. If used, observation status must be designated as binomial (not observed = 1, observed = 0).

nonobs3col

A variable name or column number providing information on whether an individual is in an unobservable stage in occasion t+1. If used, observation status must be designated as binomial (not observed = 1, observed = 0).

repstrrel

This is a scalar multiplier to make the variable represented by repstrb2col equivalent to the variable represented by repstra2col. This can be useful if two reproductive status variables have related but unequal units, for example if repstrb2col refers to one-flowered stems while repstra2col refers to two-flowered stems.

fecrel

This is a scalar multiplier that makes the variable represented by fecb2col equivalent to the variable represented by feca2col. This can be useful if two fecundity variables have related but unequal units.

stage2col

Optional variable name or column number corresponding to life history stage in occasion t.

stage3col

Optional variable name or column number corresponding to life history stage in occasion t+1.

juv2col

A variable name or column number that marks individuals in immature stages in occasion t. The historicalize3() function assumes that immature individuals are marked in this variable with a number equal to or greater than 1, and that mature individuals are marked as 0 or NA.

juv3col

A variable name or column number that marks individuals in immature stages in occasion t+1. The historicalize3() function assumes that immature individuals are marked in this variable with a number equal to or greater than 1, and that mature individuals are marked as 0 or NA.

stageassign

The stageframe object identifying the life history model being operationalized. Note that if stage2col is provided, then this stageframe is not utilized in stage designation.

stagesize

A variable name or column number describing which size variable to use in stage estimation. Defaults to NA, and can also take sizea, sizeb, sizec, or sizeadded, depending on which size variable is chosen.

censor

A logical variable determining whether the output data should be censored using the variable defined in censorcol. Defaults to FALSE.

censorcol

A variable name or column number corresponding to a censor variable within the dataset, used to distinguish between entries to use and those to discard from analysis, or to designate entries with special issues that require further attention.

censorkeep

The value of the censoring variable identifying data that should be included in analysis. Defaults to 0, but may take any value including NA.

spacing

The spacing at which density should be estimated, if density estimation is desired and x and y coordinates are supplied. Given in the same units as those used in the x and y coordinates given in xcol and ycol. Defaults to NA.

NAas0

If TRUE, then all NA entries for size and fecundity variables will be set to 0. This can help increase the sample size analyzed by modelsearch(), but should only be used when it is clear that this substitution is biologically realistic. Defaults to FALSE.

NRasRep

If set to TRUE, then this function will treat non-reproductive but mature individuals as reproductive during stage assignment. This can be useful when a matrix is desired without separation of reproductive and non-reproductive but mature stages of the same size. Only used if stageassign is set to a stageframe. Defaults to FALSE.

reduce

A logical variable determining whether unused variables and some invariant state variables should be removed from the output dataset. Defaults to TRUE.
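The censoring arguments above (censor, censorcol, censorkeep) can be pictured as a row filter. The following is a minimal sketch in base R, not the package's own code; the helper name keep_rows and the toy values are hypothetical. It shows why a censorkeep of NA needs different handling from a numeric value.

```r
# Toy censor column; values are hypothetical
censor_var <- c(0, 1, NA, 0)

# Hypothetical helper: flag rows whose censor value equals censorkeep.
# NA cannot be tested with ==, so it is handled with is.na() instead.
keep_rows <- function(x, censorkeep) {
  if (is.na(censorkeep)) is.na(x) else !is.na(x) & x == censorkeep
}

keep0  <- keep_rows(censor_var, 0)    # keep entries marked 0
keepNA <- keep_rows(censor_var, NA)   # keep entries marked NA
```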

Value

If all inputs are properly formatted, then this function will output a historical vertical data frame (class hfvdata), meaning that the output data frame will have three consecutive years of size and reproductive data per individual per row. This data frame is in standard format for all functions used in lefko3, and so can be used without further modification. Note that determination of state in occasions t-1 and t+1 gives preference to condition in occasion t within the input dataset. Conflicts in condition in input datasets that have both occasions t and t+1 listed per row are resolved by using condition in occasion t.
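To make the three-occasion layout concrete, here is a small base R sketch that derives an occasion t-1 size from ahistorical rows by looking up the same individual's occasion t record from the previous year. It is an illustration only and is not how historicalize3() is implemented; the toy data and the choice to set missing t-1 sizes to 0 are assumptions.

```r
# Toy ahistorical vertical data: one row per individual per occasion t
# (hypothetical values; column names chosen to mirror the output format)
ahist <- data.frame(
  individ = c("a", "a", "a", "b", "b"),
  year2   = c(2004, 2005, 2006, 2004, 2005),
  sizea2  = c(1, 2, 3, 5, 4),
  sizea3  = c(2, 3, 0, 4, 0)
)

# Size in occasion t-1 is taken from the occasion t entry of the row
# for the same individual in the previous year
key_now  <- paste(ahist$individ, ahist$year2)
key_prev <- paste(ahist$individ, ahist$year2 - 1)
ahist$sizea1 <- ahist$sizea2[match(key_prev, key_now)]

# Assumption: individuals with no record in t-1 are given size 0
ahist$sizea1[is.na(ahist$sizea1)] <- 0

hist_toy <- ahist[, c("individ", "year2", "sizea1", "sizea2", "sizea3")]
hist_toy
```

Each row of hist_toy now carries sizes for occasions t-1, t, and t+1, with the t-1 value drawn from what was recorded as occasion t in the earlier row.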

Variables in this data frame include the following:

rowid

Unique identifier for the row of the data frame.

popid

Unique identifier for the population, if given.

patchid

Unique identifier for patch within population, if given.

individ

Unique identifier for the individual.

year2

Year or time in occasion t.

firstseen

Occasion of first observation.

lastseen

Occasion of last observation.

obsage

Observed age in occasion t, assuming first observation corresponds to age = 0.

obslifespan

Observed lifespan, given as lastseen - firstseen + 1.

xpos1,xpos2,xpos3

X position in Cartesian space in occasions t-1, t, and t+1, respectively, if provided.

ypos1,ypos2,ypos3

Y position in Cartesian space in occasions t-1, t, and t+1, respectively, if provided.

sizea1,sizea2,sizea3

Main size measurement in occasions t-1, t, and t+1, respectively.

sizeb1,sizeb2,sizeb3

Secondary size measurement in occasions t-1, t, and t+1, respectively.

sizec1,sizec2,sizec3

Tertiary size measurement in occasions t-1, t, and t+1, respectively.

size1added,size2added,size3added

Sum of primary, secondary, and tertiary size measurements in occasions t-1, t, and t+1, respectively.

repstra1,repstra2,repstra3

Main numbers of reproductive structures in occasions t-1, t, and t+1, respectively.

repstrb1,repstrb2,repstrb3

Secondary numbers of reproductive structures in occasions t-1, t, and t+1, respectively.

repstr1added,repstr2added,repstr3added

Sum of primary and secondary reproductive structures in occasions t-1, t, and t+1, respectively.

feca1,feca2,feca3

Main numbers of offspring in occasions t-1, t, and t+1, respectively.

fecb1,fecb2,fecb3

Secondary numbers of offspring in occasions t-1, t, and t+1, respectively.

fec1added,fec2added,fec3added

Sum of primary and secondary fecundity in occasions t-1, t, and t+1, respectively.

censor1,censor2,censor3

Censor state values in occasions t-1, t, and t+1, respectively.

juvgiven1,juvgiven2,juvgiven3

Binomial variable indicating whether individual is juvenile in occasions t-1, t, and t+1, respectively. Only given if juv2col or juv3col is provided.

obsstatus1,obsstatus2,obsstatus3

Binomial observation state in occasions t-1, t, and t+1, respectively.

repstatus1,repstatus2,repstatus3

Binomial reproductive state in occasions t-1, t, and t+1, respectively.

fecstatus1,fecstatus2,fecstatus3

Binomial offspring production state in occasions t-1, t, and t+1, respectively.

matstatus1,matstatus2,matstatus3

Binomial maturity state in occasions t-1, t, and t+1, respectively.

alive1,alive2,alive3

Binomial state as alive in occasions t-1, t, and t+1, respectively.

density

Density of individuals per unit designated in spacing. Only given if spacing is not NA.
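The summed variables listed above (for example repstr2added and fec2added) combine a primary and a secondary variable, and repstrrel and fecrel exist to put the secondary variable on the primary variable's scale. The sketch below assumes the sum is formed as primary plus multiplier times secondary; that arithmetic and the toy values are assumptions made for illustration, not a description of the package internals.

```r
# Toy counts for one occasion (hypothetical values)
repstra2  <- c(1, 0, 2)   # primary reproductive structure counts
repstrb2  <- c(0, 3, 1)   # secondary reproductive structure counts
repstrrel <- 2            # multiplier placing repstrb2 on the repstra2 scale

# Assumed combination: primary + multiplier * secondary
repstr2added <- repstra2 + repstrrel * repstrb2
repstr2added
```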

Notes

In some datasets on species with unobservable stages, observation status (obsstatus) might not be inferred properly if a single size variable is used that does not yield sizes greater than 0 in all cases in which individuals were observed. Such situations may arise, for example, in plants when leaf number is the dominant size variable used, but individuals occasionally occur with inflorescences but no leaves. In such instances, it helps to mark related variables as sizeb and sizec, because observation status will be interpreted in relation to all 3 size variables. Further analysis can then utilize only a single size variable, of the user's choosing. Similar issues can arise in reproductive status (repstatus).
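The difference between inferring observation status from one size variable and from several can be sketched in base R as follows. The rule used here (observed if any size variable is greater than 0) is an assumption made for illustration, and the vectors are hypothetical.

```r
# Hypothetical plant records: leaf counts and inflorescence counts
leaves <- c(3, 0, 0, 2)
inflor <- c(1, 1, 0, 0)

# Observation status inferred from leaf number alone
obs_single <- as.integer(leaves > 0)

# Observation status inferred from any size variable being positive
obs_multi <- as.integer(leaves > 0 | inflor > 0)

data.frame(leaves, inflor, obs_single, obs_multi)
```

The second individual has an inflorescence but no leaves, so it is treated as unobserved when only leaf number is used and as observed when both variables are considered.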

Warnings that some individuals occur in state combinations that do not match any stages in the stageframe used to assign stages are common when first working with a dataset. Typically, these situations can be identified as NoMatch entries in stage3, although such entries may crop up in stage1 and stage2, as well. In rare cases, these warnings will arise with no concurrent NoMatch entries, which indicates that the input dataset contained conflicting state data, suggesting both that the individual is in some stage and that it is dead. The latter is removed if the conflict occurs in occasion t or occasion t-1, as only living entries are allowed in these times.
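One way to locate such entries after running the function is to tabulate NoMatch values in the stage columns. The sketch below uses a hypothetical stand-in data frame named out in place of real historicalize3() output, and only base R.

```r
# Stand-in for the output of historicalize3(); values are hypothetical
out <- data.frame(
  individ = c("a", "b", "c", "d"),
  stage2  = c("Sm", "Md", "NoMatch", "Lg"),
  stage3  = c("Md", "NoMatch", "Sm", "NoMatch"),
  stringsAsFactors = FALSE
)

# Use whichever stage columns are present in the data frame
stage_cols <- intersect(c("stage1", "stage2", "stage3"), names(out))

# Number of NoMatch entries per stage column
nomatch_counts <- sapply(out[stage_cols],
  function(x) sum(x == "NoMatch", na.rm = TRUE))

# Rows with at least one NoMatch entry, for closer inspection
nomatch_rows <- out[rowSums(out[stage_cols] == "NoMatch", na.rm = TRUE) > 0, ]
nomatch_rows
```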

Care should be taken to avoid variables with negative values indicating size, fecundity, or reproductive or observation status. Negative values can be interpreted in different ways, typically reflecting estimation through other algorithms rather than actual measured data. Variables holding negative values can conflict with data management algorithms in ways that are difficult to predict.
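A quick check for negative values before calling historicalize3() might look like the following base R sketch. The data frame and its values are hypothetical; the column names simply echo the style used in the example dataset.

```r
# Hypothetical input data with one negative fecundity entry
indata <- data.frame(
  Veg.2 = c(1, 4, 0, NA),
  Pod.2 = c(0, -1, 2, 3)
)

# Columns intended for use as size, fecundity, or status variables
check_cols <- c("Veg.2", "Pod.2")

# Count negative entries per column, ignoring NA
neg_counts <- vapply(indata[check_cols],
  function(x) sum(x < 0, na.rm = TRUE), integer(1))

# Columns that need attention before use
neg_counts[neg_counts > 0]
```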

Examples

# NOT RUN {
data(cypvert)

sizevector <- c(0, 0, 0, 0, 0, 0, 1, 2.5, 4.5, 8, 17.5)
stagevector <- c("SD", "P1", "P2", "P3", "SL", "D", "XSm", "Sm", "Md", "Lg",
  "XLg")
repvector <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
obsvector <- c(0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1)
matvector <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
immvector <- c(0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
propvector <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
indataset <- c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1)
binvec <- c(0, 0, 0, 0, 0, 0.5, 0.5, 1, 1, 2.5, 7)

cypframe_raw <- sf_create(sizes = sizevector, stagenames = stagevector, 
  repstatus = repvector, obsstatus = obsvector, matstatus = matvector, 
  propstatus = propvector, immstatus = immvector, indataset = indataset,
  binhalfwidth = binvec)

cypframe_raw

cypraw_v2 <- historicalize3(data = cypvert, patchidcol = "patch", 
  individcol = "plantid", year2col = "year2", sizea2col = "Inf2.2", 
  sizea3col = "Inf2.3", sizeb2col = "Inf.2", sizeb3col = "Inf.3", 
  sizec2col = "Veg.2", sizec3col = "Veg.3", repstra2col = "Inf2.2", 
  repstra3col = "Inf2.3", repstrb2col = "Inf.2", repstrb3col = "Inf.3", 
  feca2col = "Pod.2", feca3col = "Pod.3", repstrrel = 2, 
  stageassign = cypframe_raw, stagesize = "sizeadded", censorcol = "censor",
  censor = FALSE, NAas0 = TRUE, NRasRep = TRUE, reduce = TRUE)
  
cypsupp2r <- supplemental(stage3 = c("SD", "P1", "P2", "P3", "SL", "D", 
    "XSm", "Sm", "SD", "P1"),
  stage2 = c("SD", "SD", "P1", "P2", "P3", "SL", "SL", "SL", "rep",
    "rep"),
  eststage3 = c(NA, NA, NA, NA, NA, "D", "XSm", "Sm", NA, NA),
  eststage2 = c(NA, NA, NA, NA, NA, "XSm", "XSm", "XSm", NA, NA),
  givenrate = c(0.10, 0.20, 0.20, 0.20, 0.25, NA, NA, NA, NA, NA),
  multiplier = c(NA, NA, NA, NA, NA, NA, NA, NA, 0.5, 0.5),
  type = c(1, 1, 1, 1, 1, 1, 1, 1, 3, 3),
  stageframe = cypframe_raw, historical = FALSE)

cypmatrix2r <- rlefko2(data = cypraw_v2, stageframe = cypframe_raw, 
  year = "all", patch = "all", stages = c("stage3", "stage2"),
  size = c("size3added", "size2added"), supplement = cypsupp2r,
  yearcol = "year2", patchcol = "patchid", indivcol = "individ")
  
cypmatrix2r$A[[intersect(which(cypmatrix2r$labels$patch == "A"), 
  which(cypmatrix2r$labels$year2 == 2004))]]

lambda3(cypmatrix2r)

# }
