wpeR (version 0.1.0)

org_fams: Organize animals into families and expand pedigree data

Description

Takes pedigree data from the get_colony() or get_ped() function and groups animals into families. It also expands the pedigree data by adding, for each individual, the family it was born into and the family in which it is a reproductive animal.

Usage

org_fams(ped, sampledata, output = "both")

Value

Depending on the output parameter, the function returns either a data frame (ped or fams) or a list containing both data frames (ped and fams).

  • ped data frame. An extended version of the pedigree data from get_colony()/get_ped(). In addition to common pedigree information (individual, mother, father, sex, family), ped includes columns for:

    • parents: Identifier codes of both parents separated with _.

    • FamID: Numeric identifier for the family to which the individual belongs (see fams below).

    • FirstSeen: Date of first sample of individual.

    • LastSeen: Date of last sample of individual.

    • IsDead: Logical value (TRUE/FALSE) that identifies if the individual is dead.

    • DadHSgroup: Identifier of paternal half-sib group (see Details).

    • MomHSgroup: Identifier of maternal half-sib group (see Details).

    • hsGroup: Numeric value indicating if the individual is part of a half-sib group (see Details).

  • fams data frame. Information on the families that individuals in the pedigree belong to. The families are described by:

    • parents: Identifier codes of both parents separated with _.

    • father: Identifier code of the father.

    • mother: Identifier code of the mother.

    • FamID: Numeric identifier for the family.

    • famStart: Date when the first sample of one of the offspring from this family was collected (see Details).

    • famEnd: Date when the last sample of mother or father of this family was collected (see Details).

    • FamDead: Logical value (TRUE/FALSE) indicating if the family no longer exists.

    • DadHSgroup: Identifier connecting families that share the same father.

    • MomHSgroup: Identifier connecting families that share the same mother.

    • hsGroup: Numeric value connecting families that share one of the parents.

Arguments

ped

Data frame. FamAgg-format output of the get_colony() or get_ped() function, produced with the rm_obsolete_parents parameter set to TRUE.

sampledata

Data frame. Metadata for all genetic samples that belong to the individuals included in pedigree reconstruction analysis. This data frame should adhere to the formatting and naming conventions outlined in the check_sampledata() documentation.

output

Character string. Determines the format of the output. Options are:

  • "ped": returns an extended pedigree data frame.

  • "fams": returns a table of all families present in the pedigree.

  • "both" (default): returns a list with two data frames, ped and fams.

Details

The output of org_fams() introduces two concepts that are important within this package: family and half-sib group. A family is defined as a group of animals in which at least one parent and at least one offspring are known. A half-sib group is a group of half-siblings that are related either maternally or paternally. In the function output, DadHSgroup groups paternal half-siblings and MomHSgroup groups maternal half-siblings.
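The half-sib idea can be illustrated with a minimal base R sketch on a made-up families table. The data, the parent codes, and the shares_father / shares_mother columns are hypothetical and exist only for illustration; this is not how org_fams() computes or labels DadHSgroup and MomHSgroup, only a way to see which families would be linked through a shared parent.

```r
# Toy families table (hypothetical data, not produced by org_fams()).
fams <- data.frame(
  FamID  = 1:4,
  father = c("M1", "M1", "M2", "M3"),
  mother = c("F1", "F2", "F3", "F3"),
  stringsAsFactors = FALSE
)

# A father that appears in more than one family links those families
# into a paternal half-sib group; the same logic applies to mothers.
fams$shares_father <- fams$father %in% fams$father[duplicated(fams$father)]
fams$shares_mother <- fams$mother %in% fams$mother[duplicated(fams$mother)]

fams
```

In this toy table, families 1 and 2 share father M1 (paternal half-sibs), while families 3 and 4 share mother F3 (maternal half-sibs).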

The fams output data frame contains the famStart and famEnd columns, which estimate a time window for the family based solely on the sample collection dates provided in sampledata. famStart is the date of the earliest sample collected from any offspring belonging to that family. famEnd is the date of the latest sample collected from either the mother or the father of that family. Because this method relies on observation (sampling) times rather than on birth or death dates, famEnd (last parental sample date) can precede famStart (first offspring sample date), which produces a biologically impossible sequence and a negative calculated family timespan. Users should therefore treat the interval between famStart and famEnd as a window of sampling, not as the actual lifetime of the family.
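One way to spot such cases is to compute the span between the two dates and flag the negative ones. The following is a minimal sketch on made-up dates; only the FamID, famStart and famEnd column names come from the fams output described above, while the values and the span_days and end_before_start columns are hypothetical.

```r
# Toy table with the famStart / famEnd columns (hypothetical dates).
fams <- data.frame(
  FamID    = 1:3,
  famStart = as.Date(c("2018-05-10", "2019-09-01", "2020-03-15")),
  famEnd   = as.Date(c("2020-11-20", "2019-02-14", "2020-03-15"))
)

# Family timespan in days; negative when the last parental sample
# predates the first offspring sample.
fams$span_days <- as.numeric(difftime(fams$famEnd, fams$famStart, units = "days"))
fams$end_before_start <- fams$span_days < 0

# Families whose sampling window is inverted.
fams[fams$end_before_start, ]
```

Here only family 2 is flagged, since its last parental sample was collected before the first offspring sample.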

Examples

# Prepare the data for usage with org_fams() function.
# Get animal timespan data using the anim_timespan() function.
animal_ts <- anim_timespan(
  wolf_samples$AnimalRef,
  wolf_samples$Date,
  wolf_samples$SType,
  dead = c("Tissue")
)
# Add animal timespan to the sampledata
sampledata <- merge(wolf_samples, animal_ts, by.x = "AnimalRef", by.y = "ID", all.x = TRUE)
# Define the path to the pedigree data file.
path <- paste0(system.file("extdata", package = "wpeR"), "/wpeR_samplePed")
# Retrieve the pedigree data from the get_colony function.
ped_colony <- get_colony(path, sampledata, rm_obsolete_parents = TRUE, out = "FamAgg")

# Organize families and expand pedigree data using the org_fams() function.
org_fams(
  ped = ped_colony,
  sampledata = sampledata
)

