nprcgenekeepr v1.0.3
Genetic Tools for Colony Management
Provides genetic tools for colony management and is a derivation
of the work in Amanda Vinson and Michael J Raboin (2015)
<https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4671785/> "A Practical
Approach for Designing Breeding Groups to Maximize Genetic Diversity in a
Large Colony of Captive Rhesus Macaques ('Macaca' 'mulatta')".
It provides a 'Shiny' application with an exposed API.
The application supports five groups of functions:
(1) Quality control of studbooks contained in text files or 'Excel'
workbooks and of pedigrees within 'LabKey' Electronic Health Records
(EHR);
(2) Creation of pedigrees from a list of animals using the 'LabKey' EHR
integration;
(3) Creation and display of an age by sex pyramid plot of the living
animals within the designated pedigree;
(4) Generation of genetic value analysis reports; and
(5) Creation of potential breeding groups with and without prescribed sex
ratios and defined maximum kinships.
README
R. Mark Sharp 05/17/2020
nprcgenekeepr – Version 1.0.3 (20200526)
Introduction
The goal of nprcgenekeepr is to implement Genetic Tools for Colony Management. It was initially conceived and developed as a Shiny web application at the Oregon National Primate Research Center (ONPRC) to facilitate some of the analyses they perform regularly. It has been enhanced to have more capability as a Shiny application and to expose the functions so they can be used either interactively or in R scripts.
This work has been supported in part by NIH grants P51 RR13986 to the Southwest National Primate Research Center and P51 OD011092 to the Oregon National Primate Research Center.
At present, the application supports 5 functions:
- Quality control of studbooks contained in text files or Excel workbooks and of pedigrees within LabKey Electronic Health Records (EHR)
- Creation of pedigrees from a list of animals using the LabKey EHR integration
- Creation and display of an age by sex pyramid plot of the living animals within the designated pedigree
- Generation of Genetic Value Analysis Reports
- Creation of potential breeding groups with and without prescribed sex ratios and defined maximum kinships
For more information see:
A Practical Approach for Designing Breeding Groups to Maximize Genetic
Diversity in a Large Colony of Captive Rhesus Macaques (Macaca
mulatta). Vinson, A; Raboin, MJ. Journal of the American Association
for Laboratory Animal Science, 2015 Nov, Vol. 54(6), pp. 700-707 [Peer
Reviewed Journal]
Installation
You can install the development version of nprcgenekeepr from GitHub by running the following at the R console prompt:
install.packages("devtools")
devtools::install_github("rmsharp/nprcgenekeepr")
All missing dependencies should be automatically installed.
Online Documentation
You can find the complete online documentation at https://rmsharp.github.io/nprcgenekeepr/.
At the top of the page are three menus to the right of the Home icon:
Reference, Articles, and Changelog.
The Reference menu at the top of the page brings up the list of
documentation for Data objects, Major Features and Functions,
Primary interactive functions and All exposed functions.
The Articles menu brings up the list of vignettes, which are, except
for Development Plans, tutorials for using the package.
The Changelog brings up a copy of the NEWS file of the package, which
records the major changes made for each version.
Running Shiny Application
The toolset available within nprcgenekeepr can be used inside standard R scripts. However, it was originally designed to be used within a Shiny application that can be started with:
library(nprcgenekeepr)
runGeneKeepR()
Summary of Major Functions
Quality Control
Studbooks maintained by breeding colonies generally contain information of varying quality. The quality control functions of the toolkit check that all animals listed as parents have their own line entries, all parents have the appropriate sex listed, no animals are listed as both a sire and a dam, and all dates are valid dates. They also remove duplicate entries and add pedigree generation numbers. In addition, exit dates are added where possible, consistent with other information such as departure dates and death dates. Current ages of animals that are still alive are added if a database connection is provided via a configuration file and the user has read permission on a LabKey server with the demographic data in an EHR (Electronic Health Record) module.
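Several of these checks can be illustrated in a few lines of base R. The following is only a minimal sketch on a made-up studbook, not the code used by the package (which performs these checks through qcStudbook); the data frame and its column names (id, sire, dam) are assumptions chosen for the example.

```r
# Hypothetical studbook: D is entered twice, parent X has no record of
# its own, and B is listed as a dam (of C) and as a sire (of E).
ped <- data.frame(
  id   = c("A", "B", "C", "D", "D", "E"),
  sire = c(NA, NA, "A", "A", "A", "B"),
  dam  = c(NA, NA, "B", "X", "X", "C"),
  stringsAsFactors = FALSE
)

# Remove duplicate entries (identical rows).
ped <- unique(ped)

# Parents named in the studbook, ignoring unknown (NA) parents.
sires <- ped$sire[!is.na(ped$sire)]
dams  <- ped$dam[!is.na(ped$dam)]

# Parents that lack their own line entry.
missingParents <- setdiff(unique(c(sires, dams)), ped$id)

# Animals listed as both a sire and a dam.
sireAndDam <- intersect(sires, dams)

missingParents
sireAndDam
```

In this toy studbook the duplicate row for D is dropped, X is reported as a parent without its own record, and B is reported as appearing in both parental roles.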
Parents with ages below a user-selected threshold are identified. A minimum parent age in years is set by the user and is used to ensure each parent is at least that age on the birth date of an offspring. The minimum parent age defaults to 2 years. This check is not performed for animals with missing birth dates.
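The parent age rule can be sketched in base R as follows. This is an illustration under assumed column names (id, sire, dam, birth) and a made-up pedigree, not the package's checkParentAge implementation; ages are approximated as days divided by 365.25.

```r
# Hypothetical pedigree with birth dates.
ped <- data.frame(
  id    = c("S1", "D1", "O1", "O2"),
  sire  = c(NA, NA, "S1", "S1"),
  dam   = c(NA, NA, "D1", "D1"),
  birth = as.Date(c("2010-01-01", "2012-06-01", "2013-06-01", "2015-06-01")),
  stringsAsFactors = FALSE
)
minParentAge <- 2  # years; the documented default

# Age in years of the named parent on each animal's birth date.
# The result is NA when the parent or either birth date is unknown.
parentAgeAtBirth <- function(ped, parentCol) {
  parentBirth <- ped$birth[match(ped[[parentCol]], ped$id)]
  as.numeric(ped$birth - parentBirth) / 365.25
}

sireAge <- parentAgeAtBirth(ped, "sire")
damAge  <- parentAgeAtBirth(ped, "dam")

# Flag offspring with at least one parent younger than the minimum;
# NA ages (missing birth dates) are skipped rather than flagged.
tooYoung <- (!is.na(sireAge) & sireAge < minParentAge) |
  (!is.na(damAge) & damAge < minParentAge)
ped$id[tooYoung]
```

Here O1 is flagged because its dam D1 was only one year old when O1 was born, while O2 passes because both parents were older than two years.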
Creation of Pedigree From a List of Potential Breeders and LabKey Integration
The user can enter a list of focal animals in a CSV file that will be used to create a pedigree containing all direct relatives (ancestors and descendants). The pedigree is retrieved via the labkey.selectRows function within the Rlabkey package, which requires that a database connection is provided via a configuration file and that the user has read permission on a LabKey server with the demographic data in an EHR (Electronic Health Record) module.
Two configuration files are needed to use the database features of nprcgenekeepr with LabKey. The first file, named _netrc on Microsoft Windows operating systems and .netrc otherwise, allows the user to authenticate with LabKey through the LabKey API and is fully described in the LabKey documentation.
The second file is named _nprcgenekeepr_config on Microsoft
Windows operating systems and .nprcgenekeepr_config otherwise and
is the nprcgenekeepr configuration file.
An image of an example configuration file is included as a data object
and can be loaded and viewed with the following lines of R code in the R
console.
data("exampleNprcgenekeeprConfig")
View(exampleNprcgenekeeprConfig)
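Before using the LabKey features, it can be useful to confirm that both configuration files exist. The sketch below builds the two file names described above for the current operating system and checks for them in a directory. The helper names configFileNames and configFilesPresent are hypothetical, and the default location (the user's home directory) is an assumption; consult the LabKey and nprcgenekeepr documentation for where each file is actually read from on your system.

```r
# Hypothetical helper: file names as described above, by operating system.
configFileNames <- function(osType = .Platform$OS.type) {
  prefix <- if (osType == "windows") "_" else "."
  paste0(prefix, c("netrc", "nprcgenekeepr_config"))
}

# Hypothetical helper: report which of the two files exist in 'dir'.
# Defaulting to the home directory is an assumption made for this sketch.
configFilesPresent <- function(dir = path.expand("~"),
                               osType = .Platform$OS.type) {
  fileNames <- configFileNames(osType)
  present <- file.exists(file.path(dir, fileNames))
  names(present) <- fileNames
  present
}

configFilesPresent()
```

The result is a named logical vector with one element per file, so a missing file is easy to spot before a database request fails.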
Display of an Age by Sex Pyramid Plot
Adapted on 20190603 from text written by Matt Rosenberg (updated May 07, 2019).
The most important demographic characteristic of a population is its age-sex structure. Age-sex pyramids (also known as population pyramids) graphically display this information to improve understanding and make comparison easy. The population pyramid sometimes has a distinctive pyramid-like shape when displaying a growing population.
How to Read the Age-Sex Graph
An age-sex pyramid breaks down a population by sex (male and female) and age range. Usually, you’ll find the left side of the pyramid graphing the male population and the right side of the pyramid displaying the female population.
Along the horizontal axis (x-axis) of a population pyramid, the graph displays the population either as a total population of that age or as a percentage of the population at that age. The center of the pyramid starts at zero population and extends out to the left for males and right for females in increasing size, or proportion of the population.
Along the vertical axis (y-axis), age-sex pyramids display two-year age increments, from birth at the bottom to old age at the top.
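The numbers behind such a plot are simply counts of animals by sex within two-year age bins. The following base R sketch computes them for a made-up set of living animals; it is not the package's getPyramidPlot code, and the data frame and its columns (sex, age) are assumptions for the example. Male counts are negated so that they extend to the left of zero.

```r
# Hypothetical living animals with sex and current age in years.
animals <- data.frame(
  sex = c("M", "F", "F", "M", "F", "F", "M"),
  age = c(0.5, 1.2, 3.7, 4.1, 5.9, 2.0, 8.3),
  stringsAsFactors = FALSE
)

# Two-year age bins from birth up to the oldest animal.
binWidth <- 2
topBreak <- ceiling(max(animals$age) / binWidth) * binWidth
breaks <- seq(0, topBreak, by = binWidth)
animals$bin <- cut(animals$age, breaks = breaks, right = FALSE,
                   include.lowest = TRUE)

# Counts per bin and sex; rows run from youngest (bottom of the
# pyramid) to oldest (top).
counts <- table(animals$bin, animals$sex)

# Signed counts: males to the left (negative), females to the right.
pyramid <- data.frame(
  bin     = rownames(counts),
  males   = -as.vector(counts[, "M"]),
  females = as.vector(counts[, "F"])
)

# Symmetric axis limit so both sides share the same scale.
axisLimit <- max(abs(c(pyramid$males, pyramid$females)))
pyramid
```

Using one shared axis limit for both sides keeps the male and female halves visually comparable.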
Genetic Value Analysis Reports
The Genetic Value Analysis is a ranking scheme developed at ONPRC to indicate the relative breeding value of animals in the colony. The scheme uses the mean kinship for each animal to indicate how inter-related it is with the rest of the current breeding colony members. Genome uniqueness is used to provide an indication of whether or not an animal is likely to possess alleles at risk of being lost from the colony. Under the scheme, animals with low mean kinship or high genome uniqueness are ranked more highly.
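To make the mean kinship part of the scheme concrete, the sketch below builds a kinship matrix for a tiny hypothetical pedigree with the standard tabular method and averages each animal's kinship with all animals, itself included. It is written in base R for illustration only: the package provides its own kinship and meanKinship functions, whose arguments are not shown here, and genome uniqueness (estimated by gene-drop simulation) is not covered.

```r
# Hypothetical pedigree, ordered so parents precede their offspring.
# A and B are founders; C and D are full siblings; E is their offspring.
ped <- data.frame(
  id   = c("A", "B", "C", "D", "E"),
  sire = c(NA, NA, "A", "A", "C"),
  dam  = c(NA, NA, "B", "B", "D"),
  stringsAsFactors = FALSE
)

n <- nrow(ped)
kmat <- matrix(0, n, n, dimnames = list(ped$id, ped$id))
for (i in seq_len(n)) {
  s <- match(ped$sire[i], ped$id)
  d <- match(ped$dam[i], ped$id)
  # Kinship with each earlier animal is the average of the parents'
  # kinships with that animal (unknown parents contribute 0).
  for (j in seq_len(i - 1)) {
    ks <- if (is.na(s)) 0 else kmat[s, j]
    kd <- if (is.na(d)) 0 else kmat[d, j]
    kmat[i, j] <- kmat[j, i] <- 0.5 * (ks + kd)
  }
  # Self-kinship is (1 + kinship between the parents) / 2.
  kParents <- if (is.na(s) || is.na(d)) 0 else kmat[s, d]
  kmat[i, i] <- 0.5 * (1 + kParents)
}

# Mean kinship of each animal with the whole population.
meanKin <- rowMeans(kmat)

# Lower mean kinship ranks higher under the scheme.
ranking <- names(sort(meanKin))
meanKin
```

In this example the founders A and B have the lowest mean kinship (0.25), while E, the offspring of a full-sibling mating, has the highest (0.375) and would rank last on this criterion.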
Breeding Group Formation
One of the goals in breeding group formation is to avoid the potential for mating of closely related animals. Since behavioral concerns and housing constraints will also be taken into account in the group formation process, it is our goal to provide the largest number of animals possible from a list of candidates that can be housed together without risk of consanguineous mating. To that end, this function uses information from the Genetic Value Analysis to search for the largest combinations of animals that can be produced from a list of candidates.
The default options do not consider the sex of individuals when forming the groups, though this has likely been a consideration by the user in selecting the candidate group members. Optionally, the user may choose to form harem groups, an option that considers the sex of individuals when forming groups and restricts the number of males to one per group.
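The idea of searching for a large group of mutually unrelated animals can be sketched with a simple randomized procedure: add candidates in random order, skipping any animal whose kinship with a current member reaches a maximum allowed value, repeat many times, and keep the largest group found. This is a simplified illustration and not necessarily the algorithm used by groupAddAssign; the kinship values, the threshold, and the helper names (setKin, formGroup) are all made up for the example, and sex is ignored as in the default options.

```r
# Hypothetical kinship matrix for six candidates.
ids <- c("A", "B", "C", "D", "E", "F")
kmat <- diag(0.5, length(ids))
dimnames(kmat) <- list(ids, ids)
setKin <- function(kmat, a, b, value) {
  kmat[a, b] <- value
  kmat[b, a] <- value
  kmat
}
kmat <- setKin(kmat, "A", "B", 0.25)
kmat <- setKin(kmat, "B", "C", 0.125)
kmat <- setKin(kmat, "D", "E", 0.25)

# One random attempt: accept a candidate only if its kinship with
# every animal already in the group is below the threshold.
formGroup <- function(kmat, candidates, threshold) {
  group <- character(0)
  for (id in sample(candidates)) {
    if (all(kmat[id, group] < threshold)) {
      group <- c(group, id)
    }
  }
  group
}

# Repeat the random attempt and keep the largest group found.
set.seed(1)
threshold <- 0.0625  # illustrative maximum kinship
best <- character(0)
for (i in seq_len(100)) {
  group <- formGroup(kmat, ids, threshold)
  if (length(group) > length(best)) {
    best <- group
  }
}
sort(best)
```

With these values the largest possible group has four animals: A and C, one of D or E, and F. B is excluded because it is related to both A and C.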
For more information see:
A Practical Approach for Designing Breeding Groups to Maximize Genetic
Diversity in a Large Colony of Captive Rhesus Macaques (Macaca
mulatta). Vinson, A; Raboin, MJ. Journal of the American Association
for Laboratory Animal Science, 2015 Nov, Vol. 54(6), pp. 700-707 [Peer
Reviewed Journal]
Functions in nprcgenekeepr
| Name | Description | |
| addIdRecords | Adds Ego records having NAs for parent IDs | |
| addSexAndAgeToGroup | Forms a dataframe with Id, Sex, and current Age given a list of Ids and a pedigree | |
| addParents | Add parents | |
| calcFG | Calculates Founder Genome Equivalents | |
| calcFEFG | Calculates Founder Equivalents and Founder Genome Equivalents | |
| calcGU | Calculates genome uniqueness for each ID that is part of the population. | |
| calcRetention | Calculates Allelic Retention | |
| alleleFreq | Calculates the count of each allele in the provided vector. | |
| allTrueNoNA | Returns TRUE if every member of the vector is TRUE. | |
| calcAge | Calculate animal ages. | |
| calcA | Calculates a, the number of an individual's alleles that are rare in each simulation. | |
| checkChangedColsLst | checkChangedColsLst examines list for non-empty fields | |
| assignAlleles | Assign parent alleles randomly | |
| checkRequiredCols | Examines column names, cols for required column names | |
| calcFE | Calculates founder Equivalents | |
| checkErrorLst | checkErrorLst examines list for non-empty fields | |
| calculateSexRatio | Calculates the sex ratio (number of non-males / number of males) given animal Ids and their pedigree | |
| checkChangedColAndErrorLst | checkChangedColAndErrorLst examines errorLst for errors and errorLst$changeCols non-empty fields | |
| countLoops | Count the number of loops in a pedigree tree. | |
| chooseDate | Choose date based on earlier flag. | |
| chooseAllelesChar | Combines two vectors of alleles when alleles are character vectors. | |
| convertSexCodes | Converts sex indicator for an individual to a standardized code. | |
| createPedOne | createPedOne makes the pedOne data object | |
| convertAncestry | Converts the ancestry information to a standardized code | |
| colChange | colChange internal function to describe column names transformation | |
| chooseAlleles | Combines two vectors of alleles by randomly selecting one allele or the other at each position. | |
| checkGenotypeFile | Check genotype file | |
| correctParentSex | Sets sex for animals listed as either a sire or dam. | |
| examplePedigree | examplePedigree is a pedigree object created by qcStudbook | |
| checkParentAge | Check parent ages to be at least minParentAge | |
| convertStatusCodes | Converts status indicators to a Standardized code | |
| fillBins | fillBins Fill bins represented by list of two lists males and females. | |
| convertDate | Converts date columns formatted as characters to be of type datetime | |
| createPedTree | Create a pedigree tree (PedTree). | |
| countFirstOrder | Count first-order relatives. | |
| fillGroupMembers | Forms and fills list of animals groups based on provided constraints | |
| findGeneration | Determines the generation number for each id. | |
| filterAge | Removes kinship values where an animal is less than the minAge | |
| getAncestors | Recursively create a character vector of ancestors for an individual ID. | |
| filterKinMatrix | Filters a kinship matrix to include only the egos listed in 'ids' | |
| create_wkbk | Creates an Excel workbook with worksheets. | |
| fillGroupMembersWithSexRatio | Forms breeding group(s) with an effort to match a specified sex ratio | |
| dataframe2string | dataframe2string converts a data.frame object to a character vector | |
| filterPairs | Filters kinship values from a long-format kinship table based on the sexes of the two animals involved. | |
| exampleNprcgenekeeprConfig | exampleNprcgenekeeprConfig is a loadable version of the example configuration file example_nprcgenekeepr_config | |
| findLoops | Find loops in a pedigree tree | |
| filterThreshold | Filters kinship to remove rows with kinship values less than the specified threshold | |
| getCurrentAge | Age in years using the provided birthdate. | |
| createPedSix | createPedSix makes the pedSix data object | |
| getDateColNames | Vector of date column names | |
| finalRpt | finalRpt is a list object created from the list object rpt prepared by reportGV. It is created inside orderReport. This version is at the state just prior to calling rankSubjects inside orderReport. | |
| getAnimalsWithHighKinship | Forms a list of animal Ids and animals related to them | |
| getDateErrorsAndConvertDatesInPed | Converts columns of dates in text form to Date object columns | |
| filterReport | Filters a genetic value report down to only the specified animals | |
| getChangedColsTab | getChangedColsTab skeleton of list of errors | |
| getDemographics | Get demographic data | |
| getDatedFilename | Returns a character vector with a file name having the date prepended. | |
| getMaxAx | Get the maximum of the absolute values of the negative (males) and positive (female) animal counts. | |
| get_elapsed_time_str | Returns the elapsed time since start_time. | |
| getMinParentAge | Get minimum parent age. | |
| getIncludeColumns | Get the superset of columns that can be in a pedigree file. | |
| getConfigFileName | getConfigFileName returns the configuration file name appropriate for the system. | |
| insertChangedColsTab | insertChangedColsTab insert a list of changed columns found by qcStudbook in the pedigree file | |
| getEmptyErrorLst | Creates an empty errorLst object | |
| get_and_or_list | Returns a one element character string with correct punctuation for a list made up of the elements of the character vector argument. | |
| getIdsWithOneParent | getIdsWithOneParent extracts IDs of animals pedigree without either a sire or a dam | |
| insertErrorTab | insertErrorTab insert a list of errors found by qcStudbook in the pedigree file | |
| getPotentialSires | Provides list of potential sires | |
| getPyramidPlot | Creates a pyramid plot of the pedigree provided. | |
| getPyramidAgeDist | Get the age distribution for the pedigree | |
| fixColumnNames | fixColumnNames changes original column names into standardized names. | |
| hasBothParents | hasBothParents checks to see if both parents are identified. | |
| getRecordStatusIndex | Returns record numbers with selected recordStatus. | |
| getRequiredCols | Get required column names for a studbook. | |
| makeAvailable | Convenience function to make the initial available animal list | |
| lacy1989PedAlleles | lacy1989PedAlleles is a dataframe produced by geneDrop on lacy1989Ped with 5000 iterations. | |
| convertRelationships | Converts pairwise kinship values to a relationship category descriptor. | |
| findOffspring | Finds the number of total offspring for each animal in the provided pedigree. | |
| getGVGenotype | Get Genetic Value Genotype data structure for reportGV function. | |
| findPedigreeNumber | Determines the pedigree number for each id. | |
| getProbandPedigree | Gets pedigree to ancestors of provided group leaving uninformative ancestors. | |
| fixGenotypeCols | Reformat names of observed genotype columns | |
| hasGenotype | Check for genotype data in dataframe | |
| obfuscateDate | obfuscateDate adds a random number of days bounded by plus and minus max delta | |
| getOffspring | Get offspring to corresponding animal IDs provided | |
| getGVPopulation | Get the population of interest for the Genetic Value analysis. | |
| getGenoDefinedParentGenotypes | Assigns parental genotype contributions to an IDs genotype by attributing alleles to sire or dam | |
| getProportionLow | Get proportion of Low genetic value animals | |
| getParamDef | Get parameter definitions from tokens found in configuration file. | |
| getProductionStatus | Get production status of group | |
| groupAddAssign | Adds animals to an existing breeding group or forms groups | |
| obfuscateId | obfuscateId creates a vector of ID aliases of specified length | |
| pedDuplicateIds | pedDuplicateIds is a dataframe with 9 rows and 5 columns (ego_id, sire, dam_id, sex, birth_date) representing a full pedigree with a duplicated record. | |
| getGenotypes | Get genotypes from file | |
| groupMembersReturn | Forms return list of groupAddAssign function | |
| getLogo | Get Logo file name | |
| getLkDirectRelatives | Get the direct ancestors of selected animals | |
| getParents | Get parents to corresponding animal IDs provided | |
| getPedMaxAge | Get the maximum age of live animals in the pedigree. | |
| pedFemaleSireMaleDam | pedFemaleSireMaleDam is a dataframe with 8 rows and 5 columns (ego_id, sire, dam_id, sex, birth_date) representing a full pedigree with the errors of having a sire labeled as female and a dam labeled as male. | |
| getVersion | getVersion Get the version number of nprcgenekeepr | |
| getTokenList | Gets tokens from character vector of lines | |
| is_valid_date_str | Returns TRUE if the string is a valid date. | |
| print.summary.nprcgenekeeprErr | print.summary.nprcgenekeepr print.summary.nprcgenekeeprGV | |
| getSiteInfo | Get site information | |
| getSexRatioWithAdditions | getSexRatioWithAdditions returns the sex ratio of a group. | |
| kinMatrix2LongForm | Reformats a kinship matrix into a long-format table. | |
| insertSeparators | insertSeparators inserts the character "-" between year and month and between month and day portions of a date string in %Y%m%d format. | |
| qcBreeders | qcBreeders is a list of 29 baboon IDs that are potential breeders | |
| removeUnknownAnimals | removeUnknownAnimals Removes unknown animals added to pedigree that serve as placeholders for unknown parents. | |
| reportGV | Generates a genetic value report for a provided pedigree. | |
| makeCEPH | Make a CEPH-style pedigree for each id | |
| makeGrpNum | Convenience function to make the initial grpNum list | |
| makeGroupMembers | Convenience function to make the initial groupMembers animal list | |
| isEmpty | Is vector empty or all NA values. | |
| makesLoop | makesLoop tests for a common ancestor. | |
| mapIdsToObfuscated | Map IDs to Obfuscated IDs | |
| headerDisplayNames | Convert internal column names to display or header names. | |
| focalAnimals | focalAnimals is a dataframe with one column (id) containing the animal Ids from the examplePedigree pedigree. | |
| initializeHaremGroups | Make the initial groupMembers animal list | |
| geneDrop | Gene drop simulation based on the provided pedigree information | |
| makeExamplePedigreeFile | Write copy of nprcgenekeepr::examplePedigree into a file | |
| pedGood | pedGood is a dataframe with 8 rows and 5 columns (ego_id, sire, dam_id, sex, birth_date) representing a full pedigree with no errors. | |
| getFocalAnimalPed | Get pedigree based on list of focal animals | |
| getErrorTab | getErrorTab skeleton of list of errors | |
| getIndianOriginStatus | Get Indian-origin status of group | |
| removeDuplicates | Remove duplicate records from pedigree | |
| pedInvalidDates | pedInvalidDates is a dataframe with 8 rows and 5 columns (ego_id, sire, dam_id, sex, birth_date) representing a full pedigree with values in the birth_date column that are not valid dates. | |
| getPedigree | Get pedigree from file | |
| getLkDirectAncestors | Get the direct ancestors of selected animals | |
| removeEarlyDates | removeEarlyDates removes dates before a specified year | |
| getPossibleCols | Get possible column names for a studbook. | |
| removeUninformativeFounders | Remove uninformative founders. | |
| saveDataframesAsFiles | Write copy of dataframes to either CSV or Excel file. | |
| setExit | Sets the exit date, if there is no exit column in the table | |
| removeSelectedAnimalFromAvailableAnimals | Updates list of available animals by removing the selected animal | |
| orderReport | Order the results of the genetic value analysis for use in a report. | |
| ped1Alleles | ped1Alleles is a dataframe created by the geneDrop function | |
| makeRelationClassesTable | Make relation classes table from kin dataframe. | |
| qcPed | qcPed is a dataframe with 277 rows and 6 columns | |
| meanKinship | Calculates the mean kinship for each animal in a kinship matrix | |
| set_seed | Work around for unit tests using sample() among various versions of R | |
| setPopulation | Population designation function | |
| runGeneKeepR | Allows running shiny application with nprcgenekeepr::runGeneKeepR() | |
| nprcgenekeepr | Genetic Management Functions | |
| makeRoundUp | Round up the provided integer vector int according to the modulus. | |
| rhesusPedigree | rhesusPedigree is a pedigree object | |
| qcPedGvReport | qcPedGvReport is a genetic value report | |
| pedWithGenotype | pedWithGenotype is a dataframe produced from qcPed by adding made up genotypes. | |
| unknown2NA | Removing IDs having "UNKNOWN" regardless of case | |
| pedWithGenotypeReport | pedWithGenotypeReport is a list containing the output of reportGV. | |
| obfuscatePed | obfuscatePed takes a pedigree object and creates aliases for all IDs and adjusts all dates within a specified amount. | |
| kinship | Generates a kinship matrix. | |
| lacy1989Ped | lacy1989Ped small hypothetical pedigree | |
| offspringCounts | Finds the total number of offspring for each animal in the pedigree | |
| withinIntegerRange | Get integer within a range | |
| pedMissingBirth | pedMissingBirth is a dataframe with 8 rows and 5 columns (ego_id, sire, dam_id, sex, birth_date) representing a full pedigree with no errors. | |
| resetGroup | Update or add the "group" field of a Pedigree. | |
| pedOne | pedOne is a loadable version of a pedigree file fragment used for testing and demonstration | |
| qcStudbook | Quality Control for the Studbook or pedigree | |
| pedSameMaleIsSireAndDam | pedSameMaleIsSireAndDam is a dataframe with 8 rows and 5 columns (ego_id, sire, dam_id, sex, birth_date) representing a full pedigree with no errors. | |
| toCharacter | Force dataframe columns to character | |
| trimPedigree | Trim pedigree to ancestors of provided group by removing uninformative individuals | |
| rhesusGenotypes | rhesusGenotypes is a dataframe with two haplotypes per animal | |
| rbindFill | Append the rows of one dataframe to another. | |
| removeGroupIfNoAvailableAnimals | Remove group numbers when all available animals have been used | |
| rankSubjects | Ranks animals based on genetic value. | |
| removePotentialSires | Removes potential sires from list of Ids | |
| pedSix | pedSix is a loadable version of a pedigree file fragment used for testing and demonstration | |
| readExcelPOSIXToCharacter | Read in Excel file and convert POSIX dates to character | |
| smallPed | smallPed is a hypothetical pedigree | |
| smallPedTree | smallPedTree is a pedigree tree made from smallPed | |
| str_detect_fixed_all | Returns a logical vector with results of stri_detect() for each pattern in the second parameter's character vector. | |
| summary.nprcgenekeeprErr | summary.nprcgenekeeprErr Summary function for class nprcgenekeeprErr | |
| addErrTxt | Concatenates any errors from nprcgenekeeprErr into narrative form | |
| agePyramidPlot | Form age pyramid plot | |
| addGroupOfUnusedAnimals | addGroupOfUnusedAnimals adds a group to the saved groups if needed | |
| addBackSecondParents | Add back single parents to trimmed pedigree | |
| addAnimalsWithNoRelative | Adds an NA value for all animals without a relative | |
| addUIds | Eliminates partial parentage situations by adding unique placeholder IDs for the unknown parent. | |
| addGenotype | Add genotype data to pedigree file | |
| createExampleFiles | Creates a folder with CSV files containing example pedigrees and ID lists used to demonstrate the package. | |
Details
| Type | Package |
| URL | https://rmsharp.github.io/nprcgenekeepr/, https://github.com/rmsharp/nprcgenekeepr |
| BugReports | https://github.com/rmsharp/nprcgenekeepr/issues |
| Language | en-US |
| Encoding | UTF-8 |
| License | MIT + file LICENSE |
| RoxygenNote | 7.1.0 |
| LazyData | TRUE |
| VignetteBuilder | knitr, rmarkdown |
| NeedsCompilation | no |
| Packaged | 2020-05-27 02:15:14 UTC; msharp |
| Repository | CRAN |
| Date/Publication | 2020-06-02 12:40:03 UTC |
| imports | anytime, futile.logger, htmlTable, lubridate, Matrix, plotrix, readxl, Rlabkey, shiny, shinyBS, stringi, utils, WriteXLS |
| suggests | covr, dplyr, ggplot2, grid, kableExtra, knitr, pkgdown, png, rmarkdown, roxygen2 (>= 7.0.0), testthat |
| depends | R (>= 3.6.0) |
| Contributors | Terry Therneau, Michael Raboin, Amanda Vinson, Southwest National Primate Research Center NIH grant P51 RR13986, Oregon National Primate Research Center grant P51 OD011092 |