reference_allele_counts: Tabulate occurrences of all observed alleles in reference genetic data
Description
Takes the first output of tcf2long, along with two columns named "collection" and "sample_type",
and returns a long-format data frame of allele counts for each locus within each reference
population, with one row per collection, locus, and allele.
The set of alleles to count is identified from both reference and mixture samples, but counts
are drawn only from "reference" samples. Alleles unique to the "mixture" samples still appear
in the output, with counts of 0 for every group.
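The counting rule described above can be sketched in base R. This is only an illustration of the behavior, not the package implementation: the toy data and the column names "locus" and "allele" are assumptions made for the example and may not match the names produced by tcf2long.

```r
# Hypothetical long-format input. The "locus" and "allele" column names are
# assumptions for this sketch, not confirmed tcf2long output names.
long <- data.frame(
  collection  = c("A", "A", "B", "B", "mix", "mix"),
  sample_type = c("reference", "reference", "reference", "reference",
                  "mixture", "mixture"),
  locus       = "Loc1",
  allele      = c("101", "103", "101", "101", "103", "107"),
  stringsAsFactors = FALSE
)

# The allele list is built from reference and mixture rows together.
long$allele <- factor(long$allele, levels = sort(unique(long$allele)))

# Counts are drawn from reference rows only.
ref <- long[long$sample_type == "reference", ]
ref$collection <- factor(ref$collection)

# table() keeps unused factor levels, so mixture-only alleles get 0 counts.
counts <- as.data.frame(
  table(collection = ref$collection, locus = ref$locus, allele = ref$allele),
  responseName = "counts",
  stringsAsFactors = FALSE
)
print(counts)
```

In this toy data, allele "107" occurs only in the mixture rows, so it is listed for both reference collections with a count of 0.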
Arguments
D
A data frame containing, at minimum, a column of sample group identifiers named
"collection", a column named "sample_type" designating each row as "reference" or "mixture",
and (from tcf2long output) locus, gene copy, and observed alleles. If higher-level reporting
unit counts are desired, D must also have a column of reporting unit identifiers named "repunit".
pop_level
A character vector expressing the population level for which allele counts
should be tabulated. Set to "collection" for collection/underlying sample group (default),
or "repunit" for reporting unit/overlying sample groups.
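As a rough sketch of how the two pop_level settings differ, the base R helper below groups reference rows by whichever column is named. The helper count_by_level, the toy data, and the "locus" and "allele" column names are hypothetical and stand in for the real function only for illustration.

```r
# Hypothetical helper for illustration; not the package implementation.
count_by_level <- function(D, pop_level = c("collection", "repunit")) {
  pop_level <- match.arg(pop_level)            # defaults to "collection"
  ref <- D[D$sample_type == "reference", ]     # counts use reference rows only
  as.data.frame(
    table(group = ref[[pop_level]], locus = ref$locus, allele = ref$allele),
    responseName = "counts",
    stringsAsFactors = FALSE
  )
}

# Toy data: collections A and B belong to reporting unit "North", C to "South".
D <- data.frame(
  repunit     = c("North", "North", "South", "South"),
  collection  = c("A", "B", "C", "C"),
  sample_type = "reference",
  locus       = "Loc1",
  allele      = c("101", "101", "103", "101"),
  stringsAsFactors = FALSE
)

by_collection <- count_by_level(D)             # one row per collection x allele
by_repunit    <- count_by_level(D, "repunit")  # collections pooled by repunit
print(by_repunit)
```

With pop_level set to "repunit", the counts for collections A and B are pooled into a single "North" group.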
Details
The "collection" column should be a key assigning samples to the desired groups,
e.g. collection site, run time, year.
The "sample_type" column must contain either "reference" or "mixture" for each sample.
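The column requirements above could be checked before calling the function. The following is a minimal sketch in base R; check_input is a hypothetical helper written for this example and is not part of the package.

```r
# Hypothetical pre-flight check; not part of the package.
check_input <- function(D) {
  # Both key columns must be present.
  missing_cols <- setdiff(c("collection", "sample_type"), names(D))
  if (length(missing_cols) > 0) {
    stop("missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Every row must be labelled "reference" or "mixture" (NA also fails).
  bad <- !(D$sample_type %in% c("reference", "mixture"))
  if (any(bad)) {
    stop(sum(bad), " row(s) have a sample_type other than 'reference' or 'mixture'")
  }
  invisible(TRUE)
}

ok <- data.frame(
  collection  = c("SiteA_2015", "SiteB_2015"),
  sample_type = c("reference", "mixture"),
  stringsAsFactors = FALSE
)
check_input(ok)  # passes silently
```

A data frame whose "sample_type" column contains any other value, or a missing value, makes this check stop with an error.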