Count opposite homozygous (OH) loci between parent-offspring pairs and Mendelian errors (ME) between parent-parent-offspring trios, and calculate the parental log-likelihood ratios (LLR).
CalcOHLLR(
Pedigree = NULL,
GenoM = NULL,
CalcLLR = TRUE,
LifeHistData = NULL,
AgePrior = FALSE,
Err = 1e-04,
ErrFlavour = "version2.0",
Tassign = 0.5,
Complex = "full",
GDX = TRUE,
quiet = FALSE
)
dataframe with columns id-dam-sire. May include non-genotyped individuals, which will be treated as dummy individuals.
the genotype matrix
calculate log-likelihood ratios for all assigned parents (genotyped + dummy/non-genotyped; parent vs. otherwise related). If FALSE, only the number of mismatching SNPs is counted (OH & ME), and parameters LifeHistData, AgePrior, Err, Tassign, and Complex are ignored. Note also that calculating likelihood ratios is much more time-consuming than counting OH & ME.
Dataframe with columns ID - Sex - BirthYear, and optionally columns BY.min and BY.max. If provided, used to delimit possible alternative relationships.
logical (TRUE/FALSE) to estimate the ageprior from Pedigree and LifeHistData, or an agepriors matrix (see MakeAgePrior). Affects which alternative relationships are considered (only those where \(P(A|R) / P(A) > 0\)). When TRUE, MakeAgePrior is called using its default values.
estimated genotyping error rate, as a single number or 3x3 matrix. If a matrix, this should be the probability of observed genotype (columns) conditional on actual genotype (rows). Each row must therefore sum to 1.
function that takes Err as input, and returns a 3x3 matrix of observed (columns) conditional on actual (rows) genotypes, or choose from inbuilt ones as used in sequoia 'version2.0', 'version1.3', or 'version1.1'. Ignored if Err is a matrix. See ErrToM.
used to determine whether or not to consider some more exotic relationships when Complex="full".
determines which relationships are considered as alternatives. Either "full" (default), "simp" (simplified, ignores inbred relationships), or "mono" (monogamous).
call getAssignCat to classify individuals as genotyped (G), substitutable by a dummy (D) or neither (X).
logical, suppress messages
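As an illustration of the matrix form of Err described above, the following minimal sketch builds a 3x3 matrix whose rows each sum to 1. The helper `make_err_matrix` is hypothetical and not part of sequoia, and its error model (errors split equally over the two incorrect genotypes) is an assumption chosen for simplicity; it does not reproduce any of the inbuilt ErrFlavour options.

```r
# Hypothetical helper (not part of sequoia): build a 3x3 genotyping error
# matrix from a single error rate, spreading errors equally over the two
# incorrect genotypes. Rows = actual genotype, columns = observed genotype.
make_err_matrix <- function(err) {
  stopifnot(is.numeric(err), length(err) == 1, err >= 0, err <= 1)
  m <- matrix(err / 2, nrow = 3, ncol = 3,
              dimnames = list(actual = c("0", "1", "2"),
                              observed = c("0", "1", "2")))
  diag(m) <- 1 - err
  m
}

ErrM <- make_err_matrix(1e-04)

# Each row is a probability distribution over observed genotypes
stopifnot(isTRUE(all.equal(unname(rowSums(ErrM)), c(1, 1, 1))))
```

A matrix like this could be supplied as Err = ErrM, in which case ErrFlavour is ignored.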
the Pedigree dataframe with additional columns:
Log10-Likelihood Ratio (LLR) of this female being the mother, versus the next most likely relationship between the focal individual and this female (see Details for relationships considered)
idem, for male parent
LLR for the parental pair, versus the next most likely configuration between the three individuals (with one or neither parent assigned)
Number of loci at which the offspring and mother are opposite homozygotes
idem, for father
Number of Mendelian errors between the offspring and the parent pair, includes OH as well as e.g. parents being opposing homozygotes, but the offspring not being a heterozygote. The offspring being OH with both parents is counted as 2 errors.
Number of SNPs scored (non-missing) for both individual and dam
Number of SNPs scored for both individual and sire
Character denoting whether the focal individual is genotyped (G), substitutable by a dummy (D), or neither (X).
as id.cat, for dams. If id.cat and/or dam.cat is 'X', the dam cannot be assigned.
as dam.cat, for sires
Sex in LifeHistData, or inferred Sex when assigned as part of parent-pair
mode of birth year probability distribution
lower limit of 95% highest density region of birth year probability distribution
upper limit of 95% highest density region of birth year probability distribution
The columns 'LLRdam', 'LLRsire' and 'LLRpair' are only included when CalcLLR=TRUE. The columns 'dam.cat' and 'sire.cat' are only included when GDX=TRUE. The columns 'Sexx', 'BY.est', 'BY.lo' and 'BY.hi' are only included when LifeHistData is provided, and at least one genotyped individual has an unknown birthyear or unknown sex.
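The opposite-homozygote count and the number of SNPs scored for both individuals can be illustrated with a few lines of base R. This is only a sketch: `count_OH` is a hypothetical helper, not a sequoia function, and it assumes genotypes coded as 0/1/2 copies of the reference allele with NA or a negative number for missing calls.

```r
# Hypothetical helper (not part of sequoia): count opposite homozygous loci
# between two individuals, and the number of loci scored for both.
# Assumes 0/1/2 genotype coding, with NA or a negative number for missing.
count_OH <- function(geno_a, geno_b) {
  stopifnot(length(geno_a) == length(geno_b))
  # Recode negative values (missing) to NA
  geno_a[!is.na(geno_a) & geno_a < 0] <- NA
  geno_b[!is.na(geno_b) & geno_b < 0] <- NA
  both_scored <- !is.na(geno_a) & !is.na(geno_b)
  # Opposite homozygotes: one individual is 0 and the other is 2
  oh <- both_scored & abs(geno_a - geno_b) == 2
  c(OH = sum(oh, na.rm = TRUE), SNPd = sum(both_scored))
}

# Made-up genotypes at six loci, for illustration only
offspring <- c(0, 1, 2, 2, -9, 0)
dam       <- c(2, 1, 2, 0,  0, NA)
count_OH(offspring, dam)
```

Loci where either individual is missing are excluded from both counts, which is why the second element can be smaller than the number of SNPs.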
Any individuals in Pedigree that do not occur in GenoM are substituted by dummy individuals; a value of '0' in column 'SNPd.id.dam' in the output means that either the focal individual or the dam was thus substituted, or both were. Use getAssignCat to distinguish between these cases.
The birth years in LifeHistData and the AgePrior are not used in the calculation and do not affect the value of the likelihoods for the various relationships, but they _are_ used during some filtering steps, and may therefore affect the likelihood _ratio_. The default (AgePrior=FALSE) assumes all age-relationship combinations are possible, which may mean that some additional alternatives are considered compared to the sequoia default, resulting in somewhat lower LLR values.
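The filtering role of the age prior can be sketched with a toy matrix. All values below are made up, the row and column labels are illustrative assumptions, and `feasible_rel` is a hypothetical helper rather than a sequoia function; the sketch only shows the idea that a relationship is considered when its entry for the pair's age difference is greater than zero.

```r
# Toy age-prior matrix with made-up values: rows are age differences in
# years, columns are relationships (illustrative labels: mother, father,
# full sibs, maternal and paternal half sibs). Entries stand in for
# P(A|R) / P(A).
AP <- matrix(c(0.0, 0.0, 2.1, 1.4, 1.3,
               1.8, 0.9, 1.2, 1.1, 1.0,
               1.1, 1.6, 0.0, 0.6, 0.8,
               0.0, 0.7, 0.0, 0.0, 0.3),
             nrow = 4, byrow = TRUE,
             dimnames = list(as.character(0:3),
                             c("M", "P", "FS", "MS", "PS")))

# Hypothetical helper: relationships not excluded at a given age difference
feasible_rel <- function(ap, age_diff) {
  colnames(ap)[ap[as.character(age_diff), ] > 0]
}

feasible_rel(AP, 0)  # parent-offspring is excluded in this toy matrix
feasible_rel(AP, 3)
```

If every entry were positive, no relationship would be excluded on age grounds, which corresponds to the situation described for AgePrior=FALSE.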
A negative LLR for A's parent B indicates either that B is not truly the parent of A, or that B's parents are incorrect. The latter may cause B's presumed true, unobserved genotype to diverge greatly from its observed genotype, with downstream consequences for its offspring. In rare cases it may also be due to 'weird', non-implemented double or triple relationships between A and B.
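To make the sign of the LLR concrete, the sketch below computes the log10-likelihood of the parental relationship minus that of the most likely alternative. The likelihood values and relationship labels are made up for illustration, and `parent_LLR` is a hypothetical helper, not part of sequoia.

```r
# Made-up log10-likelihoods of one offspring-candidate pair under several
# relationships (PO = parent-offspring, FS = full sibs, HS = half sibs,
# U = unrelated); labels and values are illustrative only.
LL <- c(PO = -1052.3, FS = -1049.8, HS = -1055.1, U = -1070.4)

# Hypothetical helper: log10-likelihood of being the parent, relative to
# the most likely alternative relationship
parent_LLR <- function(ll, parent = "PO") {
  ll[[parent]] - max(ll[names(ll) != parent])
}

parent_LLR(LL)  # negative here: an alternative has the higher likelihood
```

Removing an alternative from the set under consideration can only leave this ratio unchanged or raise it, which is one way to see why considering additional alternatives may lower the reported LLR.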
SummarySeq for visualisation of OH & LLR distributions; GenoConvert to read in various genotype data formats, CheckGeno; PedPolish to check and 'polish' the pedigree; getAssignCat to find which id-parent pairs are both genotyped or can be substituted by dummy individuals; sequoia for pedigree reconstruction.
# NOT RUN {
# have a quick look for errors in an existing pedigree,
# without running pedigree reconstruction
PedA <- CalcOHLLR(Pedigree = MyOldPedigree, GenoM = MyNewGenotypes,
                  CalcLLR = FALSE)
# or run sequoia with CalcLLR=FALSE, and add OH + LLR later
SeqOUT <- sequoia(Genotypes, LifeHist, CalcLLR = FALSE)
PedA <- CalcOHLLR(Pedigree = SeqOUT$Pedigree[, 1:3], GenoM = Genotypes,
                  LifeHistData = LifeHist, AgePrior = TRUE, Complex = "full")
# visualise
SummarySeq(PedA, Panels=c("LLR", "OH"))
# }