Identify pairs of individuals likely to be related, but not assigned as such in the provided pedigree.
GetMaybeRel(
  GenoM = NULL,
  SeqList = NULL,
  Pedigree = NULL,
  LifeHistData = NULL,
  AgePrior = NULL,
  ParSib = NULL,
  Module = "par",
  Complex = "full",
  Herm = "no",
  Err = 1e-04,
  ErrFlavour = "version2.0",
  MaxMismatch = NA,
  Tassign = 0.5,
  Tfilter = -2,
  MaxPairs = 7 * nrow(GenoM),
  quiet = FALSE
)
GenoM: numeric matrix with genotype data: one row per individual and one column per SNP, coded as 0, 1, 2 or -9 (missing). See also GenoConvert.
SeqList: list with output from sequoia. SeqList$Pedigree is used if present, and SeqList$PedigreePar otherwise, and overrides the input parameter Pedigree. If 'Specs' is present, its elements override all input parameters with the same name. The list elements 'LifeHist', 'AgePriors', and 'ErrM' are also used if present, and similarly override the corresponding input parameters.
Pedigree: dataframe with id - dam - sire in columns 1-3. May include non-genotyped individuals, which will be treated as dummy individuals. When provided, all likelihoods (and thus all maybe-relatives) are conditional on this pedigree. Note: SeqList$Pedigree or SeqList$PedigreePar take precedence (for this function only).
LifeHistData: dataframe with 3 columns (optionally 5):
  ID: max. 30 characters long
  Sex: 1 = female, 2 = male, 3 = unknown, 4 = hermaphrodite, other numbers or NA = unknown
  BirthYear: birth or hatching year, integer, with missing values as NA or any negative value
  BY.min: minimum birth year, only used if BirthYear is missing
  BY.max: maximum birth year, only used if BirthYear is missing
If the species has multiple generations per year, use an integer coding such that the candidate parents' 'Birth year' is at least one smaller than their putative offspring's. Column names are ignored, so ensure column order is ID - sex - birth year (- BY.min - BY.max). Individuals do not need to be in the same order as in 'GenoM', nor do all genotyped individuals need to be included.
AgePrior: Agepriors matrix, as generated by MakeAgePrior and included in the sequoia output. Affects which relationships are considered possible (only those where \(P(A|R) / P(A) > 0\)).
ParSib: either 'par' to check for putative parent-offspring pairs only, or 'sib' to check for all types of first and second degree relatives. This argument will be deprecated; please use Module.
Module: type of relatives to check for. One of
  par: parent - offspring pairs
  ped: all first and second degree relatives
When 'par', all pairs are returned that are more likely parent-offspring than unrelated, potentially including pairs that are even more likely to be otherwise related.
Complex: Breeding system complexity. One of "full" (default), "simp" (simplified, no explicit consideration of inbred relationships), or "mono" (monogamous).
Herm: Hermaphrodites, either "no", "A" (distinguish between dam and sire role, default if at least 1 individual with sex=4), or "B" (no distinction between dam and sire role). Both of the latter deal with selfing.
Err: estimated genotyping error rate, as a single number or 3x3 matrix. Details below. The error rate is presumed constant across SNPs, and missingness is presumed random with respect to actual genotype.
ErrFlavour: function that takes Err (single number) as input, and returns a 3x3 matrix of observed (columns) conditional on actual (rows) genotypes, or choose from inbuilt options 'version2.0', 'version1.3', or 'version1.1', referring to the sequoia version in which they were the default. Ignored if Err is a matrix. See ErrToM.
MaxMismatch: DEPRECATED AND IGNORED. Now calculated automatically using CalcMaxMismatch.
Tassign: minimum LLR required for acceptance of a proposed relationship, relative to the next most likely relationship. Higher values result in more conservative assignments. Must be zero or positive.
Tfilter: threshold log10-likelihood ratio (LLR) between a proposed relationship and unrelated, used to select candidate relatives. Typically a negative value, because unconditional likelihoods are calculated during the filtering steps. More negative values may decrease non-assignment, but will increase computational time.
MaxPairs: the maximum number of putative pairs to return.
quiet: logical, suppress messages.
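The input formats described above can be sketched in base R as follows. This is only an illustration under stated assumptions: all object names (GenoM_toy, LH_toy, ErrFun_toy) and all values are made up, storing individual IDs as row names of the genotype matrix is assumed rather than stated above, and the simple symmetric error scheme is not one of the inbuilt flavours.

```r
# Toy genotype matrix: 4 individuals x 5 SNPs, coded 0/1/2, with -9 for missing.
# Row names hold the (invented) individual IDs; this placement is an assumption.
GenoM_toy <- matrix(c(0, 1,  2, -9, 1,
                      1, 1,  2,  0, 0,
                      2, 0, -9,  1, 1,
                      0, 2,  1,  1, 2),
                    nrow = 4, byrow = TRUE,
                    dimnames = list(c("i01", "i02", "i03", "i04"), NULL))

# Life history data: column order matters (ID - sex - birth year), names do not.
LH_toy <- data.frame(ID = c("i01", "i02", "i03", "i04"),
                     Sex = c(1, 2, 3, 1),        # 1 = female, 2 = male, 3 = unknown
                     BirthYear = c(2001, 2001, 2002, NA),
                     stringsAsFactors = FALSE)

# Hypothetical ErrFlavour-style function: takes a single error rate E and
# returns a 3x3 matrix of P(observed | actual), rows = actual, columns = observed.
ErrFun_toy <- function(E) {
  M <- matrix(E / 2, nrow = 3, ncol = 3,
              dimnames = list(actual = c("0", "1", "2"),
                              observed = c("0", "1", "2")))
  diag(M) <- 1 - E   # probability that the genotype is observed correctly
  M
}

ErrM_toy <- ErrFun_toy(1e-4)
rowSums(ErrM_toy)    # each row is a probability distribution over observed genotypes
```

Objects built this way would be passed as GenoM, LifeHistData and ErrFlavour respectively; a real data set needs far more SNPs and individuals than this toy matrix.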
A list with
  MaybePar or MaybeRel: A dataframe with non-assigned likely relatives, with columns ID1 - ID2 - TopRel - LLR - OH - BirthYear1 - BirthYear2 - AgeDif - Sex1 - Sex2 - SNPdBoth
  MaybeTrio: A dataframe with non-assigned parent-parent-offspring trios, with columns id - parent1 - parent2 - LLRparent1 - LLRparent2 - LLRpair - OHparent1 - OHparent2 - MEpair - SNPd.id.parent1 - SNPd.id.parent2
Column TopRel indicates the most likely relationship category:
  PO: Parent-Offspring
  FS: Full Siblings
  HS: Half Siblings
  GP: GrandParent - grand-offspring
  FA: Full Avuncular (aunt/uncle)
  2nd: 2nd degree relatives, not enough information to distinguish between HS, GP and FA
  Q: Unclear, but probably 1st, 2nd or 3rd degree relatives
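A returned pairs dataframe might be summarised along these lines. The sketch below uses a mock dataframe with invented values and only a subset of the columns listed above; the category codes "PO", "FS", "2nd" and "Q" are assumed abbreviations for Parent-Offspring, Full Siblings, undistinguished 2nd degree relatives, and unclear pairs.

```r
# Mock stand-in for the pairs dataframe in the output; all values are invented.
MaybeRel_toy <- data.frame(
  ID1    = c("i01", "i02", "i03", "i05"),
  ID2    = c("i04", "i06", "i07", "i08"),
  TopRel = c("PO", "FS", "2nd", "Q"),
  LLR    = c(12.3, 4.1, 1.7, 0.6),
  stringsAsFactors = FALSE
)

# Count pairs per most-likely relationship category
table(MaybeRel_toy$TopRel)

# Keep only pairs whose top category is a specific first-degree relationship,
# sorted from strongest to weakest support
FirstDeg <- MaybeRel_toy[MaybeRel_toy$TopRel %in% c("PO", "FS"), ]
FirstDeg <- FirstDeg[order(-FirstDeg$LLR), c("ID1", "ID2", "TopRel", "LLR")]
FirstDeg
```

Pairs in the less specific categories are often worth checking by hand against field records before any of them is added to the pedigree.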
When Module="par", the age difference of the putative pair is temporarily set to NA so that genetic parent-offspring pairs declared to be born in the same year may be discovered. When Module="ped", only relationships possible given the age difference, if known from the LifeHistData, are considered.
sequoia to identify likely pairs of duplicate genotypes and for pedigree reconstruction; GetRelM to identify all pairs of relatives in a pedigree; CalcPairLL for the likelihoods underlying the LLR.
# NOT RUN {
library(sequoia)  # provides the functions and the example data used below

# parentage assignment only, then look for unassigned likely parents
SeqOUT <- sequoia(GenoM = SimGeno_example,
                  LifeHistData = LH_HSg5,
                  Module = "par",
                  quiet = TRUE, Plot = FALSE)
MaybePO <- GetMaybeRel(GenoM = SimGeno_example,
                       SeqList = SeqOUT)
head(MaybePO$MaybePar)

# instead of providing the entire SeqList, one may
# specify the relevant elements separately
Maybe <- GetMaybeRel(GenoM = SimGeno_example,
                     Pedigree = SeqOUT$PedigreePar,
                     LifeHistData = LH_HSg5,
                     Err = 0.0001, Complex = "full",
                     Module = "ped")
head(Maybe$MaybeRel)

# visualise results, turn dataframe into matrix first:
MaybeM <- GetRelM(Pairs = Maybe$MaybeRel)
PlotRelPairs(MaybeM)
# }