Only one of alleleMismatch, cutHeight, matchThreshold can be given, as the three parameters are related.
alleleMismatch is the most intuitive way to understand how the identification of unique genotypes proceeds. For example, a setting of alleleMismatch = 4 implies that up to four alleles may be different for multiple samples to be representatives of the same individual. In practice, however, this value is only an approximation of the amount of mismatch that may be tolerated.
This is because the clustering process used to identify unique genotypes, and the subsequent matching that identifies samples matching these unique genotypes, are both based on a dissimilarity metric or score (see amMatrix) that incorporates both allele mismatches and missing data. alleleMismatch is not used directly in analyses; it is converted to this dissimilarity metric in the following manner. cutHeight, which is a parameter for amCluster (called from this function), is cutHeight = alleleMismatch/(number of allele columns). matchThreshold, which is a parameter for amPairwise (also called from this function), is matchThreshold = 1 - cutHeight.
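As a rough worked example of the conversion described above, the following base R sketch computes both derived values. The dataset size (10 diploid loci in paired columns, so 20 allele columns) and the variable name nAlleleColumns are assumptions made for illustration only; they are not part of the package interface.

```r
# Hypothetical dataset: 10 diploid loci in paired columns (an assumption)
nAlleleColumns <- 20
alleleMismatch <- 4

# Conversion as described in the text
cutHeight <- alleleMismatch / nAlleleColumns
matchThreshold <- 1 - cutHeight

print(c(cutHeight = cutHeight, matchThreshold = matchThreshold))
```

With these assumed numbers, tolerating four mismatching alleles corresponds to a cutHeight of one fifth of the allele columns.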
Selecting the appropriate value for alleleMismatch, cutHeight, or matchThreshold is an important task. Use amUniqueProfile to assist in this process. Please see the supplementary documentation for more information.
doPsib = "missing" is the default and specifies that match probability Psib should be calculated for samples that match unique genotypes and have no allele mismatches, but may differ by having missing data. doPsib = "all" specifies that Psib should be calculated for all samples that match unique genotypes. In this case, if allele mismatches occur, alleles are assumed to be missing at the mismatching loci.
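The treatment of mismatches under doPsib = "all" can be pictured with a small base R sketch. This is only an illustration of the idea of treating alleles at mismatching loci as missing, not the package's internal code. The genotype values, the missing-data code "-99", the choice to mask both allele columns of a mismatching locus, and all variable names are assumptions.

```r
missingCode <- "-99"            # assumed missing-data code
locusMap <- c(1, 1, 2, 2, 3, 3) # three hypothetical loci in paired columns

uniqueGenotype <- c("101", "103", "200", "204", "150", "150")
sampleGenotype <- c("101", "103", "200", "206", "150", "150")

# Loci at which any allele column differs from the unique genotype
mismatchLoci <- unique(locusMap[sampleGenotype != uniqueGenotype])

# Treat every allele at those loci as missing (an assumption of this sketch)
maskedGenotype <- sampleGenotype
maskedGenotype[locusMap %in% mismatchLoci] <- missingCode

print(maskedGenotype)
```

Here only the second locus carries a mismatch, so only its two allele columns are replaced by the missing-data code.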
multilocusMap is often not required, as amDataset objects will typically consist of paired columns of genotypes, where each pair is a separate locus. Where the columns do not follow this layout (e.g. gender is given in only one column), a map vector must be specified.
Example: amDataset consists of gender followed by 4 diploid loci in paired columns
multilocusMap=c(1,2,2,3,3,4,4,5,5)
or equally
multilocusMap=c("GENDER", "LOC1", "LOC1", "LOC2", "LOC2", "LOC3", "LOC3", "LOC4", "LOC4")
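One way to sanity-check a map before use is sketched below in base R. It assumes the layout from the example (one gender column followed by four loci of two columns each, nine columns in total) and shows that the character form reduces to the integer form. The variable names mapNames and mapIndex are hypothetical.

```r
# Character form of the map: gender, then 4 loci in paired columns
mapNames <- c("GENDER", "LOC1", "LOC1", "LOC2", "LOC2",
              "LOC3", "LOC3", "LOC4", "LOC4")

# Equivalent integer form: number each name by order of first appearance
mapIndex <- match(mapNames, unique(mapNames))
print(mapIndex)

# The map needs one entry per genotype column (9 in this example)
stopifnot(length(mapIndex) == 9)
```

A length check of this kind catches a map that has lost or gained an entry relative to the number of genotype columns.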
For more information on selecting consensusMethod please see amCluster. The default consensusMethod=1 is typically adequate.