Function to generate an overview of genotype probabilities across a population
gp_overview(probgeno_df, cutoff = 0.7, alpha = 0.1)
probgeno_df A data frame as read from the scores file produced by function saveMarkerModels of R package fitPoly, or equivalently, a data frame containing the following columns:
SampleName Name of the sample (individual)
MarkerName Name of the marker
P0 Probabilities of dosage score '0'
P1... Probabilities of dosage score '1' etc. (up to max offspring dosage, e.g. P4 for tetraploid population)
maxP Maximum genotype probability identified for a particular individual and marker combination
maxgeno Most probable dosage for a particular individual and marker combination
geno Most probable dosage for a particular individual and marker combination, if maxP exceeds a user-defined threshold (e.g. 0.9), otherwise NA
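As a rough illustration of the column layout described above, the following base R sketch builds a toy tetraploid data frame by hand. The sample names, marker names, probability values and the 0.9 threshold used for geno are all invented for this example; real input would normally be read from the fitPoly scores file.

```r
# Toy probabilities for 2 samples x 2 markers in a tetraploid (dosages 0..4).
# All values below are invented for illustration.
probs <- rbind(
  c(0.00, 0.02, 0.95, 0.03, 0.00),
  c(0.10, 0.45, 0.40, 0.05, 0.00),
  c(0.97, 0.03, 0.00, 0.00, 0.00),
  c(0.00, 0.00, 0.05, 0.93, 0.02)
)
colnames(probs) <- paste0("P", 0:4)

toy_df <- data.frame(
  SampleName = rep(c("ind_1", "ind_2"), each = 2),
  MarkerName = rep(c("mrk_A", "mrk_B"), times = 2),
  probs,
  stringsAsFactors = FALSE
)

# maxP: largest dosage probability per row; maxgeno: the dosage it belongs to
toy_df$maxP    <- apply(probs, 1, max)
toy_df$maxgeno <- max.col(probs, ties.method = "first") - 1L

# geno: maxgeno when maxP exceeds the chosen threshold (0.9 here), otherwise NA
toy_df$geno <- ifelse(toy_df$maxP > 0.9, toy_df$maxgeno, NA)
```

In this toy data the second row has no dosage class above 0.9, so its geno is NA while its maxgeno is still reported.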
cutoff A filtering threshold, by default 0.7, used to identify individuals for which more than a fraction alpha of the non-missing (maximum) genotype probabilities fall below this cut-off. In other words, with the default settings (cutoff = 0.7 and alpha = 0.1), you require that 90% of an individual's non-missing maximum genotype probabilities are at least 0.7 in one of the possible genotype dosage classes. This can help identify problematic individuals with many diffuse genotype calls. Lowering the threshold allows more diffuse calls to be accepted.
alpha The quantile of an individual's scores that is tested against cutoff, by default 0.1.
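To make the interaction between cutoff and alpha more concrete, here is a minimal base R sketch of the rule as described above: for each individual, take the alpha quantile of its non-missing maxP values and compare it with cutoff. The helper name flag_diffuse and the data are hypothetical and invented for this example. This is not the internal code of gp_overview, whose exact comparison and quantile method may differ.

```r
# Hypothetical helper: returns one row per individual with the alpha quantile
# of its maxP values and whether that quantile falls below the cutoff.
flag_diffuse <- function(df, cutoff = 0.7, alpha = 0.1) {
  q <- tapply(df$maxP, df$SampleName, quantile, probs = alpha, na.rm = TRUE)
  data.frame(
    SampleName = names(q),
    q_alpha    = as.numeric(q),
    diffuse    = as.numeric(q) < cutoff,
    stringsAsFactors = FALSE
  )
}

# Invented data: ind_1 has confident calls, ind_2 has many diffuse calls
toy <- data.frame(
  SampleName = rep(c("ind_1", "ind_2"), each = 10),
  maxP = c(rep(0.95, 10), rep(c(0.55, 0.98), each = 5))
)

flag_diffuse(toy)
```

Under these assumptions only ind_2 is flagged, because half of its calls have a maximum probability of 0.55, so its 10% quantile lies below the 0.7 cutoff.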
a list with the following elements:
Input data, filtered based on chosen cutoff
data.frame containing summary statistics of each individual's genotyping scores
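# Assumption: gp_overview and the example data gp_df are provided by the polymapR package
library(polymapR)
# Load the example genotype probabilities
data("gp_df")
# Summarise them with the default cutoff = 0.7 and alpha = 0.1
gp_overview(gp_df)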