Estimates population genetics parameters from genomic data. The same parameters can also be estimated within subpopulations.
popgen(M, subgroups, plot = FALSE)
M: Object of class matrix. A (non-empty) matrix of molecular markers, coded as the count of reference alleles per locus (0, 1 or 2). Markers must be in columns and individuals in rows. Missing data should be coded as NA.
subgroups: A vector with information for subgroups or subpopulations.
plot: If TRUE, a graphical output is produced. See details.
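As a minimal sketch of the input layout described above, the following builds a small toy marker matrix and a matching subgroup vector in base R. The object names (`M_toy`, `groups_toy`) and all values are illustrative assumptions, not data shipped with the package.

```r
# Toy data for illustration only: 4 individuals (rows) x 3 markers (columns),
# coded as the count of reference alleles (0, 1 or 2); NA marks missing data.
M_toy <- matrix(c(0, 1, 2,
                  1, 1, NA,
                  2, 0, 1,
                  2, 2, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("ind", 1:4), paste0("snp", 1:3)))

# One subgroup label per individual, in the same order as the rows of M_toy.
groups_toy <- c("A", "A", "B", "B")

# Basic sanity checks on the layout described above.
stopifnot(is.matrix(M_toy),
          all(M_toy %in% c(0, 1, 2, NA)),
          length(groups_toy) == nrow(M_toy))
```

Objects shaped like these could then be passed as the `M` and `subgroups` arguments.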
A two-level list is returned (whole and bygroup): one level with general information for markers and individuals, and another with the same information by subgroup (if applicable).
For whole, a list containing the following parameter estimates:
For each marker, it presents the allele frequencies (\(p\) and \(q\)), Minor Allele Frequency (\(MAF\)), expected heterozygosity (\(H_e\)), observed heterozygosity (\(H_o\)), Nei's Genetic Diversity (\(GD\)), Polymorphism Information Content (\(PIC\)), proportion of missing data (\(Miss\)), and the \(\chi^2\) statistic for the Hardy-Weinberg equilibrium test with its p-value.
For each genotype, it presents observed heterozygosity (\(H_o\)) and the coefficient of inbreeding (\(F_i\)), estimated as the excess of homozygotes relative to the expectation (Keller et al., 2011).
For the population as a whole, the same parameters as those for markers, except PIC, are returned along with their lower and upper bounds.
It also shows estimates of effective population size (\(Ne\)), additive (\(Va\)) and dominance (\(Vd\)) variance components, and a summary of the number of groups, genotypes and markers.
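The per-marker and per-genotype quantities listed above can be sketched in base R using their textbook definitions for biallelic markers. This is an illustration under stated assumptions, not the package's own implementation: the toy matrix, the helper names (`marker_stats`, `inbreeding`) and the particular excess-homozygosity formula are all assumptions made for the example.

```r
# Toy matrix (assumed data): 6 individuals x 2 markers, reference-allele counts.
M_toy <- matrix(c(2, 0,
                  2, 1,
                  1, 1,
                  1, 2,
                  0, 2,
                  NA, 2),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("snp1", "snp2")))

# Per-marker summary using textbook formulas for a biallelic locus.
marker_stats <- function(x) {
  obs <- x[!is.na(x)]          # drop missing calls
  p <- mean(obs) / 2           # reference allele frequency
  q <- 1 - p
  c(p = p, q = q,
    MAF = min(p, q),
    He = 2 * p * q,                          # expected heterozygosity
    Ho = mean(obs == 1),                     # observed heterozygosity
    PIC = 1 - (p^2 + q^2) - 2 * p^2 * q^2,   # polymorphism information content
    Miss = mean(is.na(x)))                   # proportion of missing calls
}
stats_toy <- t(apply(M_toy, 2, marker_stats))

# Per-genotype inbreeding as excess homozygosity: (O - E) / (m - E), where O is
# the observed and E the expected number of homozygous loci among the m
# non-missing loci. This is one common formulation, assumed here.
p_hat <- colMeans(M_toy, na.rm = TRUE) / 2
exp_hom <- 1 - 2 * p_hat * (1 - p_hat)
inbreeding <- function(g) {
  ok <- !is.na(g)
  O <- sum(g[ok] != 1)
  E <- sum(exp_hom[ok])
  (O - E) / (sum(ok) - E)
}
F_toy <- apply(M_toy, 1, inbreeding)
```

With only two markers the inbreeding values are extreme by construction; real data sets with many markers give far more stable estimates.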
When subgroups are provided, the same population parameters are estimated for each subpopulation, together with its exclusive and fixed alleles. In addition, a list with the F-statistics (F_IT, F_IS and F_ST) for genotypes and markers is returned. For genotypes, the statistics are given both across all subpopulations and for each pair of subpopulations; for markers, they are given only across all subpopulations.
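To make the relationship between the three F-statistics concrete, here is a minimal base R sketch for a single marker and two equally sized subgroups, using the simple heterozygosity-based (Wright-style) definitions. It is deliberately simplified and is not the Weir and Cockerham (1984) estimator listed in the references; the data and variable names are assumptions for illustration.

```r
# Toy single-marker genotypes (assumed data), coded 0/1/2, with subgroup labels.
g   <- c(2, 2, 1, 2, 0, 0, 1, 0)
grp <- c("A", "A", "A", "A", "B", "B", "B", "B")

p_sub <- tapply(g, grp, mean) / 2        # allele frequency within each subgroup
H_I <- mean(tapply(g == 1, grp, mean))   # mean observed heterozygosity
H_S <- mean(2 * p_sub * (1 - p_sub))     # mean expected heterozygosity within subgroups
p_tot <- mean(g) / 2                     # allele frequency in the pooled sample
H_T <- 2 * p_tot * (1 - p_tot)           # expected heterozygosity in the pooled sample

F_IS <- 1 - H_I / H_S   # individuals relative to their subgroup
F_ST <- 1 - H_S / H_T   # subgroups relative to the total
F_IT <- 1 - H_I / H_T   # individuals relative to the total
```

Under these definitions the identity (1 - F_IT) = (1 - F_IS)(1 - F_ST) holds exactly. The unweighted means used here are only appropriate because the two toy subgroups have the same size.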
The plot shows histograms of the estimates of MAF, GD, PIC and He for the whole population and, when available, for each subpopulation. A heat map of the pairwise F_ST between subpopulations is also produced.
The number of subgroups is defined by the user, and any data type (character, integer, ...) may be used to label subpopulations.
The marker matrix and the subgroup vector must list the genotypes in the same row order.
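One way to guarantee matching order, sketched here in base R, is to keep the labels in a named vector and index it by the row names of the marker matrix. The objects `M_toy`, `labels_toy` and `subgroups_toy` are hypothetical and exist only for this example.

```r
# Toy inputs (assumed): a marker matrix with row names and a named label vector
# whose order differs from the rows of the matrix.
M_toy <- matrix(c(0, 1, 2, 2, 1, 0), nrow = 3,
                dimnames = list(c("g1", "g2", "g3"), c("snp1", "snp2")))
labels_toy <- c(g3 = "B", g1 = "A", g2 = "A")

# Reorder the labels so that element i describes row i of the matrix.
subgroups_toy <- unname(labels_toy[rownames(M_toy)])

# Fail early if any genotype has no label or the lengths disagree.
stopifnot(!anyNA(subgroups_toy),
          length(subgroups_toy) == nrow(M_toy))
```

Indexing by name turns a silent misalignment into an explicit error whenever a genotype is missing from the label vector.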
Weir B.S., Cockerham C.C. (1984) Estimating F-statistics for the analysis of population structure. Evolution 38:1358-1370. doi:10.2307/2408641
Keller M.C., Visscher P.M., Goddard M.E. (2011) Quantification of inbreeding due to distant ancestors and its detection using dense single nucleotide polymorphism data. Genetics 189:237-249. doi:10.1534/genetics.111.130922
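# NOT RUN {
library(snpReady)
# hybrid maize data
data(maize.hyb)
# parameters for the whole population
x <- popgen(maize.hyb)
# using subpopulations: first 25 genotypes in group 1, next 25 in group 2
PS <- c(rep(1, 25), rep(2, 25))
x <- popgen(maize.hyb, subgroups = PS)
# }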