poppr: Produce a basic summary table for population genetic analyses.

Description

This function allows the user to quickly view indices of heterozygosity, evenness, and inbreeding to help decide how to further analyze a specified dataset. It natively takes genind and genclone objects, but can convert any raw data format that adegenet can read (fstat, structure, genetix, and genpop) as well as genalex files exported in csv format (see read.genalex for details).

Usage

poppr(dat, total = TRUE, sublist = "ALL", blacklist = NULL, sample = 0,
  method = 1, missing = "ignore", cutoff = 0.05, quiet = FALSE,
  clonecorrect = FALSE, hier = 1, dfname = "population_hierarchy",
  keep = 1, hist = TRUE, minsamp = 10, legend = FALSE)

Arguments

dat

a genind object OR a genclone object OR any fstat, structure, genetix, genpop, or genalex formatted file.

total

When TRUE (default), indices will be calculated for the pooled populations.

sublist

a list of character strings or integers indicating the specific populations to analyze, given by name (located in $pop.names within the genind object). Defaults to "ALL".

blacklist

a list of character strings or integers to indicate specific populations to be removed from analysis. Defaults to NULL.

sample

an integer indicating the number of permutations desired to obtain p-values. Sampling will shuffle genotypes at each locus to simulate a panmictic population using the observed genotypes. Calculating the p-value includes the observed statistics, so the smallest attainable p-value is 1/(sample + 1); for example, sample = 999 gives a minimum p-value of 0.001.

method

an integer from 1 to 4 indicating the method of sampling desired. See shufflepop for details.

missing

How should missing data be treated? "zero" and "mean" will set the missing values to those documented in na.replace. "loci" and "geno" will remove loci or genotypes with missing data, respectively (see missingno for details).

cutoff

numeric. A number from 0 to 1 indicating the proportion of missing data allowed for analysis. This is to be used in conjunction with the missing argument (see missingno for details).

quiet

FALSE (default) will display a progress bar for each population analyzed.

clonecorrect

Defaults to FALSE. Must be used with the hier and dfname parameters, or the user may get undesired results. See clonecorrect for details.

hier

For genclone objects: a formula indicating the hierarchical levels to be used. The hierarchies should be present in the hierarchy slot. See sethierarchy for details.

dfname

a character string. (Only for genind objects.) This is the name of the data frame or hierarchy containing the vectors of the population hierarchy within the other slot of the genind object.

keep

an integer. This indicates the levels of the population
  hierarchy you wish to keep after clone correcting your data sets. To
  combine the hierarchy, just set keep from 1 to the length of your
  hierarchy. see

hist

logical. If TRUE (default) and sample > 0,
  a histogram will be produced for each population.

minsamp

an integer indicating the minimum number of individuals
  to resample for rarefaction analysis. See rarefy for
  details.

legend

logical. When this is set to TRUE, a legend
  describing the resulting table columns will be printed. Defaults to
  FALSE.
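
The way the sample argument bounds the p-value can be sketched in base R. This is only an illustration, not poppr's internal code: the helper name perm_p_value and the stand-in values are hypothetical, and the p-value is assumed to be the proportion of statistics (the observed one included) that are at least as large as the observed statistic.

```r
# Hypothetical helper: one-sided permutation p-value that counts the
# observed statistic among the values (an assumption for illustration).
perm_p_value <- function(observed, permuted) {
  (sum(permuted >= observed) + 1) / (length(permuted) + 1)
}

# Stand-in for the statistics from sample = 999 permutations.
permuted <- seq(0, 1, length.out = 999)

# The observed value exceeds every permuted value, so the p-value sits
# at its floor of 1 / (999 + 1).
perm_p_value(2, permuted)
```

Under these assumptions, choosing sample one short of a round number (99, 999, ...) gives a round minimum p-value.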

Value

Pop

A vector indicating the population factor.

N

An integer vector indicating the number of individuals/isolates in the specified population.

MLG

An integer vector indicating the number of multilocus genotypes found in the specified population (see mlg).

eMLG

The expected number of MLG at the lowest common sample size (set by the parameter minsamp).

SE

The standard error for the rarefaction analysis.

H

Shannon-Wiener Diversity index.

G

Stoddart and Taylor's Index.

Hexp

Expected heterozygosity or Nei's 1987 genotypic diversity corrected for sample size.

E.5

Evenness.

Ia

A numeric vector giving the value of the Index of Association for each population factor (see ia).

p.Ia

A numeric vector indicating the p-value for Ia from the number of reshufflings indicated in sample. Lowest value is 1/n where n is the number of observed values.

rbarD

A numeric vector giving the value of the Standardized Index of Association for each population factor (see ia).

p.rD

A numeric vector indicating the p-value for rbarD from the number of reshuffles indicated in sample. Lowest value is 1/n where n is the number of observed values.

File

A vector indicating the name of the original data file.
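
The genotypic diversity columns can be reproduced by hand from a vector of MLG counts. The following base R sketch is not poppr's internal code: the counts and the rarefaction size n are hypothetical, and the formulas are the commonly cited definitions (Shannon-Wiener with natural logarithms, Stoddart and Taylor's G as the inverse sum of squared frequencies, the sample-size corrected diversity for Hexp, E.5 as (G - 1)/(exp(H) - 1), and Hurlbert's rarefaction formula for eMLG), which are assumed here to match the table.

```r
# Hypothetical data: individuals observed per multilocus genotype (MLG)
# in a single population.
mlg_counts <- c(4, 3, 2, 1)

N   <- sum(mlg_counts)      # number of individuals
MLG <- length(mlg_counts)   # number of distinct MLGs
p   <- mlg_counts / N       # MLG frequencies

H    <- -sum(p * log(p))                # Shannon-Wiener index
G    <- 1 / sum(p^2)                    # Stoddart and Taylor's index
Hexp <- (N / (N - 1)) * (1 - sum(p^2))  # diversity corrected for sample size
E.5  <- (G - 1) / (exp(H) - 1)          # evenness

# Expected number of MLGs in a subsample of n individuals
# (Hurlbert's rarefaction); n = 5 is an arbitrary illustrative value.
n    <- 5
eMLG <- sum(1 - choose(N - mlg_counts, n) / choose(N, n))

round(c(N = N, MLG = MLG, eMLG = eMLG, H = H, G = G, Hexp = Hexp, E.5 = E.5), 3)
```

With perfectly even counts, G equals the number of MLGs and E.5 is undefined only when a single MLG is present (both numerator and denominator are zero).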

References

Paul-Michael Agapow and Austin Burt. Indices of multilocus linkage disequilibrium. Molecular Ecology Notes, 1(1-2):101-102, 2001.

A.H.D. Brown, M.W. Feldman, and E. Nevo. Multilocus structure of natural populations of Hordeum spontaneum. Genetics, 96(2):523-536, 1980.

Niklaus J. Grünwald, Stephen B. Goodwin, Michael G. Milgroom, and William E. Fry. Analysis of genotypic diversity data for populations of microorganisms. Phytopathology, 93(6):738-46, 2003.

Bernhard Haubold and Richard R. Hudson. Lian 3.0: detecting linkage disequilibrium in multilocus data. Bioinformatics, 16(9):847-849, 2000.

Kenneth L. Heck Jr., Gerald van Belle, and Daniel Simberloff. Explicit calculation of the rarefaction diversity measurement and the determination of sufficient sample size. Ecology, 56(6):1459-1461, 1975.

S.H. Hurlbert. The nonconcept of species diversity: a critique and alternative parameters. Ecology, 52(4):577-586, 1971.

J.A. Ludwig and J.F. Reynolds. Statistical Ecology. A Primer on Methods and Computing. New York USA: John Wiley and Sons, 1988.

Masatoshi Nei. Estimation of average heterozygosity and genetic distance from a small number of individuals. Genetics, 89(3):583-590, 1978.

Jari Oksanen, F. Guillaume Blanchet, Roeland Kindt, Pierre Legendre, Peter R. Minchin, R. B. O'Hara, Gavin L. Simpson, Peter Solymos, M. Henry H. Stevens, and Helene Wagner. vegan: Community Ecology Package, 2012. R package version 2.0-5.

E.C. Pielou. Ecological Diversity. Wiley, 1975.

Claude Elwood Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379-423, 623-656, 1948.

J.M. Smith, N.H. Smith, M. O'Rourke, and B.G. Spratt. How clonal are bacteria? Proceedings of the National Academy of Sciences, 90(10):4384-4388, 1993.

J.A. Stoddart and J.F. Taylor. Genotypic diversity: estimation and prediction in samples. Genetics, 118(4):705-11, 1988.

See Also

clonecorrect, poppr.all, ia, missingno, mlg

Examples

library(poppr)
data(nancycats)
poppr(nancycats)

poppr(nancycats, sample=99, total=FALSE, quiet=FALSE)

# Note: this is a larger data set that could take a couple of minutes to run
# on slower computers.
data(H3N2)
poppr(H3N2, total=FALSE, sublist=c("Austria", "China", "USA"),
      clonecorrect=TRUE, hier="country", dfname="x")