polysat (version 0.1)

calcFst: Calculate Wright's Pairwise FST

Description

Given a data frame of allele frequencies and population sizes, calcFst calculates a matrix of pairwise Fst values.

Usage

calcFst(freqs, pops = row.names(freqs), loci = unique(as.matrix(
as.data.frame(strsplit(names(freqs), split = ".", fixed = TRUE),
stringsAsFactors = FALSE))[1, ]))

Arguments

freqs
A data frame of allele frequencies and population sizes such as that produced by estimate.freq. Each population is in one row, and a column called Genomes contains the relative size of each population. All other columns cont
pops
A character vector. Populations to analyze, which should be a subset of row.names(freqs).
loci
A character vector indicating which loci to analyze. These should be a subset of the locus names as used in the column names of freqs.
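The default for loci in the Usage section splits the column names of freqs on a period and keeps the first piece of each. A minimal sketch of that idea follows, assuming allele columns are named as locus name, period, allele name; the column names below are hypothetical, and the Genomes column is dropped by hand here, which may differ from what calcFst does internally.

```r
# Hypothetical column names of a frequency table (assumed "locus.allele" format)
cols <- c("Genomes", "loc1.206", "loc1.208", "loc2.130", "loc2.134")

# Drop the population-size column so only allele columns remain
allele_cols <- setdiff(cols, "Genomes")

# Split each name on a literal period and keep the part before it
pieces <- strsplit(allele_cols, split = ".", fixed = TRUE)
locus_names <- unique(vapply(pieces, function(x) x[1], character(1)))

locus_names
```

A subset of these names can then be passed as the loci argument to restrict the analysis.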

Value

  • A square matrix containing FST values. The rows and columns of the matrix are both named by population.

Details

calcFst works by calculating HS and HT for each locus for each pair of populations, then averaging HS and HT across loci. FST is then calculated for each pair of populations as (HT-HS)/HT. H values (expected heterozygosities for populations and combined populations) are calculated as one minus the sum of all squared allele frequencies at a locus. To calculate HT, allele frequencies are averaged between the two populations before H is calculated. To calculate HS, H is calculated for each population separately and the two H values are then averaged. In both cases, the averages are weighted by the relative sizes of the two populations (as indicated by freqs$Genomes).
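The calculation described above can be sketched for a single pair of populations as follows. This is an illustration of the stated formulas, not the package's source code; the helper names het and pairwise_fst, the allele frequencies, and the population sizes are all hypothetical.

```r
# Expected heterozygosity: one minus the sum of squared allele frequencies
het <- function(p) 1 - sum(p^2)

# freq1, freq2: lists with one allele frequency vector per locus
# n1, n2: relative population sizes (the role played by freqs$Genomes)
pairwise_fst <- function(freq1, freq2, n1, n2) {
  w1 <- n1 / (n1 + n2)
  w2 <- n2 / (n1 + n2)
  # HS per locus: H within each population, then the weighted average
  hs <- mapply(function(p, q) w1 * het(p) + w2 * het(q), freq1, freq2)
  # HT per locus: weighted average of allele frequencies, then H
  ht <- mapply(function(p, q) het(w1 * p + w2 * q), freq1, freq2)
  # Average HS and HT across loci before taking the ratio
  (mean(ht) - mean(hs)) / mean(ht)
}

# Hypothetical frequencies for two populations at two loci
pop1 <- list(loc1 = c(0.5, 0.5, 0.0), loc2 = c(0.2, 0.8))
pop2 <- list(loc1 = c(0.0, 0.5, 0.5), loc2 = c(0.6, 0.4))

pairwise_fst(pop1, pop2, n1 = 10, n2 = 30)
```

Two populations with identical allele frequencies give HT equal to HS at every locus, so the sketch returns zero for them.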

References

Nei, M. (1973) Analysis of Gene Diversity in Subdivided Populations. Proceedings of the National Academy of Sciences of the United States of America 70, 3321-3323.

See Also

estimate.freq

Examples

# create a data set (typically done by reading files)
mygenotypes <- array(list(-9), dim = c(6, 2),
                     dimnames = list(paste("ind", 1:6, sep = ""),
                                     c("loc1", "loc2")))
mygenotypes[, "loc1"] <- list(c(206), c(208, 210), c(204, 206, 210),
                              c(196, 198, 202, 208), c(196, 200),
                              c(198, 200, 202, 204))
mygenotypes[, "loc2"] <- list(c(130, 134), c(138, 140), c(130, 136, 140),
                              c(138), c(136, 140), c(130, 132, 136))

mypopinfo <- c(1,1,1,2,2,2)
names(mypopinfo) <- dimnames(mygenotypes)[[1]]

myploidies <- c(2,2,4,4,2,4)
names(myploidies) <- dimnames(mygenotypes)[[1]]

# calculate allele frequencies
myfreq <- estimate.freq(mygenotypes, popinfo = mypopinfo,
                        indploidies = myploidies)

# calculate pairwise FST
myfst <- calcFst(myfreq)

# examine the results
myfst