ade4 (version 1.2-2)

genet: A class of data: tables of populations and alleles

Description

Genetic data come in many formats. The functions of ade4 that deal with genetic data use the class genet. An object of class genet is a list containing at least one data frame whose rows are groups of individuals (populations) and whose columns are alleles, arranged in blocks, one block per locus. The data frame contains allelic frequencies expressed as percentages.

The function char2genet reads tables crossing diploid individuals, arranged by groups (populations), with polymorphic loci. Data frames containing only character strings are transformed into tables of allelic frequencies of class genet. On input, a row is an individual, a variable is a locus and a value is a character string, for example '012028' for a heterozygote carrying alleles 012 and 028, '020020' for a homozygote carrying two copies of allele 020, and '000000' for an unclassified locus (missing data).

The function count2genet reads data frames containing allelic counts by population, with allelic forms classified by locus. The function freq2genet reads data frames containing allelic frequencies by population, with allelic forms classified by locus. In both cases, the variable names must be character strings of the form xx.yyy, where xx is the name of the locus and yyy the name of an allelic form at that locus.

Because analyses of this kind of data need compact labels, these functions store the names of the populations, the loci and the allelic forms in vectors and re-code them in a simple way, starting with P for populations, L for loci and 1, ..., m for the alleles.
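The character coding read by char2genet can be illustrated with a small base R sketch. The helper split_genotype below is hypothetical and is not part of ade4. It only assumes what the description states: a value is 6 characters long, holds two 3-character allele codes, and '000000' marks missing data.

```r
# Hypothetical helper (not an ade4 function): split one 6-character
# genotype string into its two 3-character allele codes.
split_genotype <- function(g) {
  g <- trimws(g)                       # tolerate surrounding blanks
  stopifnot(nchar(g) == 6)
  if (g == "000000") {
    return(c(NA_character_, NA_character_))  # unclassified locus
  }
  c(substr(g, 1, 3), substr(g, 4, 6))
}

split_genotype("012028")  # heterozygote: "012" "028"
split_genotype("020020")  # homozygote:   "020" "020"
split_genotype("000000")  # missing data: NA NA
```

How char2genet parses the strings internally is not documented here, so this is only a reading aid for the input format.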

Usage

char2genet(X, pop, complete)
count2genet(PopAllCount)
freq2genet(PopAllFreq)

Arguments

X
a data frame of character strings (individuals in rows, loci in columns); each value is either '000000' (missing data) or a 6-character string coding two alleles
pop
a factor with as many elements as X has rows, classifying the individuals by population
complete
a logical value indicating whether the complete output is returned, by default FALSE
PopAllCount
a data frame containing integers: the occurrences of each allelic form (column) in each population (row)
PopAllFreq
a data frame containing values between 0 and 1: the frequencies of each allelic form (column) in each population (row)
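
As a minimal sketch of the input expected by count2genet and freq2genet, the base R code below builds a small table of counts with xx.yyy column names and derives frequencies from it. The population names, locus names and counts are invented for illustration. Normalising within each locus (so that the allelic forms of one locus sum to 1 in every population) is an assumption, not something this page states.

```r
# Hypothetical counts: 2 populations (rows), 2 loci with 2 and 3 allelic forms.
# Column names follow the documented xx.yyy pattern (locus.allele).
counts <- data.frame(
  locA.a1 = c(12L, 4L),
  locA.a2 = c(8L, 16L),
  locB.b1 = c(5L, 10L),
  locB.b2 = c(5L, 6L),
  locB.b3 = c(10L, 4L),
  row.names = c("north", "south")
)

# Recover the locus of each column from the part before the first dot.
locus <- factor(sub("\\..*$", "", names(counts)))

# Assumption: frequencies are computed within each locus.
m <- as.matrix(counts)
freqs <- m
for (l in levels(locus)) {
  cols <- locus == l
  block <- m[, cols, drop = FALSE]
  freqs[, cols] <- block / rowSums(block)   # divide each row by its locus total
}
freqs <- as.data.frame(freqs)

table(locus)  # number of allelic forms per locus
freqs         # values between 0 and 1, same column names as counts
```

With real data, counts would be passed as PopAllCount and freqs as PopAllFreq.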

Value

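char2genet returns a list of class genet with:

  • $tab: a frequency table of populations (rows) and alleles (columns)
  • $center: the global frequency of each allelic form, computed over all individuals classified at each locus
  • $pop.names: a vector containing the names of the populations present in the data, re-coded P01, P02, ...
  • $all.names: a vector containing the names of the allelic forms, re-coded by locus (for example L15.1, L15.2)
  • $loc.blocks: the number of allelic forms per locus
  • $loc.fac: a factor classifying the allelic forms by locus
  • $loc.names: a vector containing the names of the loci, re-coded L01, ..., L99
  • $pop.loc: a table of populations by loci
  • $comp: the individuals re-coded as described in Details (with complete = TRUE)
  • $comp.pop: a factor giving the population of each individual in $comp (with complete = TRUE)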

Details

Because many formats for genetic data are published in the literature, a list of class genet contains at least a table of allelic frequencies and a component loc.blocks. The populations (rows) and the variables (columns) are sorted in alphabetical order. In the component comp, each individual is re-coded, for each locus with m alleles, by a vector of length m: 0,...,1,...,1,...,0 for a heterozygote and 0,...,2,...,0 for a homozygote.
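
The re-coding used in comp can be sketched in base R as follows. The helper recode_genotype is hypothetical and is not part of ade4. Returning NA for a '000000' value is an assumption, since this page does not say how missing data appear in comp.

```r
# Hypothetical helper (not an ade4 function): for one locus, count how many
# copies of each allelic form an individual carries.
recode_genotype <- function(g, alleles) {
  if (g == "000000") {
    # Assumption: a missing genotype is kept as NA for every allelic form.
    return(setNames(rep(NA_integer_, length(alleles)), alleles))
  }
  carried <- c(substr(g, 1, 3), substr(g, 4, 6))
  copies <- tabulate(match(carried, alleles), nbins = length(alleles))
  setNames(copies, alleles)
}

alleles <- c("012", "020", "028")   # the m = 3 allelic forms of this locus
recode_genotype("012028", alleles)  # heterozygote: 1 0 1
recode_genotype("020020", alleles)  # homozygote:   0 2 0
```

Every non-missing vector sums to 2, one copy per chromosome of a diploid individual.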

Examples

library(ade4)
data(casitas)
casitas[24, ]  # genotype strings of individual 24
# population of each of the 74 individuals
casitas.pop <- as.factor(rep(c("dome", "cast", "musc", "casi"), c(24, 11, 9, 30)))
casi.genet <- char2genet(casitas, casitas.pop, complete = TRUE)
names(casi.genet$tab)
casi.genet$tab[, 1:8]
casi.genet$pop.names
casi.genet$loc.names
casi.genet$all.names
casi.genet$loc.blocks # number of allelic forms by loci
casi.genet$loc.fac # factor classifying the allelic forms by locus
casi.genet$pop.loc # table populations loci
names(casi.genet$comp)
casi.genet$comp[1:4, ]
casi.genet$comp.pop
casi.genet$center
apply(casi.genet$tab, 2, mean)  # unweighted mean over populations, to compare with $center
casi.genet$pop.loc[, "L15"]
casi.genet$tab[, c("L15.1", "L15.2")]
class(casi.genet)
# correspondence analysis of the individual typing, plotted by population
casitas.coa <- dudi.coa(casi.genet$comp, scannf = FALSE)
s.class(casitas.coa$li, casi.genet$comp.pop)
