
dominant.to.codominant(domdata, colinfo = NULL,
    samples = dimnames(domdata)[[1]], missing = -9,
    allelepresent = 1, split = ".")
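domdata: A matrix or two-dimensional array of dominant (allele presence/absence) data, with samples in the first dimension and alleles in the second. The symbol given in allelepresent indicates that a sample has an allele.
colinfo: An optional data frame describing the columns of domdata, containing locus names as the first column and allele numbers as the second column.
samples: The samples from domdata that are to be used.
missing: The symbol used to indicate missing data in the output.
allelepresent: The symbol used in domdata to indicate that a particular sample has a particular allele.
split: If colinfo=NULL, the character used to separate the locus name and allele number in the column names of domdata.
dominant.to.codominant is written to convert dominant (allele presence/absence) data back to a semi-codominant format so that
other analyses or data conversion can be performed.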
The default symbol to indicate the presence of an allele is 1, but
this can be set to any other symbol using the allelepresent
argument. It does not matter which symbols are used to indicate that
an allele is absent or that there is missing data. If
dominant.to.codominant
does not find any alleles present for a
given sample and locus, it fills in a missing data symbol in that
position in the two-dimensional genotype list.
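The behavior described above can be illustrated with a minimal base R sketch. The function sketch_convert and its loci and alleles arguments are hypothetical stand-ins written for this illustration; this is not the implementation of dominant.to.codominant, only one way the described logic could look.

```r
# Hypothetical illustration of the logic described above; not the
# package's own code. 'loci' and 'alleles' give the locus name and
# allele number for each column of 'domdata'.
sketch_convert <- function(domdata, loci, alleles,
                           missing = -9, allelepresent = 1) {
  locnames <- unique(loci)
  # Two-dimensional list, pre-filled with the missing data symbol
  result <- array(list(missing),
                  dim = c(nrow(domdata), length(locnames)),
                  dimnames = list(rownames(domdata), locnames))
  for (s in rownames(domdata)) {
    for (L in locnames) {
      # Alleles at this locus marked as present for this sample
      present <- alleles[which(loci == L & domdata[s, ] == allelepresent)]
      if (length(present) > 0) result[[s, L]] <- present
    }
  }
  result
}

dom <- matrix(c( 1,  1,  0, 0, 1, 1,
                -9, -9, -9, 1, 0, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("ind1", "ind2"), NULL))
loci <- c("loc1", "loc1", "loc1", "loc2", "loc2", "loc2")
alleles <- c(100, 102, 104, 141, 144, 147)
geno <- sketch_convert(dom, loci, alleles)
geno[["ind1", "loc1"]]
geno[["ind2", "loc1"]]
```

Because only the allelepresent symbol is tested, the 0 and -9 entries are treated the same way: neither contributes an allele, and a locus with no alleles present keeps the missing data symbol.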
This function does not read or write files. Since the user would
already have dominant data in an array-like format in a spreadsheet or
text document, it should be easily read by read.table and
converted to a matrix by as.matrix.
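As a rough sketch of that workflow, the following reads a small table and converts it to a matrix. The table contents and the object names (rawtext, domtable, dommat) are made up for illustration, and the text argument of read.table stands in for a real file path.

```r
# Hypothetical table contents; in practice these would be in a file
rawtext <- "sample loc1.100 loc1.102 loc2.141 loc2.144
ind1 1 0 0 1
ind2 1 1 1 0"

# The first column holds the sample names, so use it for row names
domtable <- read.table(text = rawtext, header = TRUE, row.names = 1)

# Convert the data frame to a matrix with named dimensions
dommat <- as.matrix(domtable)
dimnames(dommat)
```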
There are two options for indicating which locus and allele is
represented by each column:
1) These can be specified in the second
dimension names of the array or matrix. The name of each column
should be a concatenation of the locus name followed by the allele
number, and these should be separated by a period or other character
as specified in split (e.g. "loc1.100" with split = "."). Note that with
check.names=TRUE (the default), read.table will convert many symbols (such as
hyphens or spaces) to periods. It is a good idea to inspect
the column names of domdata before setting split.
2) Create a data frame containing locus and allele information. The
rows should be in the same order as the columns of domdata. The
first vector in the data frame should contain the locus names, and the
second vector in the data frame should contain the numerical alleles.
Use this data frame as colinfo.
See also: codominant.to.dominant, read.table, as.matrix
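The two column-labeling options can be related with a short base R sketch. The column names here are invented, and the use of make.names and strsplit is an assumption made for illustration: make.names is what read.table applies when check.names=TRUE, while the strsplit step is only one way to separate locus and allele, not necessarily how dominant.to.codominant parses names internally.

```r
# Hypothetical column names as they might appear in a file
rawnames <- c("loc1-100", "loc1-102", "loc2-141")

# read.table(check.names = TRUE) adjusts names via make.names,
# which turns the hyphens into periods
checked <- make.names(rawnames)
checked

# Split each name at the period (fixed = TRUE treats "." literally)
parts <- strsplit(checked, split = ".", fixed = TRUE)
loci <- sapply(parts, `[`, 1)
alleles <- as.numeric(sapply(parts, `[`, 2))

# The same information arranged as a locus and allele index,
# in the same order as the columns
colindex <- data.frame(Loci = loci, Alleles = alleles)
colindex
```

Inspecting the checked names first shows which separator survives import, which is the value to pass as split.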
# Create a matrix of dominant data (usually read from a file instead)
mysamples <- c("ind1","ind2","ind3")
myalleles <- c("loc1.100","loc1.102","loc1.104","loc1.106",
               "loc2.141","loc2.144","loc2.147","loc2.150")
mydomdata <- matrix(nrow = length(mysamples), ncol = length(myalleles),
                    dimnames = list(mysamples, myalleles))
mydomdata["ind1",] <- c(1,1,1,0,0,1,1,0)
mydomdata["ind2",] <- c(1,0,0,1,0,0,1,1)
mydomdata["ind3",] <- c(-9,-9,-9,-9,1,1,0,1)
# inspect the matrix
mydomdata
# convert to codominant data
mycodomdata <- dominant.to.codominant(mydomdata)
# view the list created
mycodomdata
# view genotypes by individual
mycodomdata["ind1",]
mycodomdata["ind2",]
mycodomdata["ind3",]
# Alternatively, use a matrix without alleles labeled in the column names
dimnames(mydomdata)[[2]] <- NULL
mydomdata
# Make a data frame for a locus and allele index
# (Under normal circumstances you would read this from a file)
laindex <- data.frame(Loci = c(rep("loc1",4), rep("loc2",4)),
                      Alleles = c(100, 102, 104, 106, 141, 144, 147, 150))
laindex
# convert to codominant data
mycodomdata2 <- dominant.to.codominant(mydomdata, colinfo=laindex)
# look at the results
mycodomdata2["ind1",]
# etc.