read.geno reads in a genotype file for a cape analysis and formats it into a genotype object used by other functions in cape. The file can be in cape format (see read.population), a csv file, or a compressed RData file generated by saveRDS. See Details for further descriptions of the files.
read.geno(file.format = c("cape", "csv", "rdata"),
    filename = NULL, geno.col = NULL, delim = ",",
    na.strings = "-", check.chr.order = TRUE)
After running read.population and read.geno, run make.data.obj to transfer marker information from the geno.obj to the data.obj.
The cape format is described in read.population.
The csv format must contain the following:
header: A header labeling each column is required. The headers typically contain a name for each marker, for example "D15MIT80".
chromosomes: The second line of the file must contain the chromosome on which each marker is found.
marker location: The third line of the file must contain the chromosomal locations of the markers.
genotypes: Genotypes may be coded in one of three different formats: (1) As letters, for example A,H,B, indicating homozygous for allele 1, heterozygous, and homozygous for allele 2 respectively. "H" must be used for heterozygotes, but the other genotypes may be coded with any other letters. (2) As the numbers 0,1,2 indicating homozygous for allele 1, heterozygous, and homozygous for allele 2 respectively. (3) As continuous probabilities of the presence of the reference allele. An individual homozygous for allele 1 would be coded as 0, a heterozygous individual as 0.5, and an individual homozygous for allele 2 as 1. The continuous probabilities allow for uncertainty in genotyping that is not automatically available in the A,H,B or 0,1,2 encodings.
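As a rough illustration of the csv layout described above, the following base R sketch writes a small file with a header row, a chromosome row, a marker-location row, and genotype rows, then converts the letter coding to the 0, 0.5, 1 probability coding. The marker names other than D15MIT80, the chromosomes, the positions, the temporary file path, and the helper letters_to_prob are all made up for illustration and are not part of cape. The sketch contains marker columns only; any additional columns a real file may need are not covered here.

```r
# Hypothetical marker names, chromosomes, and positions (illustration only).
markers  <- c("D15MIT80", "marker2", "marker3")
chrom    <- c("15", "15", "2")
position <- c("10.5", "32.1", "48.0")

# Letter-coded genotypes, one row per individual. "-" marks a missing call,
# matching the default na.strings = "-" in the usage above.
geno <- rbind(
  c("A", "H", "B"),
  c("H", "H", "-"),
  c("B", "A", "A")
)

# Line 1: header, line 2: chromosomes, line 3: marker locations, then genotypes.
layout <- rbind(markers, chrom, position, geno)
csv.file <- tempfile(fileext = ".csv")
write.table(layout, file = csv.file, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the file back as text to inspect the layout. After the header is
# consumed, row 1 holds chromosomes and row 2 holds marker locations.
raw <- read.csv(csv.file, header = TRUE, colClasses = "character",
                na.strings = "-", check.names = FALSE)

# Hypothetical helper (not a cape function): map letter codes to the
# probability coding. "A" -> 0, "H" -> 0.5, "B" -> 1; anything else,
# including NA, becomes NA.
letters_to_prob <- function(x) {
  codes <- c(A = 0, H = 0.5, B = 1)
  unname(codes[x])
}

# Drop the chromosome and location rows, then convert column by column.
geno.rows <- raw[-(1:2), , drop = FALSE]
prob <- sapply(geno.rows, letters_to_prob)

# The 0,1,2 coding is twice the probability coding.
count <- prob * 2
```

In this sketch the letters A and B are an arbitrary choice for the two homozygous classes; only "H" is fixed by the description above.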
The rdata format follows the same layout as the csv file, but is intended for data too large to store reasonably in csv format. The file should be saved using the function saveRDS.
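A minimal base R sketch of writing and re-reading such a file follows. It assumes the saved object is a character matrix laid out like the csv file (header row, chromosome row, location row, then genotypes). That choice of object type is an assumption made for illustration, since the description above does not say which R object read.geno expects inside the file. The marker names, values, and file path are hypothetical.

```r
# Hypothetical genotype table in the same row order as the csv layout.
geno.table <- rbind(
  c("D15MIT80", "marker2"),  # header: marker names
  c("15", "2"),              # chromosomes
  c("10.5", "48.0"),         # marker locations
  c("A", "H"),               # genotypes, one row per individual
  c("H", "B")
)

# saveRDS writes a single R object; compress = TRUE (the default) gzips it.
rds.file <- tempfile(fileext = ".RData")
saveRDS(geno.table, file = rds.file, compress = TRUE)

# readRDS restores the object exactly as it was saved.
restored <- readRDS(rds.file)
```

Note that saveRDS stores one object without its name, so the file is read back with readRDS rather than load.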
See also read.population, read.pheno, make.data.obj.