"snp.matrix"
read.snps.long(files, sample.id = NULL, snp.id = NULL, female = NULL,
               fields = c(sample = 1, snp = 2, genotype = 3, confidence = 4),
               codes = c("0", "1", "2"), threshold = 0.9, lower = TRUE,
               sep = " ", comment = "#", skip = 0, simplify = c(FALSE, FALSE),
               verbose = FALSE, every = 1000)
female: for the samples listed in sample.id, this should specify
  whether each sample was from a female subject.
fields: the fields are identified by the names sample and snp for the
  sample and SNP identifier fields, confidence for a call confidence
  score (if present) and either genotype if genotype calls occur as a
  single field, or allele1 and allele2 if the two alleles are coded in
  different fields.
codes: either "nucleotide", denoting coding in terms of nucleotides
  (A, C, G or T, case insensitive), or a character vector giving
  genotype or allele codes (see below).
lower: if TRUE, then threshold represents a lower bound. Otherwise it
  is an upper bound.
simplify: if TRUE, sample and SNP identifying strings will be shortened
  by removal of any common leading or trailing sequences when they are
  used as row and column names of the output snp.matrix.
verbose: if TRUE, a progress report is generated after each block of
  lines read, with the block size given by the every argument.
every: see verbose.
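The lower and simplify arguments described above can be illustrated with a short base R sketch. The helpers call_accepted and strip_common are hypothetical and are not part of the package; they only mimic the documented behaviour. Two points are assumptions: that a score exactly equal to threshold is accepted, and that shared text is stripped character by character.

```r
# Hypothetical helpers for illustration only; not the package's own code.

# lower = TRUE : threshold is a lower bound on the confidence score.
# lower = FALSE: threshold is an upper bound.
# Assumption: a score exactly at the threshold is accepted.
call_accepted <- function(confidence, threshold = 0.9, lower = TRUE) {
  if (lower) confidence >= threshold else confidence <= threshold
}

# simplify = TRUE: drop leading and trailing character sequences that are
# shared by every identifier.
strip_common <- function(ids) {
  if (length(ids) < 2) return(ids)
  # Number of leading characters common to all strings in x
  shared_prefix_len <- function(x) {
    n <- min(nchar(x))
    k <- 0
    while (k < n && length(unique(substr(x, k + 1, k + 1))) == 1) k <- k + 1
    k
  }
  # Reverse each string so that a shared suffix becomes a shared prefix
  reverse <- function(x) {
    vapply(strsplit(x, ""), function(ch) paste(rev(ch), collapse = ""),
           character(1))
  }
  ids <- substring(ids, shared_prefix_len(ids) + 1)
  substr(ids, 1, nchar(ids) - shared_prefix_len(reverse(ids)))
}

accepted_lower <- call_accepted(c(0.95, 0.80))                 # TRUE FALSE
accepted_upper <- call_accepted(c(0.95, 0.80), lower = FALSE)  # FALSE TRUE
short_ids <- strip_common(c("sample_A01.txt", "sample_B02.txt"))  # "A01" "B02"
```

With file names used as identifiers, this kind of stripping leaves only the part that distinguishes one sample or SNP from another.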
The value is an object of class "snp.matrix" or "X.snp.matrix".

When genotype or allele codes are supplied, the codes argument
should be a character array giving the valid codes.
For genotype coding of autosomal SNPs, this should be
an array of length 3 giving the codes
for the three genotypes, in the order homozygous (AA), heterozygous (AB),
homozygous (BB). All other codes will be treated
as "no call". The default codes are "0", "1", "2".
For X SNPs, males are assumed to be coded as homozygous,
unless an additional two codes are supplied (representing the
AY and BY genotypes). For allele coding, the codes
array should be of length 2 and should specify the codes
for the two alleles. Again, any other code is treated as
"missing" and, for X SNPs, males should be coded either as
homozygous or by omission of the second allele.

Although the function allows for reading of data for the X chromosome
directly into an object of class "X.snp.matrix",
it will often be preferable to read such data as a "snp.matrix"
(i.e. as autosomal) and to coerce it to an object of type
"X.snp.matrix" later using as(..., "X.snp.matrix") or
new("X.snp.matrix", ..., female=...).
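The genotype and allele coding rules described above can be sketched in base R for the autosomal case. The helpers decode_genotype and decode_alleles are hypothetical, and the integer representation used here (0 for no call, 1, 2, 3 for AA, AB, BB) is an assumption made for illustration, not a statement about how the package stores its data.

```r
# Hypothetical helpers for illustration only; not the package's own code.

# Genotype coding: three codes, in the order AA, AB, BB.
# Any code not listed is treated as "no call" (0 here, by assumption).
decode_genotype <- function(raw, codes = c("0", "1", "2")) {
  stopifnot(length(codes) == 3)
  idx <- match(raw, codes)        # NA for any code that is not listed
  ifelse(is.na(idx), 0L, idx)
}

# Allele coding: two codes, one per allele.
# A genotype is called only when both alleles carry a valid code.
decode_alleles <- function(allele1, allele2, codes) {
  stopifnot(length(codes) == 2)
  i1 <- match(allele1, codes)
  i2 <- match(allele2, codes)
  g <- i1 + i2 - 1L               # 1 = AA, 2 = AB, 3 = BB
  ifelse(is.na(g), 0L, g)
}

by_genotype <- decode_genotype(c("0", "2", "1", "N", "-"))
by_allele <- decode_alleles(c("A", "A", "G", "?"),
                            c("A", "G", "G", "A"),
                            codes = c("A", "G"))
```

In both helpers an unrecognised code falls through to the no-call value, which mirrors the rule that any code outside the supplied set is treated as missing.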
The vectors sample.id and snp.id must be in the same
order as they vary on the input file(s) and this ordering must be
consistent. However, there is
no requirement that either SNP or sample should vary fastest; this is
detected from the input.
Each file may represent a separate sample or SNP, in which case the
appropriate .id argument can be omitted and row or column names
taken from the file names.
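The ordering requirement can be made concrete with a small base R sketch. The records data frame and the fastest_field helper are hypothetical. How the function itself detects the fastest-varying field is not stated here, so comparing the first two records is only one plausible approach.

```r
# Hypothetical long-format records (sample, snp), in file order.
records <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2", "s2"),
  snp    = c("rs1", "rs2", "rs3", "rs1", "rs2", "rs3"),
  stringsAsFactors = FALSE
)

# Assumption: the field that changes between the first two records is the
# one that varies fastest.
fastest_field <- function(rec) {
  stopifnot(nrow(rec) >= 2)
  if (rec$snp[1] != rec$snp[2]) "snp" else "sample"
}

# Identifier vectors listed in the order in which they first appear on the
# input, which is the order the sample.id and snp.id arguments must follow.
sample.id <- unique(records$sample)
snp.id <- unique(records$snp)
fastest <- fastest_field(records)
```

Here SNP varies fastest; a file sorted by SNP and then by sample would be equally acceptable as long as sample.id and snp.id follow the order seen in the input.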
read.HapMap.data, read.snps.pedfile, read.snps.chiamo, read.snps.long,
snp.matrix-class, X.snp.matrix-class
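A minimal sketch of a call, using only the arguments shown in the usage above. The file contents and identifiers are invented for illustration. The call is attempted only if a function named read.snps.long is already available in the session (that is, the package providing it has been attached); otherwise the example only writes the input file.

```r
# Write a small long-format file with one call per line:
# sample, snp, genotype, confidence (invented data).
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "s1 rs1 0 0.99",
  "s1 rs2 1 0.95",
  "s2 rs1 2 0.50",   # confidence below the 0.9 threshold
  "s2 rs2 1 0.97"
), tmp)

if (exists("read.snps.long", mode = "function")) {
  geno <- read.snps.long(
    tmp,
    sample.id = c("s1", "s2"),   # in the order they vary on the file
    snp.id = c("rs1", "rs2"),
    fields = c(sample = 1, snp = 2, genotype = 3, confidence = 4),
    codes = c("0", "1", "2"),
    threshold = 0.9, lower = TRUE,
    sep = " "
  )
  print(geno)
} else {
  message("read.snps.long() is not available; only the input file was written")
}
```

SNP varies fastest in this file, and the identifier vectors are given in the order in which they appear on the input.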