read_geno_csv: Data Input in CSV format

Description

Reads an external comma-separated values (CSV) data file. The format of the file is described in the Details section. This function creates an object of class mappoly.data.

Usage

read_geno_csv(
  file.in,
  ploidy,
  filter.non.conforming = TRUE,
  elim.redundant = TRUE,
  verbose = TRUE
)

Value

An object of class mappoly.data which contains a list with the following components:

ploidy: ploidy level
n.ind: number of individuals
n.mrk: total number of markers
ind.names: the names of the individuals
mrk.names: the names of the markers
dosage.p1: a vector containing the dosage in parent 1 for all n.mrk markers
dosage.p2: a vector containing the dosage in parent 2 for all n.mrk markers
chrom: a vector indicating which sequence each marker belongs to. Zero indicates that the marker was not assigned to any sequence
genome.pos: physical position of the markers within the sequence
seq.ref: NULL (unused in this type of data)
seq.alt: NULL (unused in this type of data)
all.mrk.depth: NULL (unused in this type of data)
geno.dose: a matrix containing the dosage for each marker (rows) and each individual (columns). Missing data are represented by ploidy_level + 1
n.phen: number of phenotypic traits
phen: a matrix containing the phenotypic data. The rows correspond to the traits and the columns correspond to the individuals
kept: if elim.redundant = TRUE, holds all non-redundant markers
elim.correspondence: if elim.redundant = TRUE, holds all non-redundant markers and their equivalence to the redundant ones
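
Because missing data in geno.dose are coded as ploidy_level + 1 rather than NA, it can be convenient to recode them before computing summaries. The following is a minimal sketch using base R only; the toy matrix and the names dose.na and missing.rate are made up for illustration and stand in for the geno.dose component of a real mappoly.data object.

```r
# Toy stand-in for the geno.dose component
# (markers in rows, individuals in columns); values are invented.
ploidy <- 4
geno.dose <- matrix(c(0, 1, 5,
                      2, 5, 3),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("M1", "M2"),
                                    c("Ind1", "Ind2", "Ind3")))

# Recode the missing-data value (ploidy + 1) as NA
dose.na <- geno.dose
dose.na[dose.na == ploidy + 1] <- NA

# Proportion of missing genotypes per marker
missing.rate <- rowMeans(is.na(dose.na))
```

With a real object, the same recoding would presumably start from the geno.dose and ploidy components listed above.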

Arguments

file.in: a character string with the name of (or full path to) the input file containing the data to be read
ploidy: the ploidy level
filter.non.conforming: if TRUE (default), converts data points with unexpected genotypes (i.e. genotypes that are not expected under the assumption of no double reduction) to 'NA'. See function segreg_poly for information on expected classes and their respective frequencies.
elim.redundant: logical. If TRUE (default), removes redundant markers during map construction, keeping them annotated so that they can be exported to the final map.
verbose: if TRUE (default), the current progress is shown; if FALSE, no output is produced
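
To illustrate what an "unexpected genotype" means for filter.non.conforming, the sketch below computes the range of offspring dosages that can arise without double reduction, given the two parental dosages. It is not the package's implementation (see segreg_poly for the expected classes and frequencies); the helper names expected_dosage_range and is_conforming are hypothetical, and the sketch assumes each parent transmits ploidy/2 distinct homologs per gamete.

```r
# Hypothetical helper: range of offspring dosages possible when each
# parent contributes ploidy/2 distinct homologs (no double reduction).
expected_dosage_range <- function(dose.p1, dose.p2, ploidy) {
  half <- ploidy / 2
  # A gamete from a parent with dosage d carries between
  # max(0, d - half) and min(d, half) copies of the allele.
  gamete_range <- function(d) c(max(0, d - half), min(d, half))
  g1 <- gamete_range(dose.p1)
  g2 <- gamete_range(dose.p2)
  c(min = g1[1] + g2[1], max = g1[2] + g2[2])
}

# Hypothetical helper: TRUE where an observed offspring dosage falls
# inside the expected range, FALSE where it would be non-conforming.
is_conforming <- function(dose, dose.p1, dose.p2, ploidy) {
  r <- expected_dosage_range(dose.p1, dose.p2, ploidy)
  dose >= r[["min"]] & dose <= r[["max"]]
}

# Tetraploid marker with parental dosages 1 and 0: only offspring
# dosages 0 and 1 are expected, so a dosage of 2 is flagged.
conforming <- is_conforming(c(0, 1, 2), dose.p1 = 1, dose.p2 = 0, ploidy = 4)
```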

Author

Marcelo Mollinari, mmollin@ncsu.edu, with minor changes by Gabriel Gesteira, gdesiqu@ncsu.edu

Details

This is a somewhat more straightforward alternative to the function read_geno. The input is a standard CSV file in which the first row is a header and each remaining row represents one marker. The first five columns contain the marker names, the dosage in parent 1, the dosage in parent 2, the sequence information (e.g. chromosome, scaffold or contig) and the position of the marker within the sequence. The remaining columns contain the dosages of the individuals in the full-sib population, one column per individual. A tetraploid example of such a file can be found in the Examples section.
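
As a rough illustration of the layout described above, the following base R sketch writes a tiny tetraploid file with three markers and four individuals and reads it back to check its shape. All values are invented, and the header names (snp_name, P1, P2, sequence, sequence_position, Ind1 to Ind4) are assumptions chosen for readability, not names required by the text above.

```r
# Invented toy data: one row per marker, five descriptive columns,
# then one dosage column per full-sib individual.
toy <- data.frame(
  snp_name          = c("M1", "M2", "M3"),  # marker names
  P1                = c(1, 2, 0),           # dosage in parent 1
  P2                = c(0, 2, 1),           # dosage in parent 2
  sequence          = c(1, 1, 2),           # chromosome / scaffold / contig
  sequence_position = c(1500, 98000, 4200), # position within the sequence
  Ind1              = c(0, 2, 1),
  Ind2              = c(1, 3, 0),
  Ind3              = c(1, 1, 0),
  Ind4              = c(0, 4, 1)
)

# Write the file with a header row and without R row names
csv.file <- tempfile(fileext = ".csv")
write.csv(toy, csv.file, row.names = FALSE)

# Read it back and separate the offspring dosages from the first five columns
check <- read.csv(csv.file)
offspring <- as.matrix(check[, -(1:5)])
rownames(offspring) <- check[[1]]
```

A file laid out this way is the kind of input that would be passed as file.in, together with ploidy = 4; a data set this small is only meant to show the column order.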

References

Mollinari, M., Olukolu, B. A., Pereira, G. da S., Khan, A., Gemenet, D., Yencho, G. C., and Zeng, Z-B. (2020) Unraveling the Hexaploid Sweetpotato Inheritance Using Ultra-Dense Multilocus Mapping, _G3: Genes, Genomes, Genetics_. doi:10.1534/g3.119.400620

Mollinari, M., and Garcia, A. A. F. (2019) Linkage analysis and haplotype phasing in experimental autopolyploid populations with high ploidy level using hidden Markov models, _G3: Genes, Genomes, Genetics_. doi:10.1534/g3.119.400378

Examples

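#### Tetraploid Example
# Download the example data to a temporary file (requires an internet connection)
ft <- "https://raw.githubusercontent.com/mmollina/MAPpoly_vignettes/master/data/tetra_solcap.csv"
tempfl <- tempfile()
download.file(ft, destfile = tempfl)
# Read the CSV file as a tetraploid data set
SolCAP.dose <- read_geno_csv(file.in = tempfl, ploidy = 4)
# Print a detailed summary and plot the resulting mappoly.data object
print(SolCAP.dose, detailed = TRUE)
plot(SolCAP.dose)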
