Read nucleic acid sequences from a file in FASTA or GBK format.
read.all(file = system.file(""), seqtype = "DNA")
The returned list has a component Sequence containing the DNA sequence taken from the "ORIGIN" field in GenBank. The sequence is a vector of single characters.
The returned list has a component Locus/Accession containing the locus name or accession number taken from the "LOCUS" or "ACCESSION" field in 'GenBank'. The sequence length is also returned.
The name of the file from which the sequences in FASTA or GBK format are to be read.
The type of sequence. Currently only DNA is supported; future updates are intended to handle other types of sequences.
Nora M. Villanueva and Marta Sestelo.
FASTA is a widely used format in molecular biology. A sequence in FASTA format starts with a single-line description, distinguished by a leading greater-than (">") symbol, followed by the sequence data on the following lines.
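The FASTA layout described above can be illustrated with a minimal base R sketch. This is not the code used by read.all; the file contents, the record name example_seq, and the variable names are made up for illustration, and only a single record is assumed.

```r
# Write a tiny single-record FASTA file to a temporary location
# (hypothetical description line and sequence).
fasta_file <- tempfile(fileext = ".fasta")
writeLines(c(">example_seq hypothetical description",
             "ACGTAC",
             "GTTA"), fasta_file)

lines <- readLines(fasta_file)

# The description line is the one that starts with ">".
is_header <- startsWith(lines, ">")
header <- sub("^>", "", lines[is_header][1])

# Join the remaining lines and split them into single characters.
sequence <- strsplit(paste(lines[!is_header], collapse = ""), "")[[1]]

length(sequence)
```

The result is a character vector with one element per base, which matches the "vector of single characters" shape described for the Sequence component.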
'GenBank' format files have the extension GBK, by convention. The files contain clearly labeled fields holding different types of information. The header of the file has information describing the sequence, such as its type, shape, length and source. The features of the genome sequence follow the header, and include protein translations. The DNA sequence is the last element of the file, which ends with (and must include) a double slash ("//"). Complete genomes in this format are available at https://ftp.ncbi.nlm.nih.gov/genbank/.
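As a rough illustration of this layout, the following base R sketch pulls the locus name and the sequence out of a heavily simplified, made-up GenBank-style record. It is an assumption-laden example rather than the method used by read.all: the record EXAMPLE01 is hypothetical, real files carry many more fields, and only one record per file is assumed.

```r
# Write a simplified, hypothetical GenBank-style record to a temporary file.
gbk_file <- tempfile(fileext = ".gbk")
writeLines(c("LOCUS       EXAMPLE01       20 bp    DNA     linear   UNK",
             "ACCESSION   EXAMPLE01",
             "ORIGIN",
             "        1 acgtacgtta gcatgcatgc",
             "//"), gbk_file)

lines <- readLines(gbk_file)

# The locus name is the second whitespace-separated token on the LOCUS line.
locus <- strsplit(lines[startsWith(lines, "LOCUS")][1], " +")[[1]][2]

# The sequence sits between the ORIGIN line and the closing double slash.
start <- which(startsWith(lines, "ORIGIN"))[1]
end <- which(startsWith(lines, "//"))[1]
seq_lines <- lines[(start + 1):(end - 1)]

# Drop position numbers and spaces, then split into single characters.
sequence <- strsplit(gsub("[^A-Za-z]", "", paste(seq_lines, collapse = "")), "")[[1]]

length(sequence)
```

Stripping everything that is not a letter removes both the leading position counter and the spaces that separate the blocks of ten bases.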
library(seq2R)
data(mtDNAhum)
if (FALSE) {
# "file.fasta" and "file.gbk" are placeholders; use the path to your own file.
data <- read.all("file.fasta")
data <- read.all("file.gbk")
}