biomartr (version 0.5.1)

getGenome: Genome Retrieval

Description

Main genome retrieval function for an organism of interest. Given the scientific name of an organism, the corresponding FASTA file storing its genome can be downloaded and stored locally. Genome files can be retrieved from several databases.

Usage

getGenome(db = "refseq", organism, path = file.path("_ncbi_downloads",
  "genomes"))

Arguments

db
a character string specifying the database from which the genome shall be retrieved:
  • db = "refseq"
  • db = "genbank"
  • db = "ensembl"
  • db = "ensemblgenomes"
organism
a character string specifying the scientific name of the organism of interest, e.g. organism = "Homo sapiens".
path
a character string specifying the location (a folder) in which the corresponding genome shall be stored. Default is path = file.path("_ncbi_downloads","genomes").

Value

File path to downloaded genome.

Details

Internally, this function loads the overview.txt file from NCBI:

refseq: ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/

genbank: ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/

and creates a directory '_ncbi_downloads/genomes' to store the genome of interest as a FASTA file for future processing. If the corresponding FASTA file already exists within the '_ncbi_downloads/genomes' folder and is accessible within the workspace, no download is performed.
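
The "skip the download if the file is already present" behaviour described above can be sketched in base R. This is only an illustration of the idea, not biomartr's actual implementation: the helper name fetch_if_missing, its downloader argument, and the URL in the usage note are hypothetical and were chosen for this example.

```r
# Hypothetical helper (not part of biomartr): download a file only if it is
# not already present in the target folder, and return the local file path.
fetch_if_missing <- function(url, file_name,
                             path = file.path("_ncbi_downloads", "genomes"),
                             downloader = utils::download.file) {
  # Create the target folder (including parent folders) if it is missing.
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }

  local_file <- file.path(path, file_name)

  # Only contact the remote server when no local copy exists yet.
  if (!file.exists(local_file)) {
    downloader(url, destfile = local_file, mode = "wb")
  }

  local_file
}
```

Calling fetch_if_missing("https://example.org/genome.fna.gz", "genome.fna.gz") twice (the URL is a placeholder) would trigger at most one download. The second call finds the local copy and simply returns its path, which mirrors how getGenome returns the file path of the stored genome.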

See Also

getProteome, getCDS, getGFF, getRNA, meta.retrieval, read_genome

Examples

## Not run: ------------------------------------
# 
# # download the genome of Arabidopsis thaliana from refseq
# # and store the corresponding genome file in '_ncbi_downloads/genomes'
# file_path <- getGenome(db       = "refseq",
#                        organism = "Arabidopsis thaliana",
#                        path     = file.path("_ncbi_downloads", "genomes"))
# 
# Ath_genome <- read_genome(file_path, format = "fasta")
# 
# 
# # download the genome of Arabidopsis thaliana from genbank
# # and store the corresponding genome file in '_ncbi_downloads/genomes'
# file_path <- getGenome(db       = "genbank",
#                        organism = "Arabidopsis thaliana",
#                        path     = file.path("_ncbi_downloads", "genomes"))
# 
# Ath_genome <- read_genome(file_path, format = "fasta")
## ---------------------------------------------