getGenome

a character string specifying the database from which the genome 
shall be retrieved:<ul>
<li><code>db = "refseq"</code></li>
<li><code>db = "genbank"</code></li>
<li><code>db = "ensembl"</code></li>
</ul>

db

there are three options to characterize an organism:<ul>
<li>by <code>scientific name</code>: e.g. <code>organism = "Homo sapiens"</code></li>
<li>by <code>database specific accession identifier</code>: e.g. <code>organism = "GCF_000001405.37"</code> (= NCBI RefSeq identifier for <code>Homo sapiens</code>)</li>
<li>by <code>taxonomic identifier from NCBI Taxonomy</code>: e.g. <code>organism = "9606"</code> (= taxid of <code>Homo sapiens</code>)</li>
</ul>

organism

a logical value indicating whether or not a genome shall be downloaded if it isn't marked in the database as either a reference genome or a representative genome.

reference

the database release version of ENSEMBL (<code>db = "ensembl"</code>). Default is <code>release = NULL</code>, meaning
that the most recent database version is used.

release

a logical value indicating whether or not files should be unzipped.

gunzip

a character string specifying the location (a folder) in which 
the corresponding genome shall be stored. Default is 
<code>path</code> = <code>file.path("_ncbi_downloads","genomes")</code>.

path
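
The three ways of identifying an organism described above can be sketched as calls. This is a minimal illustration, assuming the biomartr package is installed and a network connection is available. The identifiers are the ones quoted in the argument description; whether that exact accession version is still served depends on the current state of the database. The wrapper function names (<code>by_name</code>, <code>by_accession</code>, <code>by_taxid</code>) are hypothetical and exist only so that nothing is downloaded until a wrapper is called.

```r
library(biomartr)

# Three ways to name the same organism for db = "refseq".
# The human genome is a large download, so each call is wrapped in a
# (hypothetical) helper function and only runs when that helper is invoked.

# 1. by scientific name
by_name <- function() {
  getGenome(db = "refseq", organism = "Homo sapiens")
}

# 2. by database-specific accession identifier (NCBI RefSeq)
by_accession <- function() {
  getGenome(db = "refseq", organism = "GCF_000001405.37")
}

# 3. by NCBI Taxonomy identifier, passed as a character string
by_taxid <- function() {
  getGenome(db = "refseq", organism = "9606")
}
```

Calling, for example, <code>by_taxid()</code> would start the retrieval into the default <code>path</code>.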

Main genome retrieval function for an organism of interest.
By specifying the scientific name of an organism of interest, the 
corresponding FASTA file storing its genome
can be downloaded and stored locally. Genome files can be retrieved from 
several databases. In addition, the genome summary statistics for the 
retrieved species are stored locally to provide users with 
insights regarding the genome assembly quality (see <code><a rd-options="" href="/link/summary_genome?package=biomartr&version=0.9.0" data-mini-rdoc="biomartr::summary_genome">summary_genome</a></code> for details).
This is useful when comparing genomes with large differences in assembly quality.

Perform large-scale genomic data retrieval and functional annotation retrieval. This package aims to provide users with a standardized
way to automate genome, proteome, 'RNA', coding sequence ('CDS'), 'GFF', and metagenome
retrieval from 'NCBI RefSeq', 'NCBI Genbank', 'ENSEMBL', 'ENSEMBLGENOMES',
and 'UniProt' databases. Furthermore, an interface to the 'BioMart' database
(Smedley et al. (2009) <doi:10.1186/1471-2164-10-22>) allows users to retrieve
functional annotation for genomic loci. In addition, users can download entire databases such
as 'NCBI RefSeq' (Pruitt et al. (2007) <doi:10.1093/nar/gkl842>), 'NCBI nr',
'NCBI nt', 'NCBI Genbank' (Benson et al. (2013) <doi:10.1093/nar/gks1195>), etc. as
well as 'ENSEMBL' and 'ENSEMBLGENOMES' with only one command.

Hajk-Georg Drost

biomartr

Genomic Data Retrieval

getGenome function

getGenome: Genome Retrieval

Description

Usage

Arguments

Value

Details

See Also

Examples
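
A hedged end-to-end sketch of a retrieval, assuming biomartr is installed and the databases are reachable. The organism (<code>"Saccharomyces cerevisiae"</code>) is an illustrative choice that does not come from the documentation above; it is used only because its genome is small. The variable names are hypothetical, and the code assumes that <code>getGenome()</code> returns the path of the downloaded file.

```r
library(biomartr)

# Folder in which the genome is stored (this is the documented default).
genome_dir <- file.path("_ncbi_downloads", "genomes")

# Retrieve the genome from NCBI RefSeq and unzip the downloaded file.
genome_file <- getGenome(
  db       = "refseq",
  organism = "Saccharomyces cerevisiae",
  gunzip   = TRUE,
  path     = genome_dir
)

# Assumption: the return value is the path of the downloaded FASTA file.
print(genome_file)

# List what was written to the folder, including any summary files
# stored alongside the genome.
print(list.files(genome_dir))

# The same organism from ENSEMBL; release = NULL (the default) requests
# the most recent database version.
ensembl_file <- getGenome(
  db       = "ensembl",
  organism = "Saccharomyces cerevisiae",
  release  = NULL,
  path     = genome_dir
)
```

Passing <code>path</code> explicitly keeps the download location visible in the script, which helps when several genomes are retrieved for comparison.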