This function retrieves the names of all genomes available on the NCBI FTP server and stores the results in a file named 'overview.txt' inside the directory '_ncbi_downloads', which is created inside the workspace.
listGenomes(db = "refseq", type = "all", subset = NULL, details = FALSE)
db: a character string specifying the database for which genome availability shall be checked, e.g. db = "refseq", db = "genbank", db = "ensembl", db = "ensemblgenomes".
type: a character string specifying a potential filter of available genomes. Options are type = "all", type = "kingdom", type = "group", type = "subgroup".
subset: a character string or character vector specifying a subset of type. E.g. if users are interested in retrieving all Eukaryota species, they can specify: type = "kingdom" and subset = "Eukaryota".
details: a boolean value specifying whether only the scientific names of stored genomes shall be returned (details = FALSE) or all information, such as organism_name, kingdoms, group, subgroup, file_size_MB, etc. (details = TRUE).
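The interplay of type, subset and details can be illustrated offline with a small sketch. This is not the package's implementation: `filter_overview()` is a hypothetical helper, the three-row `overview` table is toy data standing in for the downloaded file, and the mapping of type = "kingdom" to a column named kingdoms is an assumption based on the column names listed above.

```r
# Toy stand-in for the overview table; the rows are illustrative, not downloaded data.
overview <- data.frame(
  organism_name = c("Homo sapiens", "Arabidopsis thaliana", "Escherichia coli"),
  kingdoms      = c("Eukaryota", "Eukaryota", "Bacteria"),
  group         = c("Animals", "Plants", "Proteobacteria"),
  subgroup      = c("Mammals", "Land Plants", "Gammaproteobacteria"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of biomartr) that mimics the argument semantics.
filter_overview <- function(overview, type = "all", subset = NULL, details = FALSE) {
  type <- match.arg(type, c("all", "kingdom", "group", "subgroup"))
  if (type != "all") {
    if (is.null(subset)) stop("please specify 'subset' when type is not \"all\"")
    # Assumed mapping from the type argument to a column of the table.
    column <- switch(type, kingdom = "kingdoms", group = "group", subgroup = "subgroup")
    overview <- overview[overview[[column]] %in% subset, , drop = FALSE]
  }
  # details = FALSE: scientific names only; details = TRUE: the full table.
  if (details) overview else unique(overview$organism_name)
}

euk_names <- filter_overview(overview, type = "kingdom", subset = "Eukaryota")
euk_table <- filter_overview(overview, type = "kingdom", subset = "Eukaryota", details = TRUE)
print(euk_names)
```

With the package itself, the corresponding call would be listGenomes(db = "refseq", type = "kingdom", subset = "Eukaryota", details = TRUE), which requires an internet connection on first use.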
Internally, this function loads the overview.txt file from NCBI
and creates a directory '_ncbi_downloads' in the tempdir()
folder to store the overview.txt file for future processing. If the
overview.txt file already exists within the '_ncbi_downloads' folder and is
accessible within the workspace, it is not downloaded again.
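The download-once behaviour described above can be sketched in base R as follows. This is only an illustration under stated assumptions: `cached_overview_path()` and `fake_fetch()` are hypothetical names, not biomartr functions, and the fake fetch writes a placeholder header instead of contacting NCBI so that the sketch runs offline.

```r
# Hypothetical helper (not part of biomartr): fetch a file once, then reuse the cached copy.
cached_overview_path <- function(fetch,
                                 cache_dir = file.path(tempdir(), "_ncbi_downloads")) {
  if (!dir.exists(cache_dir)) dir.create(cache_dir, recursive = TRUE)
  target <- file.path(cache_dir, "overview.txt")
  if (!file.exists(target)) {
    fetch(target)  # only reached when no cached copy exists
  }
  target
}

# Stand-in for the real download; it counts how often it is called.
n_fetches <- 0
fake_fetch <- function(dest) {
  n_fetches <<- n_fetches + 1
  writeLines("organism_name\tkingdoms", dest)
}

# Use a separate demo directory so a real '_ncbi_downloads' folder is left alone.
demo_dir <- file.path(tempdir(), "listGenomes_cache_demo")
unlink(demo_dir, recursive = TRUE)
first  <- cached_overview_path(fake_fetch, cache_dir = demo_dir)
second <- cached_overview_path(fake_fetch, cache_dir = demo_dir)  # reuses the cached file
```

Because tempdir() is specific to one R session, a cache kept there is not expected to survive a restart of R.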
# NOT RUN {
# list the scientific names of all genomes available in refseq
listGenomes(db = "refseq")
# }