seqinr (version 3.6-1)

get.ncbi: Complete bacterial genome data from the NCBI FTP site

Description

Tries to connect to the NCBI FTP site to get a list of complete bacterial genomes.

Usage

get.ncbi(repository = "ftp://ftp.ncbi.nih.gov/genomes/Bacteria/")

Arguments

repository

Where to look for data. The default value is the location of the complete bacterial genome sequences in the NCBI FTP repository.

Value

Returns a data frame which contains the following columns:

species

The species name as given by the corresponding folder name in the repository (e.g. Yersinia\_pestis\_KIM).

accession

The accession number as given by the common prefix of file names in the repository (e.g. NC\_004088).

size.bp

The size of the sequence in bp (e.g. 4600755).

type

A factor with two levels (plasmid or chromosome), tentatively deduced from the description of the sequence.
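
As a rough illustration of the structure described above, the following sketch builds a small stand-in data frame by hand (no FTP connection is made) and summarises it by type. The first row reuses the example values quoted in the column descriptions, assumed here to belong to the same record; the second row is an invented placeholder. The name `mock_bacteria` and the exact column types are assumptions, not something provided or guaranteed by seqinr.

```r
# Hand-built stand-in for the data frame returned by get.ncbi().
# Row 1 reuses the example values from the column descriptions;
# row 2 is an invented placeholder, not a real record.
mock_bacteria <- data.frame(
  species   = c("Yersinia_pestis_KIM", "Placeholder_species"),
  accession = c("NC_004088", "XX_000000"),
  size.bp   = c(4600755, 10000),
  type      = factor(c("chromosome", "plasmid"),
                     levels = c("plasmid", "chromosome")),
  stringsAsFactors = FALSE
)

# Keep only the rows flagged as chromosomes.
chromosomes <- mock_bacteria[mock_bacteria$type == "chromosome", ]

# Total sequence length per type, in bp.
size_by_type <- tapply(mock_bacteria$size.bp, mock_bacteria$type, sum)
print(size_by_type)
```

The same subsetting and `tapply()` calls should apply to the real result of `get.ncbi()`, provided it has the columns listed above.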

WARNING

This function is highly dependent on NCBI FTP site conventions, over which we have no control. The FTP connection apparently does not work when there is a proxy; this problem is circumvented here in a rather crude way.
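
Because the call depends on a remote server, it may be worth guarding it in scripts. The sketch below shows one way to do that with base R's `tryCatch()`. `safe_fetch` and `failing_fetch` are hypothetical helpers written for this illustration, not part of seqinr, and whether `get.ncbi()` actually signals an error when the connection fails is an assumption that is not stated here.

```r
# Run a fetch function and return NULL instead of stopping on failure.
# `safe_fetch` is a hypothetical helper, not part of seqinr.
safe_fetch <- function(fetch) {
  tryCatch(
    fetch(),
    error = function(e) {
      message("Fetch failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for a call that cannot reach the server.
failing_fetch <- function() stop("cannot open the connection")

result <- safe_fetch(failing_fetch)
is.null(result)

# With seqinr loaded and network access, the same wrapper could be used as:
# bacteria <- safe_fetch(function() seqinr::get.ncbi())
```

Returning `NULL` keeps the rest of a script running, so callers should check for it before using the result.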

References

citation("seqinr")

Examples

# NOT RUN {
bacteria <- get.ncbi()
summary(bacteria)
# }