seqinr (version 1.0-1)

get.ncbi: Complete bacterial genome data from the NCBI FTP site

Description

Tries to connect to the NCBI FTP site to retrieve a list of complete bacterial genomes.

Usage

get.ncbi(repository = "ftp://ftp.ncbi.nih.gov/genomes/Bacteria/")

Arguments

repository
Where to look for data. The default value is the location of the complete bacterial genome sequences in the NCBI FTP repository.

Value

Returns a data frame which contains the following columns:

  • species: The species name as given by the corresponding folder name in the repository (e.g. Yersinia_pestis_KIM).
  • accession: The accession number as given by the common prefix of file names in the repository (e.g. NC_004088).
  • size.bp: The size of the sequence in bp (e.g. 4600755).
  • type: A factor with two levels (plasmid or chromosome) tentatively deduced from the description of the sequence.
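
As a rough illustration of the shape described above, the following sketch builds a small mock data frame with the same four columns and then subsets and summarises it. It does not call get.ncbi() and needs no network access. The first row reuses the example values quoted above (its type is assumed here); the second row, including its accession and size, is entirely hypothetical.

```r
# Mock data frame mimicking the documented columns of the return value.
# Row 1 reuses the example values from the column descriptions (type assumed);
# row 2 is a made-up plasmid entry used only for illustration.
bacteria <- data.frame(
  species   = c("Yersinia_pestis_KIM", "Yersinia_pestis_KIM"),
  accession = c("NC_004088", "XX_000000"),   # "XX_000000" is hypothetical
  size.bp   = c(4600755, 100000),            # 100000 is hypothetical
  type      = factor(c("chromosome", "plasmid"),
                     levels = c("plasmid", "chromosome")),
  stringsAsFactors = FALSE
)

# Keep only the sequences flagged as chromosomes.
chromosomes <- bacteria[bacteria$type == "chromosome", ]

# Total sequence length per sequence type.
total_by_type <- tapply(bacteria$size.bp, bacteria$type, sum)
```

Because type is only tentatively deduced from the sequence description, it may be worth inspecting a few rows by hand before relying on such a split.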

WARNING

This function is highly dependent on NCBI FTP site conventions, over which we have no control. The FTP connection apparently does not work when there is a proxy; this problem is circumvented here in a rather crude way.
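
Since the call may fail when the site layout changes or the connection is blocked, one defensive pattern is to wrap it so that a failure yields NULL rather than stopping a script. The sketch below is only one possible approach: safe_fetch is a hypothetical helper, not part of seqinr, and it assumes that a failed connection surfaces as an ordinary R error.

```r
# Hypothetical helper (not part of seqinr): call a fetching function and
# return NULL with a message if it raises an error.
safe_fetch <- function(fetch, ...) {
  tryCatch(
    fetch(...),
    error = function(e) {
      message("Could not retrieve genome list: ", conditionMessage(e))
      NULL
    }
  )
}

# Usage (requires seqinr and network access, so left commented out):
# bacteria <- safe_fetch(seqinr::get.ncbi)
# if (!is.null(bacteria)) summary(bacteria)
```

Returning NULL keeps the caller in control of what to do next, for example retrying later or falling back to a previously saved copy of the table.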

References

For an overview of seqinR's functionality, please consult this vignette: Charif, D., Lobry, J.R. (2005) SeqinR: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. Springer Verlag, Biological and Medical Physics/Biomedical Series, in preparation.

Examples

bacteria <- get.ncbi()
summary(bacteria)