dia.bactgensize: Distribution of bacterial genome size from GOLD

Description

This function tries to download the latest update of GOLD (Genomes OnLine Database) and to extract bacterial genome sizes where available. It produces a histogram together with the default density() output. Optionally, a maximum likelihood estimate of a superposition of two or three normal distributions is also plotted.
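The maximum likelihood fit of a superposition of two normal distributions can be illustrated with a minimal sketch. It runs on simulated sizes rather than GOLD data, and the use of optim() with bounds, as well as the names sizes, negloglik, start and est, are assumptions made for this illustration; the function itself may rely on a different optimizer.

```r
# Simulated genome sizes in Kb (synthetic values, NOT taken from GOLD)
set.seed(1)
sizes <- c(rnorm(300, mean = 2000, sd = 600),
           rnorm(700, mean = 4500, sd = 1000))

# Negative log-likelihood of a two-component normal mixture.
# par = c(p, m1, sd1, m2, sd2); the second weight is 1 - p.
negloglik <- function(par, x) {
  dens <- par[1] * dnorm(x, par[2], par[3]) +
    (1 - par[1]) * dnorm(x, par[4], par[5])
  # Floor the density so that log() never returns -Inf
  -sum(log(pmax(dens, .Machine$double.xmin)))
}

# Initial guesses, taken from the defaults shown under Usage
start <- c(p = 0.5, m1 = 2000, sd1 = 600, m2 = 4500, sd2 = 1000)

# Bounded minimisation (assumption: any general-purpose optimizer would do)
est <- optim(start, negloglik, x = sizes, method = "L-BFGS-B",
             lower = c(0.01, 0, 1, 0, 1),
             upper = c(0.99, Inf, Inf, Inf, Inf))
est$par
```

The initial guesses matter because the likelihood of a mixture can have several local optima; starting near the expected peaks makes it more likely that the optimizer settles on the intended solution.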

Usage

dia.bactgensize(fit = 2, p = 0.5, m1 = 2000, sd1 = 600, m2 = 4500,
       sd2 = 1000, p3 = 0.05, m3 = 9000, sd3 = 1000)

Arguments

fit

integer value. If fit == 0, no normal fit is produced; if fit == 2, try to fit a superposition of two normal distributions; if fit == 3, try to fit a superposition of three normal distributions.

p

initial guess for the proportion of the first population.

m1

initial guess for the mean of the first population.

sd1

initial guess for the standard deviation of the first population.

m2

initial guess for the mean of the second population.

sd2

initial guess for the standard deviation of the second population.

p3

initial guess for the proportion of the third population.

m3

initial guess for the mean of the third population.

sd3

initial guess for the standard deviation of the third population.
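To see how these initial guesses combine into a single starting curve, the following sketch evaluates a three-component mixture density at the default values from Usage. The helper name dmix3 is hypothetical, and giving the second population the remaining weight 1 - p - p3 is an assumption of this sketch, not something this page states.

```r
# Density of a three-component normal mixture, using the argument names
# from the Usage section as parameters.
# ASSUMPTION: the second component receives the remaining weight
# 1 - p - p3; the package may parameterise the mixture differently.
dmix3 <- function(x, p = 0.5, m1 = 2000, sd1 = 600,
                  m2 = 4500, sd2 = 1000,
                  p3 = 0.05, m3 = 9000, sd3 = 1000) {
  p * dnorm(x, m1, sd1) +
    (1 - p - p3) * dnorm(x, m2, sd2) +
    p3 * dnorm(x, m3, sd3)
}

# Sanity check: the weights sum to 1, so the density should integrate
# to approximately 1 over a range that covers all three components.
total <- integrate(dmix3, lower = -10000, upper = 20000)$value
total
```

With these defaults the three weights are 0.5, 0.45 and 0.05, so p + p3 must stay below 1 for the curve to remain a valid density.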

Value

An invisible data frame with three components:
comp1: genus name
comp2: species names
comp3: genome size in Kb

References

For an overview of seqinR's functionality, please consult this vignette: Charif, D., Lobry, J.R. (2005) SeqinR: a contributed package to the R project for statistical computing devoted to biological sequences retrieval and analysis. Springer Verlag, Biological and Medical Physics/Biomedical Series, in preparation.

Examples

dia.bactgensize()