evobiR (version 1.0)

AnalyzeAssembly: Analyze Genome Assembly

Description

This function reports basic statistics for a genome assembly.

Usage

AnalyzeAssembly(genome, max_N = 25, plot = F)

Arguments

genome
A list of sequences in which each element is a single character string of class "SeqFastadna".
max_N
Maximum number of consecutive N symbols. Scaffolds will be broken into contigs when this number is exceeded.
plot
When TRUE, an accumulation plot is produced in addition to the statistics.
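
The effect of max_N can be illustrated with a small base R sketch. It assumes a lower-case scaffold string and treats any run of more than max_N consecutive n characters as a gap between contigs. The helper split_scaffold is hypothetical and is not part of evobiR; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of evobiR): split one scaffold into contigs
# wherever more than max_N consecutive "n" characters occur.
split_scaffold <- function(scaffold, max_N = 25) {
  # A gap is a run of at least max_N + 1 "n" characters
  gap_pattern <- sprintf("n{%d,}", max_N + 1)
  contigs <- strsplit(scaffold, gap_pattern)[[1]]
  # Drop empty pieces that arise when the scaffold starts with a gap
  contigs[nzchar(contigs)]
}

# Made-up scaffold: a 30-n gap (exceeds max_N) and a 5-n gap (does not)
scaffold <- paste0("acgtacgt", strrep("n", 30), "ggccggcc", strrep("n", 5), "ttaa")
contigs <- split_scaffold(scaffold, max_N = 25)
contig_lengths <- nchar(contigs)
print(contig_lengths)
```

In this sketch only the 30-n run splits the scaffold, so the short 5-n run stays inside the second contig.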

Value

A data frame with the following rows:

  • Number of Scaffolds
  • Assembly Size Based on Scaffolds
  • Number of Scaffolds over 1MB
  • N50 Scaffold Size
  • Number of Contigs
  • Assembly Size Based on Contigs
  • N50 Contig Size
  • Minimum Contig Size
  • Percent GC
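
For readers unfamiliar with the N50 statistic, the following base R sketch uses the common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The function n50 and the lengths are illustrative assumptions; the calculation inside AnalyzeAssembly may differ in detail.

```r
# Hypothetical helper (not part of evobiR): N50 under the common definition.
n50 <- function(lengths) {
  sorted <- sort(lengths, decreasing = TRUE)
  # First position at which the running total reaches half the assembly size
  sorted[which(cumsum(sorted) >= sum(sorted) / 2)[1]]
}

# Made-up scaffold lengths, for illustration only
scaffold_lengths <- c(100, 80, 60, 40, 20)
assembly_size <- sum(scaffold_lengths)
scaffold_n50 <- n50(scaffold_lengths)
print(c(assembly_size = assembly_size, N50 = scaffold_n50))
```

Here the two longest scaffolds (100 and 80) together cover 180 of 300 bases, so the N50 is 80.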

Details

If a standard FASTA file is read in with the function read.fasta from the package seqinr, the argument as.string should be set to TRUE. The genome should also be all lower case, which is the default setting for read.fasta.
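
A minimal sketch of preparing input this way is shown below. The FASTA content and the names fasta_path and toy_genome are made up for illustration, and the seqinr call is guarded so the snippet still runs when the package is not installed.

```r
# Write a tiny two-scaffold FASTA file to a temporary location (made-up sequences)
fasta_path <- tempfile(fileext = ".fasta")
writeLines(c(">scaffold_1", "ACGTNNNNACGT", ">scaffold_2", "GGCCAATT"), fasta_path)

if (requireNamespace("seqinr", quietly = TRUE)) {
  # as.string = TRUE returns each sequence as a single string;
  # forceDNAtolower = TRUE (the default) converts the sequence to lower case
  toy_genome <- seqinr::read.fasta(file = fasta_path, seqtype = "DNA",
                                   as.string = TRUE, forceDNAtolower = TRUE)
  print(class(toy_genome[[1]]))
  print(as.character(toy_genome[[1]]))
}
```

A list read this way can then be passed as the genome argument, for example AnalyzeAssembly(genome = toy_genome, max_N = 25, plot = FALSE).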

References

http://www.uta.edu/karyodb/evobiR/

Examples

## just a small simulated genome
data(genome)
## calculate summary statistics for the genome
AnalyzeAssembly(genome = genome, max_N = 25, plot = TRUE)