dada2

Exact sample inference from Illumina amplicon data. Resolves real variants differing by as little as one nucleotide.

DADA2 Articles

Preprint describing DADA2's accuracy and precision: DADA2: High resolution sample inference from amplicon data

Peer-reviewed publication: in progress.

Installation

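# Load the Bioconductor installer script (biocLite), then install dada2.
# Bioconductor 3.8 and later replace biocLite with BiocManager::install("dada2").
source("https://bioconductor.org/biocLite.R")
biocLite("dada2")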

Tutorial

A walkthrough of the DADA2 pipeline for paired-end Illumina MiSeq data is available on the DADA2 front page.

Other Resources

Planned feature improvements are publicly catalogued on the "Issues" page of the main DADA2 development site on GitHub:

https://github.com/benjjneb/dada2/issues

If the feature you are hoping for is not listed, you are welcome to add it as a feature request "issue" on this page. This request will be publicly available and listed on the page.

Reports of bugs and of difficulties in using DADA2 are also welcome on the issue tracker.

Version

1.0.3

License

LGPL-3

Maintainer

Benjamin Callahan

Last Published

February 15th, 2017

Functions in dada2 (1.0.3)

Classifies sequences against reference training dataset.

An empirical error matrix.

An empirical error matrix.

dada_to_seq_table

Map denoised sequence to each read.

Merge forward and reverse reads after DADA denoising, even if reads were not originally ordered together.

Use a loess fit to estimate error rates from transition counts.

An empirical error matrix.

Inflates an error rate matrix by a specified factor, while accounting for saturation.

Get DADA options

Internal tables function

An empirical error matrix.

Plot observed error rates after denoising.

plotQualityProfile

Plot quality profile of a fastq file.

The named integer vector format used to represent collections of unique DNA sequences.

show,derep-method

Method extensions to show for dada2 objects.

collapseNoMismatch

Combine together sequences that are identical up to shifts and/or length.

An empirical error matrix.

Identify sequences that are identical to a more abundant sequence up to an overall shift.

Get vector of sequences from input object.

A class representing dereplicated sequences

Determine if input sequence is a bimera of putative parent sequences.

An empirical error matrix.

Read in and dereplicate a fastq file.

The object class returned by dada

Write a uniques vector to a FASTA file

Needleman-Wunsch alignment.

Hamming distance after Needleman-Wunsch alignment.

Identify bimeras from collections of unique sequences.

Filter and trim a fastq file.

High resolution sample inference from amplicon data.

Generate the kmer-distance and the alignment distance from the given set of sequences.

Determine if input sequence(s) match the phiX genome.

fastqPairedFilter

Filters and trims paired forward and reverse fastq files.

An empirical error matrix.

Set DADA options

removeBimeraDenovo

Remove bimeras from collections of unique sequences.

makeSequenceTable

Construct a sample-by-sequence observation matrix.

Merge denoised forward and reverse reads.

plotComplementarySubstitutions

Plot Substitution Pairs from DADA Result

Get the uniques-vector from the input object.
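
Several entries above refer to the uniques vector, described as a named integer vector representing a collection of unique DNA sequences. The sketch below shows one way to build such a vector by hand in base R, assuming the names hold the sequences and the values hold their abundances. The reads are made up for illustration and no dada2 function is called; in practice the package's own functions produce this format.

```r
# Hypothetical reads, invented for illustration only.
reads <- c("ACGT", "ACGT", "ACGA", "ACGT", "TTGA", "ACGA")

# Count each distinct sequence, then sort by decreasing abundance.
counts <- sort(table(reads), decreasing = TRUE)

# Build the named integer vector: names are sequences, values are abundances.
uniques <- setNames(as.integer(counts), names(counts))

print(uniques)
```

Keeping the sequences in the names means the vector can be indexed by sequence, for example uniques["ACGT"], and sum(uniques) gives the total number of reads.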