mafft: DNA Sequence Alignment with MAFFT

Description

This function is a wrapper for MAFFT and can be used for sequence and profile alignment.

Usage

mafft(x, y, add, method = "auto", maxiterate = 0, op = 1.53, 
    ep = 0.0, gt, options, path, quiet)

Arguments

x

An object of class DNAbin.

y

An object of class DNAbin; if given, both x and y are preserved and aligned to each other ("profile alignment").

add

A character string giving the method used for adding y to x: "add", "addprofile" (default), or any unambiguous abbreviation of these.

method

A character string giving the alignment method. Available accuracy-oriented methods for fewer than 200 sequences are "localpair", "globalpair", and "genafpair"; "retree 1" and "retree 2" are available for speed-oriented alignment. The default is "auto", which lets MAFFT choose an appropriate alignment method.

maxiterate

An integer giving the number of cycles of iterative refinement to perform. Possible choices are 0: progressive method, no iterative refinement (default); 2: two cycles of iterative refinement; 1000: at most 1000 cycles of iterative refinement.

op

A numeric giving the gap opening penalty at group-to-group alignment; default 1.53.

ep

A numeric giving the offset value, which works like a gap extension penalty, for group-to-group alignment; default 0.0, but 0.123 is recommended if no long indels are expected.

gt

An object of class phylo that is to be used as a guide tree during alignment.

options

A vector of mode character specifying additional arguments to MAFFT that are not included in mafft, e.g., --adjustdirection.

path

A character string indicating the path to the MAFFT executable.

quiet

Logical; if set to TRUE, MAFFT progress messages are suppressed instead of being printed on the screen.
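The arguments above can be combined as in the following sketch. It is illustrative only: it assumes the function is provided by the ips package, uses the woodmouse data set from ape as stand-in input, and the executable location stored in mafft_path is a hypothetical value that must be adjusted to the local installation. The calls are skipped when any of these prerequisites is missing.

```r
# Hypothetical location of the MAFFT executable; adjust for your system.
mafft_path <- "/usr/local/bin/mafft"

# Run only if both packages and the executable are available (assumptions).
if (requireNamespace("ips", quietly = TRUE) &&
    requireNamespace("ape", quietly = TRUE) &&
    file.exists(mafft_path)) {

  # Example input: woodmouse sequences shipped with ape (class DNAbin)
  data(woodmouse, package = "ape")

  # Speed-oriented progressive alignment with the default gap penalties
  aln_fast <- ips::mafft(woodmouse, method = "retree 2", path = mafft_path)

  # Accuracy-oriented alignment: iterative refinement plus the offset
  # value suggested when no long indels are expected
  aln_acc <- ips::mafft(woodmouse, method = "localpair",
                        maxiterate = 1000, ep = 0.123, path = mafft_path)

  # Profile alignment: x and y are kept intact and aligned to each other
  x <- woodmouse[1:10, ]
  y <- woodmouse[11:15, ]
  aln_prof <- ips::mafft(x = x, y = y, add = "addprofile", path = mafft_path)

  # Pass an extra MAFFT flag that has no dedicated argument
  aln_dir <- ips::mafft(woodmouse, options = "--adjustdirection",
                        path = mafft_path)
}
```

Each call is expected to return an alignment of class "DNAbin", as described under Value.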

Value

A matrix of class "DNAbin".

Details

"localpair" selects the L-INS-i algorithm, probably the most accurate; recommended for <200 sequences; iterative refinement method incorporating local pairwise alignment information.

"globalpair" selects the G-INS-i algorithm suitable for sequences of similar lengths; recommended for <200 sequences; iterative refinement method incorporating global pairwise alignment information.

"genafpair" selects the E-INS-i algorithm suitable for sequences containing large unalignable regions; recommended for <200 sequences.

"retree 1" selects the FFT-NS-1 algorithm, the simplest progressive option in MAFFT; recommended for >200 sequences.

"retree 2" selects the FFT-NS-2 algorithm that uses a second iteration of alignment based on a guide tree computed from an FFT-NS-1 alignment; this is the default in MAFFT; recommended for >200 sequences.
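The recommendations above can be condensed into a small decision rule. The helper below is a hypothetical sketch and not part of the package; the function name, its arguments, and the decision to treat exactly 200 sequences as a large data set are assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): suggest a value for the
# 'method' argument from the number of sequences and two data properties.
choose_mafft_method <- function(n_seq,
                                long_unalignable = FALSE,
                                similar_length = FALSE) {
  stopifnot(is.numeric(n_seq), length(n_seq) == 1, n_seq >= 2)

  if (n_seq < 200) {
    # Accuracy-oriented methods, recommended for <200 sequences
    if (long_unalignable) return("genafpair")   # E-INS-i
    if (similar_length)   return("globalpair")  # G-INS-i
    return("localpair")                         # L-INS-i
  }

  # Speed-oriented progressive method for larger data sets
  # (assumption: exactly 200 sequences is treated as large)
  "retree 2"                                    # FFT-NS-2
}

choose_mafft_method(50)
choose_mafft_method(50, long_unalignable = TRUE)
choose_mafft_method(1000)
```

The returned string can be passed as the method argument; leaving method at "auto" instead lets MAFFT make the choice itself.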

References

Katoh, K. and H. Toh. 2008. Recent developments in the MAFFT multiple sequence alignment program. Briefings in Bioinformatics 9: 286-298.

Katoh, K., K.-i. Kuma, H. Toh, and T. Miyata. 2005. MAFFT version 5: improvement in accuracy of multiple sequence alignment. Nucleic Acids Research 33: 511-518.

Katoh, K., K. Misawa, K.-i. Kuma, and T. Miyata. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30: 3059-3066.

http://mafft.cbrc.jp/alignment/software/index.html