AlignSeqs(myXStringSet, guideTree = NULL, iterations = 1, refinements = 1, gapOpening = c(-16, -12), gapExtension = c(-2, -1), structures = NULL, FUN = AdjustAlignment, levels = c(0.95, 0.7, 10, 5), processors = 1, verbose = TRUE, ...)
myXStringSet: An AAStringSet, DNAStringSet, or RNAStringSet object of unaligned sequences.
guideTree: Either NULL or a data.frame giving the ordered tree structure in which to align profiles. If NULL, a guide tree is constructed automatically based on the order of shared k-mers.
structures: Either a list of secondary structure probabilities matching the structureMatrix, such as that output by PredictHEC, or NULL to generate the structures automatically. Only applicable if myXStringSet is an AAStringSet.
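levels: Numeric with four elements specifying the levels at which to trigger FUN. (See details section below.)
processors: The number of processors to use, or NULL to automatically detect and use all available processors.
...: Further arguments to be passed directly to AlignProfiles, including perfectMatch, misMatch, gapPower, terminalGap, restrict, anchor, normPower, substitutionMatrix, and structureMatrix.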
Returns an XStringSet of aligned sequences.
Alignment proceeds in three main steps: (1) If guideTree=NULL, an initial single-linkage guide tree is constructed based on a distance matrix of shared k-mers. If an initial guideTree is provided, it should be in the format output by IdClusters, with ascending levels of cutoff. (2) If iterations is greater than zero, a UPGMA guide tree is built based on the initial alignment and the sequences are re-aligned along this tree. This process is repeated iterations times or until convergence. (3) If refinements is greater than zero, groups of sequences are iteratively realigned to the full alignment. This process generates two alignments, the better of which is chosen based on its sum-of-pairs score. The refinement process is repeated refinements times, or until no improvement can be made.

The FUN function is applied during each of the three steps based on levels. The purpose of levels is to speed up the alignment process by not running FUN on the alignment when it is unnecessary. The default levels specify that FUN should be run on the sequences when the initial tree is above 0.95 average dissimilarity, when the iterative tree is above 0.7 average dissimilarity, and after every tenth improvement made during refinement. The final element of levels prevents FUN from being applied at any point to fewer than 5 sequences. The FUN function is always applied just before returning the alignment, independently of the first three values of levels.

The default FUN is AdjustAlignment, but FUN accepts any function that takes an XStringSet as its first argument, with weights, processors, and substitutionMatrix as optional arguments. For example, FUN can be made a no-op by setting FUN=function(x, ...) return(x), where x is an XStringSet.
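The FUN hook described above can be exercised with a small sketch. It assumes the DECIPHER package (and the database backend used by SearchDB) is installed and reuses the example database bundled with the package. The names countingFUN and nCalls are hypothetical, introduced only for illustration. The wrapper returns the alignment unchanged, like the FUN=function(x, ...) return(x) example, while counting how often AlignSeqs invokes it.

```r
library(DECIPHER)

# Example database shipped with DECIPHER (same file as the package example).
db <- system.file("extdata", "Bacteria_175seqs.sqlite", package = "DECIPHER")
dna <- SearchDB(db, remove = "all", verbose = FALSE)

# Assumption: a 20-sequence subset is used only to keep the sketch quick.
dna <- dna[1:20]

# Hypothetical counter, incremented each time AlignSeqs calls FUN.
nCalls <- 0L

# Hypothetical replacement for AdjustAlignment: takes an XStringSet as its
# first argument, absorbs any optional arguments via '...', and returns the
# alignment unmodified.
countingFUN <- function(x, ...) {
  nCalls <<- nCalls + 1L
  x
}

aligned <- AlignSeqs(dna, FUN = countingFUN, verbose = FALSE)

# Number of times FUN was triggered for this input and the default levels.
nCalls
```

Because FUN is applied just before the alignment is returned, nCalls should be at least 1 here; how many additional calls occur depends on levels and on the input sequences.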
See also: AdjustAlignment, AlignDB, AlignProfiles, AlignSynteny, AlignTranslation, IdClusters, StaggerAlignment
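library(DECIPHER)
# locate the example sequence database bundled with DECIPHER
db <- system.file("extdata", "Bacteria_175seqs.sqlite", package="DECIPHER")
dna <- SearchDB(db, remove="all") # retrieve the sequences with all gaps removed
alignedDNA <- AlignSeqs(dna)
BrowseSeqs(alignedDNA, highlight=1) # view the alignment in a web browser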
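As a further sketch, the call below sets several arguments from the usage signature explicitly instead of relying on the defaults. The specific values (two iterations, two refinements, a 20-sequence subset) are arbitrary choices for illustration, not recommendations, and the variable names subsetDNA and alignedSubset are hypothetical. It assumes DECIPHER is installed along with its bundled example database.

```r
library(DECIPHER)

db <- system.file("extdata", "Bacteria_175seqs.sqlite", package = "DECIPHER")
dna <- SearchDB(db, remove = "all", verbose = FALSE)

# Assumption: a small subset keeps the run time short for demonstration.
subsetDNA <- dna[1:20]

alignedSubset <- AlignSeqs(
  subsetDNA,
  iterations = 2,     # rebuild the guide tree and re-align up to twice
  refinements = 2,    # up to two rounds of refinement
  processors = NULL,  # detect and use all available processors
  verbose = FALSE     # suppress progress output
)

# All sequences in an alignment share a single width.
unique(width(alignedSubset))
```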