Description:
buildPhylipLineage reconstructs an Ig lineage via maximum parsimony using the
dnapars application of the PHYLIP package.

Usage:
buildPhylipLineage(clone, dnapars_exec, rm_temp = FALSE, verbose = FALSE)

Arguments:
clone: ChangeoClone object containing clone data.
dnapars_exec: path to the PHYLIP dnapars executable.
rm_temp: if TRUE, delete the temporary directory after running dnapars;
if FALSE, keep the temporary directory.
verbose: if FALSE, suppress the output of dnapars;
if TRUE, STDOUT and STDERR of dnapars will be passed to
the console.

Value:
An igraph graph object defining the Ig lineage tree. Each unique input
sequence in clone is a vertex of the tree, with additional vertices being
either the germline (root) sequences or inferred intermediates. The graph
object has the following attributes.
Vertex attributes:
name: value in the SEQUENCE_ID column of the data slot of the input clone for observed sequences.
The germline (root) vertex is assigned the name
"Germline" and inferred intermediates are assigned
names with the format {"Inferred1", "Inferred2", ...}.
sequence: value in the SEQUENCE column of the data slot of the input clone for observed sequences.
The germline (root) vertex is assigned the sequence
in the germline slot of the input clone.
The sequences of inferred intermediates are extracted
from the dnapars output.
label: same as the name attribute.

Each additional column in the data slot of the input
clone is added as a vertex attribute with the attribute name set to
the source column name. For the germline and inferred intermediate vertices,
these additional vertex attributes are all assigned a value of NA.
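
The vertex attribute rules above can be illustrated with a small base R sketch. This is not the package's implementation: `build_vertex_table`, the toy `clone_data` data frame, and the `inferred` vector are hypothetical stand-ins for the `data` slot, the `germline` slot, and the intermediates that dnapars would report.

```r
# Toy stand-ins for the `data` and `germline` slots of a ChangeoClone
# (hypothetical values, for illustration only)
clone_data <- data.frame(
  SEQUENCE_ID = c("seq1", "seq2"),
  SEQUENCE    = c("ACGTAC", "ACGTTC"),
  SAMPLE      = c("S1", "S2"),
  DUPCOUNT    = c(3, 1),
  stringsAsFactors = FALSE
)
germline <- "ACGGAC"
inferred <- c("ACGGTC")  # hypothetical intermediate sequence

# Hypothetical helper: assemble a vertex table following the rules above
build_vertex_table <- function(clone_data, germline, inferred) {
  # Every column other than the ID and sequence becomes an annotation
  extra_cols <- setdiff(names(clone_data), c("SEQUENCE_ID", "SEQUENCE"))
  observed <- data.frame(
    name = clone_data$SEQUENCE_ID,
    sequence = clone_data$SEQUENCE,
    clone_data[, extra_cols, drop = FALSE],
    stringsAsFactors = FALSE
  )
  # Germline and inferred vertices get fixed names and NA annotations
  added <- data.frame(
    name = c("Germline", paste0("Inferred", seq_along(inferred))),
    sequence = c(germline, inferred),
    stringsAsFactors = FALSE
  )
  for (col in extra_cols) {
    added[[col]] <- NA
  }
  vertices <- rbind(observed, added)
  # The label attribute mirrors the name attribute
  vertices$label <- vertices$name
  vertices
}

vertices <- build_vertex_table(clone_data, germline, inferred)
print(vertices)
```

In the real return value these columns are igraph vertex attributes rather than data frame columns.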
Edge attributes:
weight: Hamming distance between the sequence attributes
of the two vertices.
label: same as the weight attribute.
Graph attributes:
clone: clone identifier from the clone slot of the
input ChangeoClone.
v_gene: V-segment gene call from the v_gene slot of
the input ChangeoClone.
j_gene: J-segment gene call from the j_gene slot of
the input ChangeoClone.
junc_len: junction length (nucleotide count) from the junc_len slot of the input ChangeoClone.

Details:
buildPhylipLineage builds the lineage tree of a set of unique Ig sequences via
maximum parsimony through an external call to the dnapars application of the PHYLIP
package. dnapars is called with default algorithm options, except for the search option,
which is set to "Rearrange on one best tree". The germline sequence of the clone is used
for the outgroup.

Following tree construction, the dnapars output is modified to allow input sequences to appear as internal nodes of the tree. An intermediate sequence inferred by dnapars is replaced by a child node that has a Hamming distance of zero from it. The distance calculation allows IUPAC ambiguous character matches, where an ambiguous character has distance zero to any character in the set of characters it represents. The distance calculation and the movement of child nodes up the tree are repeated until every parent-child pair has a distance greater than zero. The germline sequence (outgroup) is moved to the root of the tree and excluded from the node replacement process, which permits the trunk of the tree to be the only edge with a distance of zero. Edge weights of the resulting tree are assigned as the distance between the sequences of the two vertices that each edge connects.
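
The post-processing described above can be sketched in base R. This is a simplified illustration, not the code used by buildPhylipLineage: `ambiguous_hamming` and `collapse_zero_edges` are hypothetical helpers, the tree is represented as a named vector mapping each node to its parent, and two characters are assumed to match whenever their IUPAC base sets overlap (getSeqDistance may treat gaps and some ambiguity pairs differently). Only inferred parents are replaced in this sketch.

```r
# IUPAC nucleotide codes and the bases each one represents
iupac <- list(
  A = "A", C = "C", G = "G", T = "T",
  R = c("A", "G"), Y = c("C", "T"), S = c("C", "G"), W = c("A", "T"),
  K = c("G", "T"), M = c("A", "C"),
  B = c("C", "G", "T"), D = c("A", "G", "T"),
  H = c("A", "C", "T"), V = c("A", "C", "G"),
  N = c("A", "C", "G", "T")
)

# Hypothetical helper: Hamming distance where overlapping IUPAC sets match
ambiguous_hamming <- function(seq1, seq2) {
  stopifnot(nchar(seq1) == nchar(seq2))
  a <- strsplit(toupper(seq1), "")[[1]]
  b <- strsplit(toupper(seq2), "")[[1]]
  mismatch <- vapply(seq_along(a), function(i) {
    a[i] != b[i] && length(intersect(iupac[[a[i]]], iupac[[b[i]]])) == 0
  }, logical(1))
  sum(mismatch)
}

# Hypothetical helper: replace inferred parents by zero-distance children.
# `parent` maps node name -> parent name; `seqs` maps node name -> sequence.
collapse_zero_edges <- function(parent, seqs, root = "Germline") {
  repeat {
    changed <- FALSE
    for (child in names(parent)) {
      p <- parent[[child]]
      # The germline root and observed parents are never replaced here
      if (p == root || !startsWith(p, "Inferred")) next
      if (ambiguous_hamming(seqs[[child]], seqs[[p]]) == 0) {
        grandparent <- parent[[p]]
        parent[parent == p] <- child       # siblings now attach to the child
        parent[[child]] <- grandparent     # child moves up one level
        parent <- parent[names(parent) != p]
        changed <- TRUE
        break
      }
    }
    if (!changed) break
  }
  parent
}

# Toy tree: Germline -> Inferred1 -> {seq1, seq2}
seqs <- c(Germline = "ACGGAC", Inferred1 = "ACGTAC",
          seq1 = "ACGTAC", seq2 = "ACGTTC")
parent <- c(Inferred1 = "Germline", seq1 = "Inferred1", seq2 = "Inferred1")

collapsed <- collapse_zero_edges(parent, seqs)

# Edge weights: distance between each node and its parent
weights <- vapply(names(collapsed), function(n) {
  ambiguous_hamming(seqs[[n]], seqs[[collapsed[[n]]]])
}, integer(1))
print(collapsed)
print(weights)
```

In the toy tree, seq1 is identical to Inferred1, so seq1 takes the place of the inferred node and seq2 becomes its child.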
See also:
Takes a ChangeoClone object as input.
Temporary directories are created with makeTempDir.
Distance is calculated using getSeqDistance.
See igraph and igraph.plotting for working
with igraph graph objects.

Examples:
library(alakazam)
# Load example data
file <- system.file("extdata", "ExampleDb.gz", package="alakazam")
df <- readChangeoDb(file)
# Preprocess clone
clone <- subset(df, CLONE == 164)
clone <- makeChangeoClone(clone, text_fields=c("SAMPLE", "ISOTYPE"), num_fields="DUPCOUNT")
# Run PHYLIP and process output
dnapars_exec <- "~/apps/phylip-3.69/dnapars"
graph <- buildPhylipLineage(clone, dnapars_exec, rm_temp=TRUE)
# Plot graph with a tree layout
library(igraph)
ly <- layout_as_tree(graph, root="Germline", circular=FALSE, flip.y=TRUE)
plot(graph, layout=ly)
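
Because the example depends on an external executable, it can help to confirm that the path in dnapars_exec is usable before calling buildPhylipLineage. The following is a minimal base R sketch; `check_dnapars` is a hypothetical helper, not part of alakazam, and the path is the one used in the example above.

```r
# Hypothetical helper (not part of alakazam): check the dnapars path
check_dnapars <- function(dnapars_exec) {
  exec_path <- path.expand(dnapars_exec)
  # file.access(mode = 1) returns 0 when execute permission is granted;
  # this check may be unreliable on Windows
  file.exists(exec_path) && !dir.exists(exec_path) &&
    file.access(exec_path, mode = 1) == 0
}

dnapars_exec <- "~/apps/phylip-3.69/dnapars"
if (check_dnapars(dnapars_exec)) {
  message("dnapars found at ", path.expand(dnapars_exec))
} else {
  message("dnapars not found or not executable at ", path.expand(dnapars_exec))
}
```

If the check fails, adjust dnapars_exec to point at the local PHYLIP installation before running the example.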