Distance based Cell LinEAge Reconstruction (DCLEAR)

Il-Youp Kwak (ikwak2@cau.ac.kr) and Wuming Gong (gongx030@umn.edu)

R/DCLEAR is an R package for Distance based Cell LinEAge Reconstruction (DCLEAR). The code was developed during our participation in the Cell Lineage Reconstruction DREAM challenge.

DCLEAR Overview

Figure 1. Overview of the DCLEAR modeling architecture. Our model is divided into two parts: 1) estimating the distance between cells and 2) constructing a tree from the distance matrix.

Estimating distance between cells

A naive approach is the Hamming distance, which simply counts the positions at which two sequences differ.
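As a minimal sketch of this naive approach in base R (the helper name `hamming_distance` is made up for illustration and is not part of the DCLEAR API), two equal-length barcode strings can be compared character by character:

```r
# Hypothetical helper, not a DCLEAR function: plain Hamming distance
# between two equal-length character strings.
hamming_distance <- function(a, b) {
  x <- strsplit(a, "")[[1]]          # split into single characters
  y <- strsplit(b, "")[[1]]
  stopifnot(length(x) == length(y))  # Hamming distance needs equal lengths
  sum(x != y)                        # count mismatching positions
}

hamming_distance("00AB0", "0-CB0")   # positions 2 and 3 differ, so this is 2
```

Every mismatch contributes exactly 1 here, regardless of which characters are involved.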

However, the Hamming distance assumes that every mismatch carries the same weight. For example, the two sequences '00AB0' and '0-CB0' differ at the second and third positions. At the second position we have '0' and '-', and at the third position we have 'A' and 'C'.

For '0' and '-', the '-' marks a missing value at that position, and the true state may well be '0'. This mismatch should therefore receive a lower weight. For 'A' and 'C', the initial state '0' changed to 'A' in one cell and to 'C' in the other as the cells proliferated. This mismatch should therefore receive a larger weight. We can assign a separate weight to each type of mismatch.

The unknown weights can then be estimated from training data.
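The weighting idea can be sketched in base R as follows. This is an illustration only: the function name `weighted_hamming`, the three mismatch classes, and the weight values 0.5, 1 and 2 are assumptions chosen for the example. They are not the weights or the implementation used in the package, where the weights are estimated from training data.

```r
# Hypothetical sketch, not the DCLEAR implementation.
# Mismatch classes and default weights are placeholders for illustration.
weighted_hamming <- function(a, b,
                             w_missing = 0.5,   # one side is '-' (missing)
                             w_initial = 1,     # one side is still '0'
                             w_changed = 2) {   # two different changed states
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  stopifnot(length(x) == length(y))

  differ  <- x != y
  missing <- x == "-" | y == "-"
  initial <- x == "0" | y == "0"

  w <- numeric(length(x))                        # identical positions weigh 0
  w[differ & missing] <- w_missing
  w[differ & !missing & initial] <- w_initial
  w[differ & !missing & !initial] <- w_changed
  sum(w)
}

weighted_hamming("00AB0", "0-CB0")   # 0.5 ('0' vs '-') + 2 ('A' vs 'C') = 2.5
```

With these placeholder weights, the '0' versus '-' mismatch counts less than the 'A' versus 'C' mismatch, which matches the reasoning above.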

K-mer replacement distance

Constructing tree from the distance matrix

With the distance metrics proposed above, we can construct a distance matrix among cells. We can then apply tree construction algorithms such as Neighbor-Joining (NJ) or FastME.
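A minimal sketch of this second step, assuming the `ape` package is installed: it builds a plain Hamming distance matrix for four made-up barcodes and passes it to `ape::nj()` and `ape::fastme.bal()`. The barcodes and the `hamming` helper are hypothetical, and any other distance (for example a weighted one) could be substituted.

```r
library(ape)  # provides nj(), fastme.bal() and write.tree()

# Made-up barcodes for illustration; names are the tip labels of the tree.
barcodes <- c(cell1 = "00AB0", cell2 = "0-CB0",
              cell3 = "0AAB0", cell4 = "0CCB0")

# Hypothetical helper: plain Hamming distance between two strings.
hamming <- function(a, b) sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])

# Fill the pairwise distance matrix.
n <- length(barcodes)
d <- matrix(0, n, n, dimnames = list(names(barcodes), names(barcodes)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- hamming(barcodes[[i]], barcodes[[j]])
  }
}

# Build trees from the distance matrix.
tree_nj     <- nj(as.dist(d))          # Neighbor-Joining
tree_fastme <- fastme.bal(as.dist(d))  # balanced minimum evolution (FastME)

print(write.tree(tree_nj))             # Newick string of the NJ tree
```

Both functions return a `phylo` object, so the result can be plotted with `plot()` or written out in Newick format.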

Usage

  • How to use weighted hamming : Link
  • How to use kmer_replacement : Colab Link
  • Preparation for subchallenge 2 submission : Colab Link
  • Preparation for subchallenge 3 submission : Link

Installation

With 'devtools':

devtools::install_github("ikwak2/DCLEAR")

License

The R/DCLEAR package is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 3, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. See the GNU General Public License for more details.

A copy of the GNU General Public License, version 3, is available at https://www.r-project.org/Licenses/GPL-3

Presentation

Our talk at the special DREAM session of the RECOMB 2020 meeting (https://www.recomb2020.org/) can be found here.

Install

install.packages('DCLEAR')

  • Monthly Downloads : 198
  • Version : 1.0.13
  • License : GPL-3
  • Maintainer : Il-Youp Kwak
  • Last Published : September 14th, 2023

Functions in DCLEAR (1.0.13)

  • get_sequence : get_sequence
  • get_replacement_probability : get_replacement_probability
  • get_transition_probability : get_transition_probability
  • as_lineage_tree : Generic function for as_lineage_tree
  • dist_replacement,phyDat,missing,integer-method : Compute the kmer replacement distance
  • dist_replacement,phyDat,kmer_summary,integer-method : Compute the kmer replacement distance
  • process_sequence : Generic function for process_sequence
  • prune,igraph-method : prune
  • sample_mutation_outcome : sample_mutation_outcome
  • get_leaves,lineage_tree-method : get_leaves
  • get_distance_prior : get_distance_prior
  • as_phylo,igraph-method : as_phylo
  • sample_mutation_site : sample_mutation_site
  • get_leaves : Generic function for get_leaves
  • simulate_core : simulate_core
  • downsample,lineage_tree-method : downsample
  • substr_kmer,kmer_summary-method : Subseting a kmer_summary object
  • get_node_names : get_node_names
  • positional_mutation_prob : positional_mutation_prob
  • score_simulation : score_simulation
  • sample_outcome_prob : sample_outcome_prob
  • lineages : Lineage data
  • summarize_kmer_core : summarize_kmer_core
  • rbind,phyDat-method : rbind
  • random_tree : random_tree
  • simulate,lineage_tree_config,phyDat-method : simulate
  • substr_kmer : Generic function for substr_kmer
  • process_sequence,phyDat-method : Process sequences
  • subtract,lineage_tree,lineage_tree-method : subtract
  • summarize_kmer : Generic function for summarize_kmer
  • summarize_kmer,phyDat-method : summarize_kmer
  • downsample : Generic function for downsample
  • prune,lineage_tree-method : prune
  • prune : Generic function for prune
  • subtree : Generic function for subtree
  • simulate : Generic function for simulate
  • subtree,phylo-method : subtree
  • simulate,lineage_tree_config,missing-method : simulate
  • subtract : Generic function for subtract
  • sim_seqdata : sim_seqdata
  • subtree,lineage_tree-method : subtree
  • as_igraph : Generic function for as_igraph
  • add_dropout : add_dropout
  • add_deletion : add_deletion
  • WH_train_fit : Train weights for WH, and output distance object
  • as_phylo : Generic function for as_phylo
  • as_igraph,data.frame-method : as_igraph
  • as_lineage_tree,phyDat,phylo,lineage_tree_config-method : as_lineage_tree
  • WH_train : Train weights for WH
  • dist_kmer_replacement_inference : Core function of computing kmer replacement distance
  • WH : WH
  • DCLEAR : DCLEAR: A package for DCLEAR: Distance based Cell LinEAge Reconstruction
  • dist_replacement : Generic function for dist_replacement
  • downsample,igraph-method : downsample
  • as_igraph,phylo-method : as_igraph
  • dist_weighted_hamming,phyDat,numeric-method : dist_weighted_hamming
  • dist_weighted_hamming : Generic function for dist_weighted_hamming