rs.compute computes similarity of two reactions.
rs.compute.list computes similarity of two lists of reactions.
rs.compute.sim.matrix computes similarity of reactions in a list.
rs.compute.DB computes similarity of a reaction against a database (parsed from text file).
rs.compute (rxnA, rxnB, format = 'rsmi', standardize = T, explicitH = F, reversible = T, algo = 'msim', sim.method = 'tanimoto', fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024, verbose = F, fpCached = F)
rs.compute.list (rxnA, rxnB, format = 'rsmi', standardize = T, explicitH = F, reversible = T, algo = 'msim', sim.method = 'tanimoto', fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024, clearCache = T)
rs.compute.sim.matrix (rxnA, format = 'rsmi', standardize = T, explicitH = F, reversible = T, algo = 'msim', sim.method = 'tanimoto', fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024, clearCache = T)
rs.compute.DB (rxnA, DB, format = 'rsmi', ecrange = '*', reversible = T, algo = 'msim', sim.method = 'tanimoto', sort = T, fpCached = F)
Arguments
rxnA: [...] rs.compute.list and rs.compute.sim.matrix accept a list of reactions as input.
rxnB: [...] rs.compute.list accepts a list of reactions as input.
DB: [...] rs.makeDB.
standardize: [...] TRUE (default).
explicitH: [...] TRUE. It is set as FALSE by default.
reversible: [...] TRUE (default), reaction(s) are aligned by comparing them in the forward direction and by reversing one of them to compute the maximum similarity value.
algo: [...] 'msim' (default), 'msim_max', 'rsim' and 'rsim2'. See description for the details of the algorithms.
sim.method: [...] 'simple', 'jaccard', 'tanimoto' (default), 'russelrao', 'dice', 'rodgerstanimoto', 'achiai', 'cosine', 'kulczynski2', 'mt', 'baroniurbanibuser', 'tversky', 'robust', 'hamann', 'pearson', 'yule', 'mcconnaughey', 'simpson', 'jaccard-count' and 'tanimoto-count'.
fp.type: [...] 'standard', 'extended' (default), 'graph', 'estate', 'hybridization', 'maccs', 'pubchem', 'kr', 'shortestpath', 'signature' and 'circular'.
fp.mode: [...] 'bit' (default) or 'count'.
fp.depth: [...] 'pubchem', 'maccs', 'kr' and 'estate' fingerprints.
fp.size: [...] 'pubchem', 'maccs', 'kr', 'estate', 'circular' (count mode) and 'signature' fingerprints.
verbose: [...] 'rsim2' algorithm.
sort: [...] rs.compute.DB to return a data frame sorted by decreasing similarity value.
fpCached: [...] FALSE by default.
clearCache: [...] TRUE by default. The cache can also be explicitly cleared using rs.clearCache.
Value
rs.compute: [...]
rs.compute.list: [...]
rs.compute.sim.matrix: [...]
rs.compute.DB: [...]
Details
[...] msim, msim_max, rsim and rsim2.
msim
[...] A similarity value of 0 is assigned to each unpaired molecule. Reaction similarity is then computed by averaging the similarity values for each pair of equivalent molecule(s) and unpaired molecule(s). The molecule equivalences computed can be reviewed using verbose mode in rs.compute.
msim_max
Same as msim, except that the unpaired molecules are not used for computing the average.
rsim
rsim2
For reversible reactions (reversible = TRUE), apart from comparing reactions in the forward direction they are also compared by reversing one of the reactions. The greater of the two similarity values is reported.
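The averaging step of msim and msim_max and the reversible comparison described above can be sketched in base R. This is an illustration only, not RxnSim's implementation: the helper names avg_similarity and reversible_similarity and all numbers are hypothetical, and the pairing of equivalent molecules is assumed to have been done already.

```r
# Hypothetical helper: average similarity over already-paired molecules.
# pair_sims  : similarity values of the paired (equivalent) molecules
# n_unpaired : number of molecules left without a partner
# algo       : 'msim' counts each unpaired molecule as 0 in the average;
#              'msim_max' leaves unpaired molecules out of the average
avg_similarity <- function(pair_sims, n_unpaired,
                           algo = c("msim", "msim_max")) {
  algo <- match.arg(algo)
  if (algo == "msim") {
    sum(pair_sims) / (length(pair_sims) + n_unpaired)
  } else {
    mean(pair_sims)
  }
}

# Hypothetical helper: for reversible reactions, report the greater of the
# forward comparison and the comparison with one reaction reversed.
reversible_similarity <- function(sim_forward, sim_reversed) {
  max(sim_forward, sim_reversed)
}

pair_sims <- c(1.0, 0.5)  # made-up similarities of two molecule pairs
msim_val <- avg_similarity(pair_sims, n_unpaired = 1, algo = "msim")
msim_max_val <- avg_similarity(pair_sims, n_unpaired = 1, algo = "msim_max")
best <- reversible_similarity(0.4, 0.7)
```

With one unpaired molecule, the msim-style average divides by three terms while the msim_max-style average divides by two, so the latter is never smaller.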
Fingerprint Caching
The rs.compute and rs.compute.DB functions can use fingerprint caching. If fpCached is set to TRUE, the cache is queried first before generating fingerprints, and any newly generated fingerprint is stored in the cache. Setting fpCached = FALSE leaves the cache unchanged. The cache can be cleared by calling rs.clearCache.
The rs.compute.list and rs.compute.sim.matrix functions use caching internally. To ensure consistency of fingerprints, rs.clearCache is called internally. Use clearCache = FALSE to override this behaviour; the functions then use the current state of the cache and add new fingerprints to it.
The same cache is used by all functions.
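The caching behaviour described above can be mimicked with a plain R environment. The sketch below is not RxnSim's internal code: fp_cache, make_fp, get_fp and clear_cache are hypothetical stand-ins, and the "fingerprint" is just a placeholder value.

```r
# Hypothetical shared cache, standing in for the package-wide fingerprint cache.
fp_cache <- new.env()

# Placeholder "fingerprint": simply the character count of the input string.
make_fp <- function(smiles) nchar(smiles)

get_fp <- function(smiles, fpCached = FALSE) {
  # With fpCached = TRUE the cache is queried first.
  if (fpCached && exists(smiles, envir = fp_cache, inherits = FALSE)) {
    return(get(smiles, envir = fp_cache))
  }
  fp <- make_fp(smiles)
  # Newly generated fingerprints are stored only when caching is enabled;
  # fpCached = FALSE leaves the cache unchanged.
  if (fpCached) assign(smiles, fp, envir = fp_cache)
  fp
}

# Hypothetical counterpart of clearing the cache.
clear_cache <- function() {
  rm(list = ls(fp_cache, all.names = TRUE), envir = fp_cache)
}

invisible(get_fp("CCO", fpCached = FALSE))
n_after_uncached <- length(ls(fp_cache))
invisible(get_fp("CCO", fpCached = TRUE))
n_after_cached <- length(ls(fp_cache))
clear_cache()
n_after_clear <- length(ls(fp_cache))
```

Because a single environment backs every call, entries added by one function are visible to the next, which is why clearing it first gives a consistent starting state.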
Similarity metrics included in RxnSim. These metrics (except jaccard-count and tanimoto-count) are derived from the fingerprint package.
| ID | Name | Remarks |
| --- | --- | --- |
| simple | Sokal & Michener | bit |
| jaccard | Jaccard | bit |
| tanimoto | Tanimoto (bit) | bit and count |
| jaccard-count | Jaccard (count) | count |
| tanimoto-count | Tanimoto (count) | count ^ |
| dice | Dice (bit) | bit and count |
| russelrao | Russell and Rao | bit |
| rodgerstanimoto | Rogers and Tanimoto | bit |
| achiai | Ochiai | bit |
| cosine | Cosine | bit |
| kulczynski2 | Kulczynski 2 | bit |
| mt | Modified Tanimoto | bit |
| baroniurbanibuser | Baroni-Urbani/Buser | bit |
| robust | Robust (bit) | bit and count |
| tversky | Tversky* | bit |
| hamann | Hamann | bit |
| pearson | Pearson | bit |
| yule | Yule | bit |
| mcconnaughey | McConnaughey | bit |
| simpson | Simpson | bit |
* [...] c('tversky', a, b).
^ tanimoto (bit), dice (bit) and robust (bit) compute similarity of feature vectors (count mode) by translating them to equivalent fingerprint vectors.
The default similarity metric is tanimoto.
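As a rough illustration of two of the bit-mode metrics listed above, the sketch below applies the textbook Tanimoto and Tversky definitions to made-up binary vectors. The function names tanimoto_bit and tversky_bit are hypothetical, the code does not call the fingerprint package, and treating the weights a and b as the two values in c('tversky', a, b) is an assumption.

```r
# Textbook Tanimoto coefficient on binary vectors:
# shared on-bits divided by the on-bits present in either vector.
tanimoto_bit <- function(x, y) {
  common <- sum(x & y)
  common / (sum(x) + sum(y) - common)
}

# Textbook Tversky index with weights a and b on the bits unique to x and y.
tversky_bit <- function(x, y, a, b) {
  common <- sum(x & y)
  only_x <- sum(x & !y)
  only_y <- sum(!x & y)
  common / (a * only_x + b * only_y + common)
}

# Made-up 8-bit fingerprints for illustration.
fp1 <- c(1, 1, 0, 1, 0, 0, 1, 0)
fp2 <- c(1, 0, 0, 1, 1, 0, 1, 0)

t_val <- tanimoto_bit(fp1, fp2)                   # 3 shared bits, 5 in the union
tv_equal <- tversky_bit(fp1, fp2, a = 1, b = 1)   # reduces to Tanimoto
tv_half <- tversky_bit(fp1, fp2, a = 0.5, b = 0.5) # reduces to Dice
```

Setting both weights to 1 reproduces the Tanimoto value, and setting both to 0.5 reproduces the Dice value, which is a quick sanity check on the two definitions.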
List of fingerprints included in RxnSim. These are derived from the rCDK package.
| ID | Name of the Fingerprint | Mode |
| --- | --- | --- |
| standard | Standard | bit |
| extended | Extended | bit |
| estate | EState | bit |
| graph | Graphonly | bit |
| hybridization | Hybridization | bit |
| maccs | MACCS | bit |
| pubchem | Pubchem | bit |
| kr | Klekota-Roth | bit |
| shortestpath | Shortestpath | bit |
| signature | Signature | count |
| circular | Circular | bit and count |
rs.makeDB, rs.clearCache, ms.compute
## Not run:
# Reaction similarity using the 'msim' algorithm; rct1 and rct2 are reaction SMILES strings
rs.compute(rct1, rct2, verbose = TRUE)
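A slightly fuller usage sketch, assuming RxnSim and its Java dependencies are installed. The two reaction SMILES are made up for illustration (two simple esterifications) and are not taken from the package; the argument names follow the usage section of this page, and no output is shown because it depends on the installed versions.

```r
# Made-up reaction SMILES for illustration; not from the package documentation.
rct1 <- "CCO.CC(=O)O>>CC(=O)OCC.O"
rct2 <- "CO.CC(=O)O>>CC(=O)OC.O"

# Only attempt the calls when RxnSim is available.
if (requireNamespace("RxnSim", quietly = TRUE)) {
  library(RxnSim)

  # Similarity of two reactions, spelling out some of the documented defaults.
  print(rs.compute(rct1, rct2, format = "rsmi", algo = "msim",
                   sim.method = "tanimoto", fp.type = "extended"))

  # Similarity of all reactions in a list against each other.
  print(rs.compute.sim.matrix(list(rct1, rct2), format = "rsmi"))
}
```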