RxnSim (version 1.0.1)

ms.compute: Computes Similarity of Molecules

Description

Computes chemical similarity between two (or more) input molecules.

Usage

ms.compute (molA, molB, format = 'smiles', standardize = T, explicitH = F, sim.method = 'tanimoto', fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024, fpCached = F)

ms.compute.sim.matrix (molA, format = 'smiles', standardize = T, explicitH = F, sim.method = 'tanimoto', fp.type = 'extended', fp.mode = 'bit', fp.depth = 6, fp.size = 1024, clearCache = T)

Arguments

molA
input molecule in SMILES format, or the name (with path) of an MDL MOL file. ms.compute.sim.matrix accepts a list of molecules as input.
molB
input molecule in SMILES format, or the name (with path) of an MDL MOL file.
format
specifies the format of the input molecule(s). Molecule(s) can be provided in one of the following formats: 'SMILES' (default) or 'MOL'.
standardize
suppresses all explicit hydrogens if set to TRUE (default).
explicitH
converts all implicit hydrogens to explicit ones if set to TRUE. The default is FALSE.
sim.method
similarity metric to be used to evaluate molecule similarity. Allowed types include: 'simple', 'jaccard', 'tanimoto' (default), 'russelrao', 'dice', 'rodgerstanimoto', 'achiai', 'cosine', 'kulczynski2', 'mt', 'baroniurbanibuser', 'tversky', 'robust', 'hamann', 'pearson', 'yule', 'mcconnaughey', 'simpson', 'jaccard-count' and 'tanimoto-count'.
fp.type
fingerprint type to use. Allowed types include: 'standard', 'extended' (default), 'graph', 'estate', 'hybridization', 'maccs', 'pubchem', 'kr', 'shortestpath', 'signature' and 'circular'.
fp.mode
fingerprint mode to be used. It can either be set to 'bit' (default) or 'count'.
fp.depth
search depth for fingerprint construction. This argument is ignored for 'pubchem', 'maccs', 'kr' and 'estate' fingerprints.
fp.size
length of the fingerprint bit string. This argument is ignored for 'pubchem', 'maccs', 'kr', 'estate', 'circular' (count mode) and 'signature' fingerprints.
fpCached
boolean that enables fingerprint caching. It is set to FALSE by default.
clearCache
boolean that resets the cache before (and after) processing molecule lists. It is set to TRUE by default. The cache can also be cleared explicitly with rs.clearCache.
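
To make the sim.method argument more concrete, the base R sketch below computes two of the listed coefficients, Tanimoto and Dice, on a pair of made-up bit fingerprints. The vectors fpA and fpB and the helpers tanimoto_bits and dice_bits are hypothetical and follow the textbook definitions; they do not call RxnSim, and the package's own implementation may differ in detail.

```r
# Hypothetical 8-bit fingerprints (TRUE = bit set). Real fingerprints are much
# longer, e.g. fp.size = 1024 by default.
fpA <- c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE)
fpB <- c(TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE)

# Tanimoto: bits set in both / bits set in either.
tanimoto_bits <- function(a, b) {
  either <- sum(a | b)
  if (either == 0) return(NA_real_)  # undefined when neither fingerprint has a bit set
  sum(a & b) / either
}

# Dice: 2 * bits set in both / (bits set in a + bits set in b).
dice_bits <- function(a, b) {
  total <- sum(a) + sum(b)
  if (total == 0) return(NA_real_)   # undefined when neither fingerprint has a bit set
  2 * sum(a & b) / total
}

tanimoto_bits(fpA, fpB)  # 3 shared bits / 5 bits in the union = 0.6
dice_bits(fpA, fpB)      # 2 * 3 / (4 + 4) = 0.75
```

Both coefficients equal 1 for identical fingerprints and 0 when no bits are shared, but they weight partial overlap differently, which is why the choice of sim.method changes the reported value.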

Value

Returns similarity value(s).
ms.compute
returns a similarity value.
ms.compute.sim.matrix
returns an $m \times m$ symmetric matrix of similarity values, where $m$ is the length of the input list.
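
The shape of the matrix result can be illustrated without the package. The sketch below fills an $m \times m$ matrix by applying a pairwise similarity function to every pair of items in a list. The helper pairwise_matrix, the toy fingerprints in fps and the tanimoto_bits function are assumptions made for illustration; they are not RxnSim functions and do not reproduce its fingerprinting.

```r
# Toy bit fingerprints standing in for three molecules (hypothetical data).
fps <- list(
  a = c(TRUE, TRUE, FALSE, TRUE),
  b = c(TRUE, FALSE, FALSE, TRUE),
  c = c(FALSE, FALSE, TRUE, TRUE)
)

# Textbook Tanimoto coefficient on logical vectors.
tanimoto_bits <- function(x, y) sum(x & y) / sum(x | y)

# Build an m x m matrix from any symmetric pairwise similarity function.
pairwise_matrix <- function(items, sim) {
  m <- length(items)
  out <- matrix(NA_real_, nrow = m, ncol = m,
                dimnames = list(names(items), names(items)))
  for (i in seq_len(m)) {
    for (j in i:m) {
      # Compute each pair once and mirror it across the diagonal.
      out[i, j] <- out[j, i] <- sim(items[[i]], items[[j]])
    }
  }
  out
}

sim <- pairwise_matrix(fps, tanimoto_bits)
sim
```

Because each pair is computed once and mirrored, only m(m + 1)/2 similarity evaluations are needed, and the diagonal holds each item's similarity with itself.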

Details

See the rs.compute functions for details on fingerprints and similarity matrices. ms.compute can use fingerprint caching when the fpCached option is enabled. ms.compute and ms.compute.sim.matrix use the same cache as rs.compute and the other functions in the package.

See Also

rs.compute, rs.clearCache

Examples

ms.compute('N', '[H]N([H])[H]', standardize = FALSE)