Rcpi (version 1.8.0)

searchDrug: Parallelized Drug Molecule Similarity Search by Molecular Fingerprints Similarity or Maximum Common Substructure Search

Description

Parallelized Drug Molecule Similarity Search by Molecular Fingerprints Similarity or Maximum Common Substructure Search

Usage

searchDrug(mol, moldb, cores = 2, method = c("fp", "mcs"),
           fptype = c("standard", "extended", "graph", "hybrid", "maccs",
                      "estate", "pubchem", "kr", "shortestpath", "fp2",
                      "fp3", "fp4", "obmaccs"),
           fpsim = c("tanimoto", "euclidean", "cosine", "dice", "hamming"),
           mcssim = c("tanimoto", "overlap"), ...)

Arguments

mol
The query molecule, given as the path to an SDF file containing one molecule.
moldb
The molecule database, given as the path to an SDF file containing all the molecules to search against.
cores
Integer. The number of CPU cores to use for the parallel search; the default is 2. Use the detectCores() function in the parallel package to check how many cores are available.
method
Either 'fp' (search by molecular fingerprint similarity) or 'mcs' (search by maximum common substructure).
fptype
The fingerprint type; used only when method = 'fp'. Rcpi supports 13 fingerprint types: 'standard', 'extended', 'graph', 'hybrid', 'maccs', 'estate', 'pubchem', 'kr', 'shortestpath', 'fp2', 'fp3', 'fp4', and 'obmaccs'.
fpsim
The similarity measure for fingerprints; used only when method = 'fp'. One of 'tanimoto', 'euclidean', 'cosine', 'dice', or 'hamming'. See calcDrugFPSim for details.
mcssim
The similarity measure for maximum common substructure search; used only when method = 'mcs'. Either 'tanimoto' or 'overlap'.
...
Other parameters for the maximum common substructure search; see calcDrugMCSSim for the available options.
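
As a minimal sketch of how a value for cores might be chosen, the snippet below queries the machine with parallel::detectCores(). The policy of leaving one core free is an assumption made for illustration, not something Rcpi requires.

```r
library(parallel)

# detectCores() can return NA on some platforms, so fall back to 1
available <- detectCores()
if (is.na(available)) available <- 1L

# Hypothetical policy (an assumption): leave one core free, never go below 1
cores <- max(1L, available - 1L)
cores
```

The resulting value could then be passed as the cores argument of searchDrug().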

Value

A named numeric vector containing the similarity values of the molecules in the database, sorted in decreasing order.
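
Because the result is a named numeric vector already sorted by decreasing similarity, ordinary vector operations are enough to pick out the best hits. The sketch below uses a made-up vector of the same shape; the names and scores are purely illustrative and do not come from an actual search.

```r
# Hypothetical result with the shape described above: a named numeric
# vector sorted by decreasing similarity (names and scores are made up)
sims <- c(mol3 = 0.91, mol7 = 0.64, mol1 = 0.40, mol5 = 0.12)

# The two most similar molecules
top2 <- head(sims, 2)

# All molecules at or above an arbitrary, illustrative cutoff of 0.5
above <- sims[sims >= 0.5]
names(above)
```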

Details

This function performs a compound similarity search for one query compound against a set of molecules. Similarity is computed either from molecular fingerprints, using one of several similarity measures, or by maximum common substructure search.
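
To illustrate the idea behind a fingerprint-based search, the base R sketch below scores a query against a small database with the Tanimoto coefficient on binary fingerprints (shared on-bits divided by the total distinct on-bits) and ranks the results. It is not Rcpi's implementation: the tanimoto() helper, the 8-bit fingerprints, and the molecule names are all hypothetical.

```r
# Tanimoto coefficient for two binary fingerprints of equal length:
# (bits set in both) / (bits set in either)
tanimoto <- function(a, b) {
  both <- sum(a & b)
  either <- sum(a | b)
  if (either == 0) return(0)  # convention chosen here for all-zero inputs
  both / either
}

# Hypothetical 8-bit fingerprints, for illustration only
query <- c(1, 1, 0, 1, 0, 0, 1, 0)
db <- list(
  molA = c(1, 1, 0, 1, 0, 0, 1, 0),
  molB = c(1, 0, 0, 1, 0, 1, 1, 0),
  molC = c(0, 0, 1, 0, 1, 0, 0, 1)
)

# Score every database entry against the query, then rank the scores
scores <- vapply(db, tanimoto, numeric(1), b = query)
ranked <- sort(scores, decreasing = TRUE)
ranked
```

The ranked vector has the same shape as the value returned by searchDrug(): names identify the molecules and the scores appear in decreasing order.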

Examples

library(Rcpi)

# Query molecule: DrugBank ID DB00530 (Erlotinib)
mol <- system.file('compseq/DB00530.sdf', package = 'Rcpi')
# Database built by searching 'tyrphostin' in PubChem and filtering by Lipinski's Rule of Five
moldb <- system.file('compseq/tyrphostin.sdf', package = 'Rcpi')

# Fingerprint search: MACCS fingerprints with Hamming similarity
searchDrug(mol, moldb, cores = 4, method = 'fp', fptype = 'maccs', fpsim = 'hamming')
# Fingerprint search: FP2 fingerprints with Tanimoto similarity
searchDrug(mol, moldb, cores = 4, method = 'fp', fptype = 'fp2', fpsim = 'tanimoto')
# Maximum common substructure search with Tanimoto similarity
searchDrug(mol, moldb, cores = 4, method = 'mcs', mcssim = 'tanimoto')