iqspr (version 2.3)

get_descriptor: get a descriptor (fingerprints and/or CDK physical descriptors) from SMILES strings

Description

Get a descriptor (fingerprints and/or CDK physical descriptors) from SMILES strings, with the possibility of requesting scaling (for continuous descriptors, e.g. physical) or re-casting (for binary descriptors, e.g. fingerprints) of the output descriptors.

Usage

get_descriptor(smis = c("C1=CC=C(C=C1)O"), desctypes = c("standard"),
  scale = F, scale_init = F, mdesc = 0, sddesc = 1, quiet = F)

Arguments

smis

is a vector of SMILES strings ("C1=CC=C(C=C1)O", the canonical SMILES of phenol, by default).

desctypes

is a character vector defining the types of fingerprints and/or physical descriptors to compute ("standard" by default). The available fingerprints are "standard", "extended", "graph", "hybridization", "maccs", "estate", "pubchem", "kr", "shortestpath" and "circular"; the available physical descriptors are "constitutional", "topological" and "electronic".

scale

set to TRUE (FALSE by default) to scale the physical descriptors only (i.e. continuous features) to mean = 0, s.d. = 1.

scale_init

set to TRUE (FALSE by default) to keep in memory the mean and s.d. of each descriptor after a first scaling. Once the descriptors have been computed on a training set, the means and s.d. must be kept fixed when computing descriptors on test and/or validation sets; for those later computations, scale_init is set to FALSE.

mdesc

is a scalar (0 by default) or vector of means used for post-scaling the physical descriptors.

sddesc

is a scalar (1 by default) or vector of standard deviations used for post-scaling the physical descriptors.

quiet

keeps the console output quiet if set to TRUE (FALSE by default).

Value

the descriptor(s), together with the associated means and standard deviations used for scaling.

Examples

# NOT RUN {
descriptors <- get_descriptor(smis = "C1=CC=C(C=C1)O", desctypes = c("standard","topological"))

# }
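
The train/test scaling workflow described under scale_init can be illustrated without calling the package. The following is a minimal base-R sketch, not the implementation used by get_descriptor: the matrices train and test and the helper scale_with are hypothetical stand-ins, and the mapping of each step to the scale, scale_init, mdesc and sddesc arguments is an assumption drawn from the argument descriptions above.

```r
# Toy stand-ins for matrices of continuous (physical) descriptors:
# rows are molecules, columns are descriptors. The values are made up.
train <- matrix(c(1, 2, 3, 4,
                  10, 20, 30, 40),
                ncol = 2,
                dimnames = list(NULL, c("desc_a", "desc_b")))
test <- matrix(c(2.5, 25),
               ncol = 2,
               dimnames = list(NULL, c("desc_a", "desc_b")))

# Step 1 (assumed analogue of scale = TRUE, scale_init = TRUE):
# estimate the per-descriptor mean and s.d. on the training set only.
mdesc  <- colMeans(train)
sddesc <- apply(train, 2, sd)

# Hypothetical helper: centre each column by m, then divide by s.
scale_with <- function(x, m, s) {
  sweep(sweep(x, 2, m, "-"), 2, s, "/")
}

# Step 2 (assumed analogue of scale = TRUE, scale_init = FALSE):
# reuse the stored training statistics for every later set,
# so that training and test descriptors share the same scale.
train_scaled <- scale_with(train, mdesc, sddesc)
test_scaled  <- scale_with(test, mdesc, sddesc)
```

Recomputing the mean and s.d. on the test set instead would put the two sets on different scales, which is why the statistics from the first scaling are kept fixed.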
