This is a wrapper function that takes a 'raw' input data table with compound information and uniformizes the SMILES.
cleanDB(db.formatted, cl, silent = TRUE, blocksize, smitype = "Canonical")
db.formatted: Data table with columns 'compoundname, structure, baseformula, charge, description'
cl: parallel::makeCluster object for multithreading
silent: Suppress warnings?
blocksize: How many compounds to process per 'block'? A higher number means bigger memory spikes, but faster processing time.
smitype: SMILES format, Default: 'Canonical'
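The block-wise processing that blocksize controls can be pictured with a small base-R sketch. This is not the package's actual implementation; process_in_blocks and trim_structure are hypothetical names introduced here for illustration, and the toy cleaning step only trims whitespace instead of re-generating SMILES. It only shows why larger blocks mean fewer, heavier chunks of work.

```r
# Hypothetical helper, not part of the package: process a table in row blocks.
process_in_blocks <- function(df, blocksize, fun, cl = NULL) {
  # Nothing to split for an empty table.
  if (nrow(df) == 0) return(df)
  # Assign each row to a block of at most `blocksize` rows.
  block_id <- ceiling(seq_len(nrow(df)) / blocksize)
  blocks <- split(df, block_id)
  # Run blocks on the cluster if one is given, otherwise sequentially.
  results <- if (is.null(cl)) {
    lapply(blocks, fun)
  } else {
    parallel::clusterApply(cl, blocks, fun)
  }
  # Stitch the per-block results back together in the original row order.
  out <- do.call(rbind, results)
  rownames(out) <- NULL
  out
}

# Toy input with the documented column names; the values are illustrative only.
toy <- data.frame(
  compoundname = c("water", "methane", "ethanol"),
  structure    = c(" O", "C ", "CCO"),
  baseformula  = c("H2O", "CH4", "C2H6O"),
  charge       = c(0, 0, 0),
  description  = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Stand-in for the real cleaning step: it only trims whitespace.
trim_structure <- function(block) {
  block$structure <- trimws(block$structure)
  block
}

cleaned <- process_in_blocks(toy, blocksize = 2, fun = trim_structure)
```

With blocksize = 2 the three toy rows are handled as two blocks. Passing a cluster created with parallel::makeCluster as cl would distribute the same blocks over worker processes.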
Data table with SMILES in the correct format, and charge/formula re-generated from said SMILES if available.
clusterApply pbapply check_chemform rbindlist
# NOT RUN {
myDB = build.LMDB(tempdir())
# }
# NOT RUN {
cleanedDB = cleanDB(myDB$db, cl = 0, blocksize = 10)
# }