r reference compounds in d dimensions.
eiMakeDb(refs, d, descriptorType="ap", distance=getDefaultDist(descriptorType),
    dir=".", numSamples=getGroupSize(conn, name=file.path(dir,Main)) * 0.1,
    conn=defaultConn(dir), cl=makeCluster(1,type="SOCK",outfile=""),
    connSource=NULL)
refs
can be one of three things: the filename of an iddb file giving the index values of the reference compounds to use, a vector of index values, or a scalar value giving the number of randomly selected references to use.
This function can also be used to set up the environment on the cluster worker nodes. For example, you might need to re-load libraries such as RSQLite.
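The three accepted forms of refs can be told apart by type and length. The following is a minimal base-R sketch of that distinction, not eiR's actual dispatch logic; the helper name classifyRefs and the file name refs.iddb are hypothetical and used only for illustration.

```r
# Hypothetical helper (not part of eiR): report how a refs argument
# would be read under the three forms described above.
classifyRefs <- function(refs) {
  if (is.character(refs) && length(refs) == 1) {
    "iddb file name"                 # file listing reference index values
  } else if (is.numeric(refs) && length(refs) == 1) {
    "number of random references"    # scalar: pick this many at random
  } else if (is.numeric(refs)) {
    "vector of index values"         # explicit reference compounds
  } else {
    stop("refs must be a file name, an index vector, or a single count")
  }
}

classifyRefs("refs.iddb")     # a file name
classifyRefs(c(3, 17, 42))    # explicit index values
classifyRefs(50)              # a count of random references
```

Under this reading, a single number is always taken as a count, so a one-element index vector cannot be told apart from a request for that many random references.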
dir
("run-r-d" by default).
The return value is an id number called the runId, which needs to be given to other functions such as eiQuery or eiAdd.
eiMakeDb will pick numSamples non-reference samples which can later be used by the eiPerformanceTest function.
eiMakeDb does its job in a job folder named after the number of reference compounds and the number of embedding dimensions. For example, using 300 reference compounds to generate a 100-dimensional embedding (r=300, d=100) will result in a job folder called run-300-100.
The embedding result is the file matrix.
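The job folder naming described above can be reproduced in a few lines of base R, for instance to locate the folder after a run. This is only a sketch: jobFolder is a hypothetical helper, not an eiR function, and it assumes the folder is created directly under dir.

```r
# Hypothetical helper (not part of eiR): build the expected job folder
# path from the number of references r, the dimensions d, and dir.
jobFolder <- function(r, d, dir = ".") {
  # as.integer avoids scientific notation for large values of r or d
  file.path(dir, sprintf("run-%d-%d", as.integer(r), as.integer(d)))
}

# Folder for r=300, d=100 relative to the current directory
jobFolder(300, 100)
```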
eiInit
eiPerformanceTest
eiQuery
eiCluster
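library(eiR)   # provides eiInit and eiMakeDb
library(snow)  # provides makeCluster

r <- 50  # number of reference compounds
d <- 40  # number of embedding dimensions

# initialize the compound database in a temporary directory
data(sdfsample)
dir <- file.path(tempdir(), "makedb")
dir.create(dir)
eiInit(sdfsample, dir=dir)

# create the embedded compound db; the returned runId identifies this run
runId <- eiMakeDb(r, d, numSamples=20, dir=dir,
    cl=makeCluster(1, type="SOCK", outfile=""))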