eiQuery(runId, queries, format="sdf",
        dir=".", distance=getDefaultDist(descriptorType), conn=defaultConn(dir),
        asSimilarity=FALSE, K=200, W=1.39564, M=19, L=10, T=30, lshData=NULL,
        mainIds=readIddb(conn, file.path(dir, Main), sorted=TRUE))

eiMakeDb. If you're coming from an older version of eiR, you should
now use this value instead of
specifying r, d, refIddb, and descriptorType.

queries is either a filename of an SDF file, or an SDFset object;
"compound_id" when queries is a list of id numbers; and "name" when queries
is a list of compound names, as returned by cid(apset).

loadLSHData. The LSH data is generally the largest
chunk of data that must be held in memory while performing a query. Since it
remains the same across queries, it makes sense to pre-load this data once when
doing multiple queries. If this value is NULL, the LSH data will be loaded internally and then released before
eiQuery returns.

asSimilarity is true then instead of a "distance"
column there will be a "similarity" column.

r, d, and
refIddb parameters. The queries can be given in a few
different formats; see the queries parameter for details.
The LSH algorithm is used to quickly identify compounds similar to the
queries.
This function must use a distance function rather than a similarity function.
However, if the distance function given returns values between 0 and 1, then
the asSimilarity parameter may be used to return similarity values rather
than distance values.

eiInit
eiMakeDb
eiPerformanceTest

library(eiR)
library(snow)
r<- 50
d<- 40
#initialize
data(sdfsample)
dir=file.path(tempdir(),"query")
dir.create(dir)
eiInit(sdfsample,dir=dir)
#create compound db
runId=eiMakeDb(r,d,numSamples=20,dir=dir,
cl=makeCluster(1,type="SOCK",outfile=""))
#find compounds similar to each query
results = eiQuery(runId,sdfsample[1:2],K=15,dir=dir)
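
The asSimilarity option described above can be illustrated with a small base R sketch. It assumes the conversion is similarity = 1 - distance, which is the usual relationship for distances bounded between 0 and 1; the package itself may compute it differently. The data frame, its column names other than "distance" and "similarity", and the helper to_similarity are hypothetical and not part of eiR.

```r
# Hypothetical query result with a distance column in [0, 1].
# The "query" and "target" column names are assumed for illustration.
results <- data.frame(
  query    = c("q1", "q1", "q2"),
  target   = c("c1", "c2", "c3"),
  distance = c(0, 0.25, 1)
)

# Hypothetical helper: swap the "distance" column for a "similarity" column.
# Assumes similarity = 1 - distance, which only makes sense for
# distances that lie between 0 and 1.
to_similarity <- function(df) {
  stopifnot(all(df$distance >= 0 & df$distance <= 1))
  df$similarity <- 1 - df$distance
  df$distance <- NULL
  df
}

sim <- to_similarity(results)
print(sim)
```

The range check mirrors the restriction stated in the text: a distance function that can return values above 1 has no meaningful counterpart under this conversion.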