eiPerformanceTest(runId, distance=getDefaultDist(descriptorType),
    conn=defaultConn(dir), dir=".", K=200, W=1.39564, M=19, L=10, T=30)
runId: the run identifier returned by eiMakeDb. If you are coming from an older version of eiR, you should now use this value instead of specifying r, d, and descriptorType.
Results of the first test are stored in dir/run-r-d. The comparison results are summarized in two types of files. The first type lists the recall for different k values, k being the number of results to retrieve. These files are named "recall-ratio-k". For example, if the recall is 70% for a top-100 compound search (70 of the 100 results are among the real top-100 compounds), then the value at line 100 is 0.7. Several relaxation ratios are used, each generating a file in this form. For instance, recall.ratio-10 is the file listing the recalls when the relaxation ratio is 10. The other file, recall.csv, lists the recalls for different relaxation ratios in one file, limited to selected k values. In this CSV file, the rows correspond to different relaxation ratios and the columns to different k values. From it you can pick an appropriate relaxation ratio for the k values you are interested in.
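The recall figure described above can be illustrated with a few lines of base R. This is only a sketch of the definition (the fraction of the true top-k compounds that appear in the approximate top-k), not the code eiPerformanceTest runs internally; the function name recall_at_k and the compound IDs are hypothetical.

```r
# Recall at k: share of the true top-k IDs that also appear in the
# approximate top-k result list. (Hypothetical helper, for illustration.)
recall_at_k <- function(true_ids, approx_ids, k) {
  true_top   <- head(true_ids, k)
  approx_top <- head(approx_ids, k)
  length(intersect(true_top, approx_top)) / k
}

# Made-up data: the true top-100 compounds are IDs 1..100; the approximate
# search returns 70 of them plus 30 unrelated compounds.
true_ids   <- 1:100
approx_ids <- c(1:70, 201:230)

# One recall value per k, analogous to one value per line in a recall file.
recall_curve <- vapply(seq_len(100),
                       function(k) recall_at_k(true_ids, approx_ids, k),
                       numeric(1))

recall_100 <- recall_curve[100]
print(recall_100)  # 0.7
```

With this layout, element 100 of recall_curve plays the role of "the value at line 100" in the text.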
The second test measures the performance of the Locality Sensitive Hash (LSH).
The results for LSH-assisted search will be in
run-r-d/indexed.performance. It is a 1,000-line file of recall values; each
line corresponds to one test query. LSH search performance is
highly sensitive to your LSH parameters (K, W, M, L, T). The
default parameters are listed in the man page for
eiPerformanceTest. When you have your embedding result in
a matrix file, you should follow instruction on
eiInit
eiMakeDb
eiQuery
library(eiR)   # provides eiInit, eiMakeDb, eiPerformanceTest
library(snow)  # provides makeCluster
r <- 50
d <- 40
# initialize
data(sdfsample)
dir <- file.path(tempdir(), "perf")
dir.create(dir)
eiInit(sdfsample, dir=dir)
# create compound db
runId <- eiMakeDb(r, d, numSamples=20, dir=dir,
    cl=makeCluster(1, type="SOCK", outfile=""))
eiPerformanceTest(runId, dir=dir, K=22)
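Once the test has run, the per-query recall values in indexed.performance can be summarized with base R. The sketch below assumes the file holds one numeric recall value per line, as the description suggests; to stay self-contained it writes synthetic values to a temporary file instead of reading real eiPerformanceTest output.

```r
# Assumption: indexed.performance contains one recall value per line.
# Synthetic stand-in data, NOT real eiPerformanceTest results.
set.seed(1)
perf_file <- file.path(tempdir(), "indexed.performance")
writeLines(as.character(round(runif(1000), 3)), perf_file)

# Read the recall values back as a numeric vector.
recalls <- scan(perf_file, quiet = TRUE)

# Summarize recall across the test queries.
summary_stats <- c(mean   = mean(recalls),
                   median = median(recalls),
                   min    = min(recalls))
print(summary_stats)
```

Comparing such summaries across runs with different K, W, M, L, and T values is one simple way to see how sensitive the search is to the LSH parameters.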