
nmslibR (version 1.0.1)

NMSlib: Non-Metric Space Library

Description

An R6 class that wraps the Non-Metric Space Library (NMSLIB) for approximate nearest-neighbour search.

Usage

# init <- NMSlib$new(input_data, Index_Params = NULL, Time_Params = NULL,
#                           space='l1', space_params = NULL, method = 'hnsw',
#                           data_type = 'DENSE_VECTOR', dtype = 'FLOAT',
#                           index_filepath = NULL, print_progress = FALSE)

Arguments

input_data

the input data. See details for more information

query_data_row

a vector to query for

query_data

the queries to run against the index. The query_data parameter should be of the same type as the input_data parameter

k

an integer. The number of neighbours to return

Index_Params

a list of (optional) parameters to use in indexing (when creating the index)

Time_Params

a list of parameters to use in querying. Setting Time_Params to NULL will reset

space

a character string (optional). The metric space to create for this index. Page 31 of the manual (see references) explains all available inputs

space_params

a list of (optional) parameters for configuring the space. See the manual listed in the references for more details.

method

a character string specifying the index method to use

data_type

a character string. One of 'DENSE_UINT8_VECTOR', 'DENSE_VECTOR', 'OBJECT_AS_STRING' or 'SPARSE_VECTOR'

dtype

a character string. One of 'DOUBLE', 'FLOAT', 'INT'

print_progress

a boolean (either TRUE or FALSE). Whether or not to display a progress bar

num_threads

an integer. The number of threads to use

index_filepath

a character string specifying the path to a file, where an existing index is saved

filename

a character string specifying a file path: the file to save to (for the save_Index method) or the file to load from (for the load_Index method)

Format

An object of class R6ClassGenerator of length 24.

Methods

NMSlib$new(input_data, Index_Params = NULL, Time_Params = NULL, space='l1', space_params = NULL, method = 'hnsw', data_type = 'DENSE_VECTOR', dtype = 'FLOAT', index_filepath = NULL, print_progress = FALSE)

--------------

Knn_Query(query_data_row, k = 5)

--------------

knn_Query_Batch(query_data, k = 5, num_threads = 1)

--------------

save_Index(filename)

Details

input_data parameter: for numeric data, the input_data parameter should be either an R matrix object or a scipy sparse matrix. It can also be a list of two or more matrices / sparse matrices that all have the same number of columns (this is useful, for instance, if the user wants to include both a train and a test dataset in the created index).
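The list form of input_data described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the names train, test and input_list are made up for the example, and the call to NMSlib$new only runs if nmslibR and the Python nmslib module are available.

```r
# Two dense matrices with the same number of columns (10).
# 'train', 'test' and 'input_list' are illustrative names only.
set.seed(1)
train <- matrix(runif(800), nrow = 80, ncol = 10)
test  <- matrix(runif(200), nrow = 20, ncol = 10)

input_list <- list(train, test)

# Check the precondition stated in the details: equal column counts.
same_cols <- length(unique(vapply(input_list, ncol, integer(1)))) == 1
stopifnot(same_cols)

# Build the index only when the package and its Python backend are present.
if (requireNamespace("nmslibR", quietly = TRUE) &&
    reticulate::py_available() &&
    reticulate::py_module_available("nmslib")) {

  init_nms <- nmslibR::NMSlib$new(input_data = input_list)
}
```

Checking the column counts up front makes a mismatch fail early in R rather than inside the index construction.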

The Knn_Query function finds the approximate k nearest neighbours of a single vector in the index.
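To make clear what an approximate query is approximating, the sketch below computes the exact k nearest neighbours by brute force in base R, using the L1 distance that corresponds to the default space = 'l1'. The helper exact_knn_l1 is hypothetical and is not part of nmslibR. Its indices are 1-based row numbers, and its output format is not claimed to match that of Knn_Query.

```r
# Exact k nearest neighbours under the L1 (Manhattan) distance.
# 'exact_knn_l1' is an illustrative helper, not an nmslibR function.
exact_knn_l1 <- function(data, query, k = 5) {
  # Subtract the query from every row, then sum absolute differences per row.
  dists <- rowSums(abs(sweep(data, 2, query)))
  nearest <- order(dists)[seq_len(k)]
  list(index = nearest, distance = dists[nearest])
}

set.seed(1)
x <- matrix(runif(1000), nrow = 100, ncol = 10)

# Query with the first row: its nearest neighbour is itself, at distance 0.
res <- exact_knn_l1(x, x[1, ], k = 5)
```

A brute-force scan like this costs one distance computation per indexed row for every query. An index method such as 'hnsw' trades exactness for faster lookups.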

The knn_Query_Batch function performs multiple queries on the index, distributing the work over a thread pool.

The save_Index function saves the index to disk.

If the index_filepath parameter is not NULL then an existing index will be loaded
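A possible save-and-reload round trip is sketched below. Two points are assumptions rather than documented behaviour: the file name nms_index.bin is arbitrary, and passing input_data again alongside index_filepath assumes that the constructor still needs the data when it loads an existing index. The block does nothing unless nmslibR and the Python nmslib module are available.

```r
# Arbitrary file name in the session's temporary directory (illustrative).
index_file <- file.path(tempdir(), "nms_index.bin")

if (requireNamespace("nmslibR", quietly = TRUE) &&
    reticulate::py_available() &&
    reticulate::py_module_available("nmslib")) {

  set.seed(1)
  x <- matrix(runif(1000), nrow = 100, ncol = 10)

  # Build an index and write it to disk.
  init_nms <- nmslibR::NMSlib$new(input_data = x)
  init_nms$save_Index(filename = index_file)

  # Create a second object that loads the saved index instead of rebuilding it.
  # Assumption: input_data is still supplied when index_filepath is given.
  reloaded <- nmslibR::NMSlib$new(input_data = x, index_filepath = index_file)
  reloaded$Knn_Query(query_data_row = x[1, ], k = 5)
}
```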

References

https://github.com/searchivarius/nmslib/blob/master/manual/manual.pdf

Examples

# NOT RUN {
if (reticulate::py_available() && reticulate::py_module_available("nmslib")) {

  library(nmslibR)

  set.seed(1)
  x = matrix(runif(1000), nrow = 100, ncol = 10)

  init_nms = NMSlib$new(input_data = x)


  # returns a 1-dimensional vector (index, distance)
  #--------------------------------------------------

  init_nms$Knn_Query(query_data_row = x[1, ], k = 5)


  # returns knn's for all data
  #---------------------------

  all_dat = init_nms$knn_Query_Batch(x, k = 5, num_threads = 1)

}
# }
