ft_lsh_utils

ml_approx_nearest_neighbors

ml_approx_similarity_join

model

A fitted LSH model, returned by either <code>ft_minhash_lsh()</code> or <code>ft_bucketed_random_projection_lsh()</code>.

dataset

The dataset to search for nearest neighbors of the key.

key

Feature vector representing the item to search for.

num_nearest_neighbors

The maximum number of nearest neighbors to return.

dist_col

Output column for storing the distance between each result row and the key.

dataset_a

One of the datasets to join.

dataset_b

The other dataset to join.

threshold

The distance threshold for row pairs; only pairs whose distance is below this value are returned.
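
The arguments above can be seen together in a short session. This is a minimal sketch rather than an official example: it assumes a local Spark installation reachable through <code>spark_connect(master = "local")</code>, and the table name <code>points</code>, the toy coordinates, and the values chosen for <code>bucket_length</code>, <code>num_hash_tables</code>, and <code>threshold</code> are all made up for illustration. It also assumes the fitted model is obtained by calling <code>ml_fit()</code> on the estimator created from the connection.

```r
library(sparklyr)
library(dplyr)

# Assumption: a local Spark installation is available.
sc <- spark_connect(master = "local")

# Hypothetical toy data: four points in the plane.
points <- data.frame(
  id = 1:4,
  x  = c(1, 1, -1, -1),
  y  = c(1, -1, -1, 1)
)

# Copy to Spark and assemble the numeric columns into a vector column.
points_tbl <- copy_to(sc, points, "points", overwrite = TRUE) %>%
  ft_vector_assembler(input_cols = c("x", "y"), output_col = "features")

# Create the LSH estimator from the connection, then fit it to get a model.
# The parameter values here are illustrative, not recommendations.
lsh <- ft_bucketed_random_projection_lsh(
  sc,
  input_col       = "features",
  output_col      = "hashes",
  bucket_length   = 2,
  num_hash_tables = 3
)
model <- ml_fit(lsh, points_tbl)

# Approximate nearest neighbors of a key vector.
neighbors <- ml_approx_nearest_neighbors(
  model,
  dataset               = points_tbl,
  key                   = c(1, 0),
  num_nearest_neighbors = 2,
  dist_col              = "distCol"
)

# Approximate similarity join of the dataset with itself.
pairs <- ml_approx_similarity_join(
  model,
  dataset_a = points_tbl,
  dataset_b = points_tbl,
  threshold = 1.5,
  dist_col  = "distCol"
)

print(neighbors)
print(pairs)

spark_disconnect(sc)
```

Because both functions are approximate, the rows returned can vary with the hash parameters and the random seed, so the sketch prints the results without asserting what they contain.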

R interface to Apache Spark, a fast and general engine for big data
processing, see <http://spark.apache.org>. This package supports connecting to
local and remote Apache Spark clusters, provides a 'dplyr' compatible back-end,
and provides an interface to Spark's built-in machine learning algorithms.

Yitao Li

sparklyr

R Interface to Apache Spark

Javier Luraschi

Kevin Kuo

Kevin Ushey

JJ Allaire

Samuel Macedo

Hossein Falaki

Lu Wang

Andy Zhang

Jozef Hajnala

Maciej Szymkiewicz

Wil Davis

 RStudio

 The Apache Software Foundation

ft_lsh_utils function

Utility functions for LSH models — ft_lsh_utils
