This function joins two tables based on a distance metric computed over one or more columns. It gives the same results as the distance_left_join function of the fuzzyjoin R package, but it runs faster. It is used to match mass spectra to short-listed mass-to-charge values in the PredictLogReg and PredictFastClass functions.
d_left_join(x, y, by = NULL, method = "euclidean", max_dist = 1,
            distance_col = NULL)
Returns a data frame that contains the result of joining the x and y datasets: all rows from x are preserved, and matching rows from y are added according to the specified criteria.
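The left-join behaviour described above can be illustrated with a naive base R sketch. This is not the package's implementation: sketch_distance_left_join is a hypothetical helper written only for illustration, it compares every row of x with every row of y, and it assumes the join columns are numeric. The way shared column names are suffixed (.x and .y) is also an assumption of this sketch.

```r
# Hypothetical helper, NOT the package's code: a naive distance-based
# left join that compares each row of x with every row of y.
sketch_distance_left_join <- function(x, y, by, method = "euclidean",
                                      max_dist = 1, distance_col = NULL) {
  xm <- as.matrix(x[, by, drop = FALSE])
  ym <- as.matrix(y[, by, drop = FALSE])

  # Suffix column names shared by both tables so the output is unambiguous
  shared <- intersect(names(x), names(y))
  xo <- x
  yo <- y
  names(xo)[names(x) %in% shared] <- paste0(names(x)[names(x) %in% shared], ".x")
  names(yo)[names(y) %in% shared] <- paste0(names(y)[names(y) %in% shared], ".y")

  pieces <- lapply(seq_len(nrow(x)), function(i) {
    # Differences between row i of x and every row of y, per join column
    diffs <- sweep(ym, 2, xm[i, ], "-")
    d <- if (method == "euclidean") sqrt(rowSums(diffs^2)) else rowSums(abs(diffs))
    hit <- which(d <= max_dist)
    if (length(hit) == 0) {
      # No match: keep the row of x and fill the y columns with NA
      hit <- NA_integer_
      d_out <- NA_real_
    } else {
      d_out <- d[hit]
    }
    out <- cbind(xo[rep(i, length(hit)), , drop = FALSE],
                 yo[hit, , drop = FALSE])
    if (!is.null(distance_col)) out[[distance_col]] <- d_out
    out
  })

  res <- do.call(rbind, pieces)
  rownames(res) <- NULL
  res
}

# Made-up example data: three peaks and three candidate m/z values
x <- data.frame(mz = c(1000.2, 2000.5, 3000.0), intensity = c(10, 20, 30))
y <- data.frame(mz = c(1000.0, 2000.9, 5000.0), label = c("a", "b", "c"))
res <- sketch_distance_left_join(x, y, by = "mz", max_dist = 0.5,
                                 distance_col = "dist")
print(res)
```

In this toy data the first two rows of x each find one neighbour in y within 0.5, while the third row has none and is kept with NA in the columns coming from y.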
x: A data.frame object.
y: A data.frame object.
by: Columns by which to join the two tables. NULL by default (tables are joined on the column names common to both x and y).
method: Method to use for computing distance, either "euclidean" (default) or "manhattan".
max_dist: A numeric value indicating the maximum distance to use for joining. 1 by default.
distance_col: A character string that specifies the name of a new column to be added to the output, which will contain the calculated distances between both tables. If NULL (default), no column is added.
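To make the interplay of the method and max_dist arguments concrete, the short sketch below computes both distances by hand for one pair of rows and then, only if d_left_join is available in the session, calls it with the arguments documented above. The column names mz and rt and all numbers are made up for illustration.

```r
# Two rows that differ by 3 in one join column and by 4 in the other
a <- c(mz = 1000, rt = 10)
b <- c(mz = 1003, rt = 14)

euclidean <- sqrt(sum((a - b)^2))  # straight-line distance
manhattan <- sum(abs(a - b))       # sum of absolute differences

# With the same threshold, the pair matches under one metric only
max_dist <- 6
within <- c(euclidean = euclidean <= max_dist,
            manhattan = manhattan <= max_dist)
print(within)

# The same pair as one-row tables, joined on both columns.
# Guarded so the block also runs when d_left_join is not loaded.
x <- data.frame(mz = 1000, rt = 10)
y <- data.frame(mz = 1003, rt = 14)
if (exists("d_left_join", mode = "function")) {
  print(d_left_join(x, y, by = c("mz", "rt"), method = "euclidean",
                    max_dist = 6, distance_col = "dist"))
}
```

Because the Manhattan distance is never smaller than the Euclidean one, a given max_dist is at least as strict under "manhattan" when more than one join column is used.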