This function joins two tables based on a distance metric computed over one or more columns. It gives the same results as the distance_left_join function of the fuzzyjoin R package, but runs faster. It is used to match mass spectra to short-listed mass-to-charge values in the PredictLogReg and PredictFastClass functions.
d_left_join(x, y, by = NULL, method = "euclidean", max_dist = 1,
  distance_col = NULL)
Returns a data frame containing the result of joining the y dataset onto the x dataset: all rows from x are preserved, and matching data from y are added according to the specified criteria.
x: A data.frame object.
y: A data.frame object.
by: Columns by which to join the two tables. NULL by default (the tables are joined on the column names common to both x and y).
method: Method used to compute the distance, either "euclidean" (default) or "manhattan".
max_dist: A numeric value giving the maximum distance at which two rows are joined. 1 by default.
distance_col: A character string naming a new column to be added to the output, which will contain the calculated distance between the matched rows of the two tables. If NULL (default), no column is added.
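
To make the join semantics concrete, here is a minimal base R sketch of a distance-based left join. It is not the package's implementation: naive_distance_left_join is a hypothetical helper written only for illustration, the example data are invented, and appending .x or .y to every column name is an assumption of this sketch, so the column naming of the real function may differ.

```r
# Hypothetical helper (not part of the package): a naive distance-based left join.
naive_distance_left_join <- function(x, y, by, method = "euclidean",
                                     max_dist = 1, distance_col = NULL) {
  xm <- as.matrix(x[, by, drop = FALSE])
  ym <- as.matrix(y[, by, drop = FALSE])
  rows <- lapply(seq_len(nrow(x)), function(i) {
    # Differences between row i of x and every row of y on the join columns.
    diffs <- sweep(ym, 2, xm[i, ], "-")
    d <- if (method == "manhattan") {
      rowSums(abs(diffs))
    } else {
      sqrt(rowSums(diffs^2))
    }
    hits <- which(d <= max_dist)
    if (length(hits) == 0) {
      # Left join: keep the x row even when no y row is close enough.
      data.frame(xi = i, yi = NA_integer_, d = NA_real_)
    } else {
      data.frame(xi = i, yi = hits, d = d[hits])
    }
  })
  idx <- do.call(rbind, rows)
  # Assumption of this sketch: suffix every column so names stay unique.
  out <- cbind(
    setNames(x[idx$xi, , drop = FALSE], paste0(names(x), ".x")),
    setNames(y[idx$yi, , drop = FALSE], paste0(names(y), ".y"))
  )
  if (!is.null(distance_col)) out[[distance_col]] <- idx$d
  rownames(out) <- NULL
  out
}

# Invented example data: observed peaks and a short list of m/z values.
spectrum  <- data.frame(mz = c(1000.2, 1500.0, 2000.9))
shortlist <- data.frame(mz = c(1000.0, 2001.0, 3000.0),
                        label = c("a", "b", "c"))

res <- naive_distance_left_join(spectrum, shortlist, by = "mz",
                                max_dist = 0.5, distance_col = "dist")
print(res)
```

In this example the first and third peaks each match one short-listed value, while the second peak has no match within max_dist, so its row is kept with NA in the columns coming from y. Going by the signature documented above, the equivalent call with the package function would presumably be d_left_join(spectrum, shortlist, by = "mz", max_dist = 0.5, distance_col = "dist").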