MSclassifR (version 0.4.0)

d_left_join: Function joining two tables based on a distance rather than on exact matches

Description

This function joins two tables based on a distance metric computed on one or more columns. It gives the same results as the distance_left_join function of the fuzzyjoin R package, but runs faster. It is used to match mass spectra to short-listed mass-to-charge values in the PredictLogReg and PredictFastClass functions.

Usage

d_left_join(x, y, by = NULL, method = "euclidean", max_dist = 1,
            distance_col = NULL)

Value

Returns a data frame containing the result of left-joining the y dataset onto the x dataset: all rows from x are preserved, and data from y is added for each pair of rows whose distance does not exceed max_dist.
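
The left-join behaviour described above can be illustrated with a minimal base R sketch. This is not the MSclassifR implementation: the helper name sketch_left_join, the ".x"/".y" suffixes and the "dist" column are assumptions made for illustration, and the sketch handles a single numeric join column only.

```r
# Illustrative sketch only (hypothetical helper, not part of MSclassifR):
# a distance-based left join on ONE numeric column, written in base R.
sketch_left_join <- function(x, y, by, max_dist = 1) {
  # Absolute difference between every value of x[[by]] and every value of y[[by]]
  d <- abs(outer(x[[by]], y[[by]], "-"))
  # (row of x, row of y) index pairs whose distance is within the tolerance
  hits <- which(d <= max_dist, arr.ind = TRUE)
  # Rows of x without any match are kept and paired with NA
  unmatched <- setdiff(seq_len(nrow(x)), hits[, "row"])
  xi <- c(hits[, "row"], unmatched)
  yi <- c(hits[, "col"], rep(NA_integer_, length(unmatched)))
  # Restore the original row order of x
  ord <- order(xi, yi)
  xi <- xi[ord]
  yi <- yi[ord]
  # Disambiguate the join column (assumed naming convention)
  names(x)[names(x) == by] <- paste0(by, ".x")
  names(y)[names(y) == by] <- paste0(by, ".y")
  out <- cbind(x[xi, , drop = FALSE], y[yi, , drop = FALSE])
  # Distance of each matched pair; NA where x had no match
  out$dist <- d[cbind(xi, yi)]
  rownames(out) <- NULL
  out
}

# Toy mass-to-charge values (made up for the example)
x <- data.frame(mz = c(1000.2, 2500.0, 4000.9))
y <- data.frame(mz = c(1000.0, 4001.0), label = c("peak_A", "peak_B"))
res <- sketch_left_join(x, y, by = "mz", max_dist = 0.5)
print(res)
```

In this toy example the second row of x has no counterpart in y within 0.5, so it is kept with NA in the columns coming from y.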

Arguments

x

A data.frame object.

y

A data.frame object.

by

Columns by which to join the two tables. If NULL (default), the tables are joined on the column names common to both x and y.

method

Method to use for computing distance, either "euclidean" (default) or "manhattan".
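
As a quick reminder of how the two metrics differ, the following sketch computes both for a pair of rows compared on two columns. It uses the standard textbook definitions; the vectors a and b are made-up values, and the package's internal computation is not shown here.

```r
# Two rows compared on two join columns (made-up values)
a <- c(1, 2)
b <- c(4, 6)

# Euclidean distance: square root of the sum of squared differences
euclidean <- sqrt(sum((a - b)^2))

# Manhattan distance: sum of absolute differences
manhattan <- sum(abs(a - b))

print(c(euclidean = euclidean, manhattan = manhattan))
```

With a single join column the two metrics coincide, since both reduce to the absolute difference between the two values.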

max_dist

A numeric value indicating the maximum distance to use for joining. 1 by default.

distance_col

A character string giving the name of a new column to add to the output, which will contain the distance calculated for each pair of matched rows. If NULL (default), no column is added.