bind_clinspacy: Bind clinspacy lemma or CUI frequency columns to a data frame

Description

This function adds one column per entity lemma or UMLS concept unique identifier (CUI) to a data frame, with each column holding the frequency of that lemma or CUI. The resulting data frame can be used to train a machine learning model or for additional feature selection.

Usage

bind_clinspacy(
  clinspacy_output,
  df,
  cs_col = NULL,
  df_id = NULL,
  subset = "is_negated == FALSE"
)

Arguments

clinspacy_output

A data.frame or file name containing the output from clinspacy.

df

The data.frame to which you would like to bind the output of clinspacy.

cs_col

Name of the column in the clinspacy_output that you would like to pivot. For example: "entity", "lemma", "cui", or "definition". Defaults to "lemma" if use_linker is set to FALSE and "cui" if use_linker is set to TRUE.

df_id

The name of the id column in the data frame with which the clinspacy_id column in clinspacy_output will be joined. If you supplied a df_id in clinspacy, then you must also supply it here. If you did not supply it in clinspacy, then it will default to the row number (the same behavior as in clinspacy).

subset

Logical criteria, represented as a string, by which the clinspacy_output will be subset prior to building the output data frame. Defaults to "is_negated == FALSE", which removes negated concepts prior to generating the output. Any column in clinspacy_output may be referenced here. To avoid any subsetting, set this to NULL.
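The following base R sketch illustrates how a string criterion such as the subset argument can filter rows. The toy data and the apply_subset helper are hypothetical and written only for illustration; they are an assumption about the semantics described above, not the package's internal implementation.

```r
# Toy stand-in for clinspacy output. The column names follow the arguments
# documented above; the rows are invented for illustration.
toy_output <- data.frame(
  clinspacy_id = c(1, 1, 2, 2),
  lemma = c("fever", "cough", "fever", "rash"),
  is_negated = c(FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: apply a string criterion to a data frame.
# NULL means "keep every row".
apply_subset <- function(data, criteria = NULL) {
  if (is.null(criteria)) {
    return(data)
  }
  # Evaluate the string with the data frame's columns in scope
  keep <- eval(parse(text = criteria), envir = data)
  data[keep, , drop = FALSE]
}

# Default-style criterion: drop negated concepts
kept_default <- apply_subset(toy_output, "is_negated == FALSE")

# No subsetting at all
kept_all <- apply_subset(toy_output, NULL)

# Any column may be referenced in the criterion
kept_fever <- apply_subset(toy_output, "is_negated == FALSE & lemma == 'fever'")
```

Because the criterion is an ordinary logical expression over the columns, conditions can be combined with & and |.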

Value

A data frame containing the original data frame as well as an additional column for each lemma or UMLS concept unique identifier found, with values containing frequencies.
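To make the shape of the return value concrete, here is a minimal base R sketch that builds a frequency table from toy data and binds it to a data frame. The data are invented, and the way rows without any entities are filled (zeros here) is an assumption; the actual function may differ.

```r
# Original data frame with one row per document (toy data)
df <- data.frame(
  note = c("fever and cough", "fever, fever, rash", "no findings"),
  stringsAsFactors = FALSE
)

# Toy stand-in for clinspacy output: one row per detected entity,
# keyed by the row number of the originating document
toy_output <- data.frame(
  clinspacy_id = c(1, 1, 2, 2, 2),
  lemma = c("fever", "cough", "fever", "fever", "rash"),
  stringsAsFactors = FALSE
)

# Count each lemma per document. Using a factor with one level per row
# of df keeps documents that have no entities at all.
ids <- factor(toy_output$clinspacy_id, levels = seq_len(nrow(df)))
counts <- as.data.frame.matrix(table(ids, toy_output$lemma))

# One frequency column per lemma, bound to the original data frame
result <- cbind(df, counts)
print(result)
```

Each lemma becomes its own column, so the result can be passed directly to modeling functions that expect one feature per column.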

Examples


# NOT RUN {
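library(clinspacy)
library(magrittr)  # provides the %>% pipe

# Load the bundled sample dataset
mtsamples <- dataset_mtsamples()

# Extract entities from the first five descriptions, then bind the
# resulting frequency columns back onto the same five rows
mtsamples[1:5,] %>%
  clinspacy(df_col = 'description') %>%
  bind_clinspacy(mtsamples[1:5,])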
# }
