This function calculates the similarity between documents using TF-IDF weighting and cosine similarity.
calc_doc_sim(
  text_data,
  text_column = "abstract",
  min_term_freq = 2,
  max_doc_freq = 0.9
)

Returns a similarity matrix for the documents.
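text_data: A data frame containing the text data.
text_column: Name of the column containing the text to analyze. Defaults to "abstract".
min_term_freq: Minimum frequency for a term to be included. Defaults to 2.
max_doc_freq: Maximum document frequency (as a proportion of documents) for a term to be included. Defaults to 0.9.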
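
To make the roles of the arguments concrete, here is a minimal base-R sketch of TF-IDF weighting followed by cosine similarity. It is not the implementation of calc_doc_sim: the helper name sketch_doc_sim is hypothetical, and the tokenization (lowercasing, splitting on non-letters), the reading of min_term_freq as a total count across all documents, and the IDF formula log(N / df) are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; NOT the source of calc_doc_sim().
sketch_doc_sim <- function(text_data, text_column = "abstract",
                           min_term_freq = 2, max_doc_freq = 0.9) {
  docs <- tolower(as.character(text_data[[text_column]]))

  # Assumed tokenization: split on any run of non-letter characters.
  tokens <- lapply(strsplit(docs, "[^a-z]+"),
                   function(x) x[!is.na(x) & nzchar(x)])
  vocab <- sort(unique(unlist(tokens)))

  # Document-term matrix of raw counts (rows = documents, columns = terms).
  dtm <- matrix(0, nrow = length(tokens), ncol = length(vocab),
                dimnames = list(NULL, vocab))
  for (i in seq_along(tokens)) {
    if (length(tokens[[i]]) > 0) {
      counts <- table(tokens[[i]])
      dtm[i, names(counts)] <- as.numeric(counts)
    }
  }

  # Assumed filters: total term count >= min_term_freq, and the share of
  # documents containing the term <= max_doc_freq.
  term_freq <- colSums(dtm)
  doc_freq <- colSums(dtm > 0) / nrow(dtm)
  dtm <- dtm[, term_freq >= min_term_freq & doc_freq <= max_doc_freq,
             drop = FALSE]

  # TF-IDF: raw term counts scaled by log(N / document frequency).
  idf <- log(nrow(dtm) / colSums(dtm > 0))
  tfidf <- sweep(dtm, 2, idf, `*`)

  # Cosine similarity: normalize each row to unit length, then take dot products.
  norms <- sqrt(rowSums(tfidf^2))
  norms[norms == 0] <- 1  # documents with no retained terms get similarity 0
  unit <- tfidf / norms   # row i is divided by norms[i]
  tcrossprod(unit)
}

# Toy data, invented for this example.
docs <- data.frame(
  abstract = c("cats chase mice", "cats chase birds", "stocks fell sharply"),
  stringsAsFactors = FALSE
)
sim <- sketch_doc_sim(docs, min_term_freq = 1)
```

In this sketch the result is a square, symmetric matrix with one row and one column per document. The first two toy documents share terms and so have a similarity strictly between 0 and 1, while the third shares no terms with either and has similarity 0 to both. Note that min_term_freq is lowered to 1 here because the toy corpus is tiny; with the default of 2, only "cats" and "chase" would survive the filter.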