goldi (version 1.0.1)

match: Match terms

Description

Matches terms. The matching routine is implemented in C++.

Usage

match(term_vector, pdf_tdm, term_tdm, thresholds, pdf_index, terms, sentences)

Arguments

term_vector

Index vector giving the position of each term in pdf_tdm: if the ith element of term_vector is j, then term i is at column j of pdf_tdm.

pdf_tdm

Term document matrix of words in the PDF

term_tdm

Term document matrix of words in the terms and PDF sentences.

thresholds

Acceptance thresholds

pdf_index

Index of terms in PDF

terms

List of terms used; this is the vector of column names of term_tdm.

sentences

Vector of sentences read in from the PDF.

Value

List of matched terms.
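
Examples

The relationship between terms, pdf_tdm, and term_vector described under Arguments can be illustrated with a minimal base R sketch. The toy matrix, its row and column names, and the use of base::match to build the index are assumptions made for illustration only; they are not taken from the goldi source, and this page does not state whether the C++ routine expects 1-based or 0-based indices.

```r
# Hypothetical toy term document matrix: rows are sentences, columns are words.
pdf_tdm <- matrix(
  c(1, 0, 2,
    0, 1, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("sentence1", "sentence2"),
                  c("cell", "membrane", "protein"))
)

# Hypothetical terms of interest.
terms <- c("protein", "cell")

# Element i holds the column j of pdf_tdm where term i is found
# (1-based, as is usual in R; NA if the term is absent).
term_vector <- base::match(terms, colnames(pdf_tdm))

term_vector
# [1] 3 1
```

Here term 1 ("protein") sits in column 3 of pdf_tdm and term 2 ("cell") in column 1. The call is written as base::match so that it still reaches the base R function if goldi, which exports its own match, is attached.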