psychtm (version 2021.1.0)

prep_docs: Prepare documents in a data frame for modeling

Description

prep_docs() takes documents stored as a column of a data frame and converts them into a list for modeling. The list contains a matrix representation of the documents and a character vector of the vocabulary.

Usage

prep_docs(data, col, lower = TRUE)

Arguments

data

A data frame containing a column of documents.

col

A character string denoting the column of documents in data.

lower

Should all terms be converted to lowercase? (default: TRUE).

Value

A list with two components:

documents

A matrix of term uses with one row per document and one column per term position, up to the number of terms in the longest document.

vocab

A character vector of the unique terms in the documents.

Examples

# NOT RUN {
data(teacher_rate)  # Synthetic student ratings of instructors
docs_vocab <- prep_docs(teacher_rate, "doc")
str(docs_vocab) # A list with two components `documents` and `vocab`

# }
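
To make the shape of the returned list concrete, the following base-R sketch builds a comparable structure by hand on a two-document toy data frame. It is not the psychtm implementation: the helper name toy_prep_docs, the whitespace tokenization, the alphabetical vocabulary order, the use of vocabulary indices as matrix entries, and the use of 0 to pad unused term positions are all assumptions made for illustration.

```r
# Hypothetical base-R sketch; NOT the psychtm implementation.
# Assumptions: whitespace tokenization, alphabetical vocabulary order,
# matrix entries are indices into `vocab`, and 0 pads unused positions.
toy_prep_docs <- function(data, col, lower = TRUE) {
  text <- as.character(data[[col]])
  if (lower) text <- tolower(text)

  # Split each document into terms on runs of whitespace
  tokens <- strsplit(trimws(text), "\\s+")

  # Unique terms across all documents
  vocab <- sort(unique(unlist(tokens)))

  # One row per document, one column per term position,
  # up to the length of the longest document
  max_len <- max(lengths(tokens))
  documents <- matrix(0L, nrow = length(tokens), ncol = max_len)
  for (i in seq_along(tokens)) {
    n <- length(tokens[[i]])
    if (n > 0) documents[i, seq_len(n)] <- match(tokens[[i]], vocab)
  }

  list(documents = documents, vocab = vocab)
}

# Toy data with a document column named "doc" (made up for this sketch)
toy <- data.frame(doc = c("Great teacher", "Great clear lectures"),
                  stringsAsFactors = FALSE)
out <- toy_prep_docs(toy, "doc")
str(out)  # A list with two components `documents` and `vocab`
```

In this sketch the shorter document is padded on the right, so every row has as many columns as the longest document has terms. Check the output of str(docs_vocab) from the package example to see how prep_docs() itself encodes terms and padding.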