run_sumup: Run Sum Up

Description

This function runs a series of text processing and analysis steps including text cleaning, tokenization, lemmatization, topic modeling, and sentiment analysis. It then classifies sentences into topics and generates an output summarizing the results.

Usage

run_sumup(dataset, settings = NULL)

Value

A list or JSON output (depending on settings) containing the processed text data classified by topics and sentiment, as well as various metrics related to topics, such as strength and feedback count.

Arguments

dataset: A data frame containing the text data to be analyzed. It should include at least the following columns: sentenceid, sentence, portfolioid, competencyid, feedbacktype, and datereferenced.
settings: A list containing settings for various processing steps. If not provided, default settings are used.
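
As a minimal sketch of what an acceptable input might look like, the block below builds a small data frame with the six required columns and checks that none is missing. The column types, the example values, and the names required_cols and missing_cols are assumptions made for illustration; the source does not specify them.

```r
# Hypothetical input data; column types and values are assumptions.
dataset <- data.frame(
  sentenceid     = 1:3,
  sentence       = c("The student communicates clearly with patients.",
                     "Time management needs improvement.",
                     "Excellent teamwork during the rotation."),
  portfolioid    = c(101, 101, 102),
  competencyid   = c(1, 2, 3),
  feedbacktype   = c("supervisor", "supervisor", "peer"),
  datereferenced = as.Date(c("2023-01-10", "2023-02-15", "2023-03-01")),
  stringsAsFactors = FALSE
)

# Columns the documentation lists as required.
required_cols <- c("sentenceid", "sentence", "portfolioid",
                   "competencyid", "feedbacktype", "datereferenced")

# Fail early if any required column is absent.
missing_cols <- setdiff(required_cols, names(dataset))
if (length(missing_cols) > 0) {
  stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
}
```

A check like this before calling run_sumup may make column-name problems easier to diagnose than an error raised deep inside the pipeline.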

Details

This function performs the following steps:

1. Cleans the input text data using text_clean.
2. Tokenizes the text into sentences and removes stopwords.
3. Lemmatizes and annotates the sentences using a UDPipe model.
4. Counts word frequencies and excludes stopwords.
5. Performs topic modeling on the word counts.
6. Runs sentiment analysis based on the specified method (Grasp or SentimentR).
7. Classifies sentences into topics using the topic classification model.
8. Generates output summarizing the topics and sentiment.
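
To give a rough feel for the cleaning, tokenization, stopword removal, and word counting steps, here is a small base R sketch. It is an illustration only and not the package's implementation: the stopword list is a made-up placeholder, words are split on whitespace, and lemmatization with UDPipe is left out.

```r
# Illustration only: base R stand-ins, not the package's internal functions.
sentences <- c("The feedback was clear and helpful.",
               "The feedback on communication was helpful.")

# Tiny hypothetical stopword list (the package's own list is not shown here).
stopwords <- c("the", "was", "and", "on")

# Clean: convert to lower case and strip punctuation.
cleaned <- gsub("[[:punct:]]", "", tolower(sentences))

# Tokenize on whitespace and drop stopwords from each sentence.
tokens <- strsplit(cleaned, "\\s+")
tokens <- lapply(tokens, function(x) x[!(x %in% stopwords)])

# Count word frequencies across all sentences, most frequent first.
word_counts <- sort(table(unlist(tokens)), decreasing = TRUE)
print(word_counts)
```

In the real pipeline, counts like these would be computed on lemmas rather than raw tokens and then passed to the topic modeling step.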

Examples

data(example_data)
ex_data <- example_data
ex_settings <- set_default_settings()
ex_settings <- update_setting(ex_settings, "language", "en")
ex_settings <- update_setting(ex_settings, "use_sentiment_analysis", "sentimentr")
result <- run_sumup(ex_data, ex_settings)
