sentimentr (version 2.6.1)

extract_profanity_terms: Extract Profanity Words

Description

Extract the profanity words from a text.

Usage

extract_profanity_terms(text.var,
  profanity_list = lexicon::profanity_alvarez, ...)

Arguments

text.var

The text variable. Can be a get_sentences object or a raw character vector, though get_sentences is preferred, as it avoids the repeated cost of sentence boundary disambiguation every time profanity is run.
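For example, a minimal sketch of the two accepted input forms (the strings are illustrative placeholders, not from the package):

txt <- c("You are a damn fine friend.", "I am not happy.")
sent <- get_sentences(txt)        # sentence splitting done once, up front
extract_profanity_terms(sent)     # preferred: pass the get_sentences object
extract_profanity_terms(txt)      # also accepted: raw vector, re-split on each call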

profanity_list

An atomic character vector of profane words. The lexicon package has lists that can be used (a short sketch of supplying one appears after the argument descriptions), including:

  • lexicon::profanity_alvarez

  • lexicon::profanity_arr_bad

  • lexicon::profanity_banned

  • lexicon::profanity_zac_anger

...

Ignored.
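
Any of the lists above can be supplied as profanity_list; a minimal sketch (the example string is an illustrative placeholder):

extract_profanity_terms(
  get_sentences("That was a damn sloppy throw."),
  profanity_list = lexicon::profanity_banned
)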

Value

Returns a data.table with columns of profane terms.

Examples

# NOT RUN {
# Sample four profane words from a lexicon and embed them in example text
bw <- sample(lexicon::profanity_alvarez, 4)
mytext <- c(
   sprintf('do you like this %s?  It is %s. But I hate really bad dogs', bw[1], bw[2]),
   'I am the best friend.',
   NA,
   sprintf('I %s hate this %s', bw[3], bw[4]),
   "Do you really like it?  I'm not happy"
)


# Split into sentences first (preferred input), then score profanity
x <- get_sentences(mytext)
profanity(x)

# Extract the profane terms used in each sentence
prof_words <- extract_profanity_terms(x)
prof_words
prof_words$sentence
prof_words$neutral
prof_words$profanity
data.table::as.data.table(prof_words)

# Summaries are also stored as attributes on the returned object
attributes(extract_profanity_terms(x))$counts
attributes(extract_profanity_terms(x))$elements


# A larger example using the crowdflower_deflategate data
brady <- get_sentences(crowdflower_deflategate)
brady_swears <- extract_profanity_terms(brady)

attributes(extract_profanity_terms(brady))$counts
attributes(extract_profanity_terms(brady))$elements
# }
