qdap (version 2.3.2)

word_list: Raw Word Lists/Frequency Counts

Description

Transcript apply: generates raw word lists and frequency counts, optionally by grouping variable(s).

Usage

word_list(text.var, grouping.var = NULL, stopwords = NULL,
  alphabetical = FALSE, cut.n = 20, cap = TRUE, cap.list = NULL,
  cap.I = TRUE, rm.bracket = TRUE, char.keep = NULL,
  apostrophe.remove = FALSE, ...)

Arguments

text.var

The text variable.

grouping.var

The grouping variables. Default NULL generates one word list for all text. Also takes a single grouping variable or a list of 1 or more grouping variables.

stopwords

A vector of stop words to remove.

alphabetical

If TRUE the output of frequency lists is ordered alphabetically. If FALSE the list is ordered by frequency rank.

cut.n

Cut-off point (number of rows kept) for the reduced frequency stop word list (rfswl).

cap

logical. If TRUE capitalizes words from the cap.list.

cap.list

Vector of words to capitalize.

cap.I

logical. If TRUE capitalizes words containing the personal pronoun I.

rm.bracket

logical. If TRUE all brackets and bracketed text are removed from analysis.

char.keep

A character vector of symbols (i.e., punctuation) that word_list should keep. The default is to remove every symbol except apostrophes.

apostrophe.remove

logical. If TRUE removes apostrophes from the output.

...

Other arguments passed to strip.
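The cleaning options above (rm.bracket, char.keep, apostrophe.remove) can be pictured with a small base R sketch. The helper clean_text below is hypothetical and is not part of qdap; it only approximates the documented behaviour, and it assumes digits are dropped along with other symbols.

```r
# Hypothetical helper (not a qdap function): approximates the documented
# effect of rm.bracket, char.keep and apostrophe.remove on raw text.
clean_text <- function(x, rm.bracket = TRUE, char.keep = NULL,
                       apostrophe.remove = FALSE) {
  x <- tolower(x)
  if (rm.bracket) {
    # Drop (), [], {} and <> together with the text they enclose
    x <- gsub("\\([^)]*\\)|\\[[^\\]]*\\]|\\{[^}]*\\}|<[^>]*>", " ", x,
              perl = TRUE)
  }
  # Normalise all whitespace to single spaces before filtering characters
  x <- gsub("\\s+", " ", x)
  # Characters allowed to survive: letters, spaces, optionally apostrophes,
  # plus anything the caller lists in char.keep
  keep <- c(letters, " ", if (!apostrophe.remove) "'", char.keep)
  x <- vapply(strsplit(x, ""), function(ch) {
    paste(ch[ch %in% keep], collapse = "")
  }, character(1))
  # Collapse the gaps left by removed symbols
  trimws(gsub(" +", " ", x))
}

txt <- "Hello, world! [laughs] it's 5-ish"
clean_text(txt)
clean_text(txt, apostrophe.remove = TRUE)
clean_text(txt, char.keep = "-")
```

The real function delegates this work to strip, so the exact handling of edge cases may differ from this sketch.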

Value

An object of class "word_list" is a list of lists of vectors or dataframes containing the following components:

cwl

complete word list; raw words

swl

stop word list; same as cwl with stop words removed

fwl

frequency word list; a data frame of words and corresponding frequency counts

fswl

frequency stopword word list; same as fwl but with stop words removed

rfswl

reduced frequency stopword word list; same as fswl but truncated to cut.n rows
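How the five components relate to one another can be sketched in base R for a single group of text. This is an illustration under stated assumptions, not qdap's implementation: the toy word vector, the helper freq_table, and the column names WORD and FREQ are all chosen here for the example.

```r
# Toy input standing in for the cleaned words of one group
words <- c("the", "cat", "sat", "on", "the", "mat", "the", "cat")
stopwords <- c("the", "on")  # illustrative stop word vector
cut.n <- 2                   # rows to keep in the reduced list

# Hypothetical helper: word frequencies as a data frame, most frequent first
freq_table <- function(w) {
  tab <- sort(table(w), decreasing = TRUE)
  data.frame(WORD = names(tab), FREQ = as.integer(tab),
             stringsAsFactors = FALSE)
}

cwl   <- words                            # complete word list
swl   <- words[!words %in% stopwords]     # stop words removed
fwl   <- freq_table(cwl)                  # frequencies of all words
fswl  <- freq_table(swl)                  # frequencies without stop words
rfswl <- head(fswl, cut.n)                # truncated to cut.n rows

fwl
rfswl
```

With a grouping variable, word_list returns each of these components as a list with one element per group, which is why the Examples index into out1$cwl with lapply.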

Examples

# NOT RUN {
word_list(raj.act.1$dialogue)

out1 <- with(raj, word_list(text.var = dialogue, 
    grouping.var = list(person, act)))
names(out1)
lapply(out1$cwl, "[", 1:5)

with(DATA, word_list(state, person))
with(DATA, word_list(state, person, stopwords = Top25Words))
with(DATA, word_list(state, person, cap = FALSE, cap.list=c("do", "we")))
# }
