inpdfr (version 0.1.5)

postProcTxt: Process vectors containing words into a data.frame of word occurrences.

Description

Process vectors containing words into a data.frame of word occurrences.

Usage

postProcTxt(txt, minword = 1, maxword = 20, minFreqWord = 1)

Arguments

txt

A vector containing text.

minword

An integer specifying the minimum number of letters a word must have to be included in the returned data.frame.

maxword

An integer specifying the maximum number of letters a word may have to be included in the returned data.frame.

minFreqWord

An integer specifying the minimum frequency a word must have to be included in the returned data.frame.

Value

A data.frame (freq = occurrences, stem = stem words, word = words), sorted by word occurrences.
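
To illustrate the shape of such a result, here is a minimal base R sketch of counting word occurrences with length and frequency filters. It is not the inpdfr implementation: `countWords` is a hypothetical helper written for this illustration, it assumes the input is already split into one word per element, and it omits the stem column entirely.

```r
# Hypothetical helper, NOT part of inpdfr: a base R sketch of counting
# word occurrences. Assumes `txt` holds one word per element; no stemming.
countWords <- function(txt, minword = 1, maxword = 20, minFreqWord = 1) {
  words <- tolower(as.character(txt))
  # Keep only words whose length lies within [minword, maxword]
  len <- nchar(words)
  words <- words[len >= minword & len <= maxword]
  if (length(words) == 0) {
    return(data.frame(freq = integer(0), word = character(0),
                      stringsAsFactors = FALSE))
  }
  # Tabulate occurrences of each distinct word
  counts <- table(words)
  res <- data.frame(freq = as.integer(counts),
                    word = names(counts),
                    stringsAsFactors = FALSE)
  # Drop rare words, then sort by decreasing frequency (ties alphabetical)
  res <- res[res$freq >= minFreqWord, , drop = FALSE]
  res <- res[order(-res$freq, res$word), , drop = FALSE]
  rownames(res) <- NULL
  res
}

demo <- countWords(c("lorem", "ipsum", "lorem", "dolor", "a", "lorem", "ipsum"),
                   minword = 2, minFreqWord = 2)
print(demo)
```

In this sketch, "a" is dropped by `minword = 2` and "dolor" by `minFreqWord = 2`, leaving "lorem" (3 occurrences) ahead of "ipsum" (2).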

Examples

# NOT RUN {
postProcTxt(txt = preProcTxt(filetxt = "loremIpsum.txt"))
# }
# NOT RUN {
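# Load the example text data set
data("loremIpsum")
# Create an output directory and write the text to a file inside it
subDir <- "RESULTS"
dir.create(file.path(getwd(), subDir), showWarnings = FALSE)
write(x = loremIpsum, file = "RESULTS/loremIpsum.txt")
# Pre-process the file, then build the word-occurrence data.frame
preProcTxt(filetxt = paste0(getwd(), "/RESULTS/loremIpsum.txt"))
postProcTxt(txt = preProcTxt(filetxt = paste0(getwd(), "/RESULTS/loremIpsum.txt")))
# Remove files matching "loremIpsum" from the working directory
file.remove(list.files(pattern = "loremIpsum"))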
# }