This method takes either a character vector or an object inheriting from class kRp.text (i.e., text tokenized by koRpus) and jumbles the words. This usually means that the first and last letters of each word are left intact, while all characters in between are randomized.
jumbleWords(words, ...)
# S4 method for kRp.text
jumbleWords(words, min.length = 3, intact = c(start = 1, end = 1))
# S4 method for character
jumbleWords(words, min.length = 3, intact = c(start = 1, end = 1))
words
Either a character vector or an object inheriting from class kRp.text.
...
Additional options, currently unused.
min.length
An integer value defining the minimum word length. Words with fewer characters will not be changed. Grapheme clusters are counted as one character.
intact
A named vector with two integer values named start and end. These define how many characters of each relevant word are left unchanged at its start and its end, respectively.
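To make the interplay of min.length and intact more concrete, the following base R sketch mimics the behaviour described above. The helper `jumble_word` is hypothetical and written only for illustration; it is not the implementation used by koRpus, and it splits words by code point rather than by grapheme cluster.

```r
# Hypothetical helper for illustration only, not part of koRpus.
# Assumption: characters are split by code point, not by grapheme cluster.
jumble_word <- function(word, min.length = 3, intact = c(start = 1, end = 1)) {
  chars <- strsplit(word, "")[[1]]
  n <- length(chars)
  n_middle <- n - intact[["start"]] - intact[["end"]]
  # Leave short words, and words with fewer than two movable characters, alone.
  if (n < min.length || n_middle < 2) {
    return(word)
  }
  middle_idx <- seq(intact[["start"]] + 1, n - intact[["end"]])
  # Shuffle by index to avoid sample()'s special case for length-one input.
  chars[middle_idx] <- chars[middle_idx][sample.int(length(middle_idx))]
  paste(chars, collapse = "")
}

set.seed(42)
words <- c("a", "of", "the", "jumble", "randomized")
jumbled <- vapply(words, jumble_word, character(1), USE.NAMES = FALSE)
print(jumbled)
```

With the defaults, "a", "of" and "the" come back unchanged because they have fewer than two characters between the protected start and end, while longer words keep their first and last letter and only their inner letters are reordered.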
Depending on the class of words, either a character vector or an object of class kRp.text with the added feature diff.
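For the character method, a call might look like the sketch below. It assumes that koRpus is installed and is skipped otherwise; the example words and the chosen values for min.length and intact are arbitrary, and the result differs between runs because the jumbling is random.

```r
# Assumption: koRpus is installed; otherwise the call is skipped.
words <- c("This", "sentence", "will", "be", "jumbled", "randomly")
jumbled <- NULL
if (requireNamespace("koRpus", quietly = TRUE)) {
  # Ask for the first two characters and the last character to stay in place,
  # and for words shorter than five characters to be left untouched.
  jumbled <- koRpus::jumbleWords(
    words,
    min.length = 5,
    intact = c(start = 2, end = 1)
  )
  print(jumbled)
}
```

Since the input here is a plain character vector, the result should again be a character vector, without the diff feature that is added to kRp.text objects.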
# NOT RUN {
# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  tokenized.obj <- jumbleWords(tokenized.obj)
  pasteText(tokenized.obj)

  # diff stats are now part of the object
  hasFeature(tokenized.obj)
  diffText(tokenized.obj)
} else {}
# }