hyphen(words, ...)
## S3 method for class 'kRp.taggedText':
hyphen(words, hyph.pattern = NULL,
min.length = 4, rm.hyph = TRUE, corp.rm.class = "nonpunct",
corp.rm.tag = c(), quiet = FALSE, cache = TRUE)
## S3 method for class 'character':
hyphen(words, hyph.pattern = NULL, min.length = 4,
rm.hyph = TRUE, corp.rm.class = "nonpunct", corp.rm.tag = c(),
quiet = FALSE, cache = TRUE)
words: An object of one of the kRp.taggedText classes (such as kRp.tagged-class or kRp.txt.freq-class), or a character vector of words.
hyph.pattern: An object of class kRp.hyph.pat-class, or a valid character string naming the language of the patterns to be used. See details.
min.length: hyphen will not split words after the first or before the last letter, so values smaller than 4 are not useful.
corp.rm.class: "nonpunct" has special meaning and will cause the result of kRp.POS.tags(lang, c("punct","sentc"), list.classes=TRUE) to be used. Relevant only if words is a valid koRpus object.
quiet: If FALSE, short status messages will be shown.
cache: hyphen() can cache results to speed up the process. If this option is set to TRUE,
the current cache will be queried and new tokens will also be added. Caches are language-specific and reside in an environment, i.e., th
kRp.hyphen-class
hyph.XX
If words is already a tagged object,
its language definition might be used. Otherwise, in addition to the words to
be processed you must specify hyph.pattern. You have two options: If you
want to use one of the built-in language patterns, just set it to the corresponding
language abbreviation. As of this version, valid choices are:
"de", "de.old", "en", "en.us", "es", "fr", "it", "ru"
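To illustrate the first option, the call below is a minimal sketch. It assumes that the koRpus version described on this page is installed and that its built-in "en" patterns are available; the sample words and the variable names sample.words and hyph.result are made up for the example.

```r
# Assumption: the koRpus version documented here is installed.
library(koRpus)

# Hypothetical sample input: a plain character vector of words.
sample.words <- c("hyphenation", "syllable", "pattern")

# Select the built-in English patterns by their language abbreviation.
# quiet = TRUE turns off the short status messages; cache = TRUE lets
# hyphen() query and extend the language-specific cache.
hyph.result <- hyphen(
  sample.words,
  hyph.pattern = "en",
  min.length = 4,
  quiet = TRUE,
  cache = TRUE
)

# Inspect the class of the returned object.
class(hyph.result)
```

Because the input here is a character vector rather than a tagged koRpus object, hyph.pattern has to be given explicitly; there is no language definition the function could fall back on.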
read.hyph.pat,
manage.hyph.pat
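
The remark that values of min.length below 4 are not useful can be checked with a little arithmetic: if no split may occur after the first or before the last letter, a word of n letters only has candidate split points after letters 2 to n - 2. The following base R sketch is not part of koRpus and does not consult any hyphenation patterns; the helper name candidate.splits is hypothetical and merely enumerates the positions that this rule leaves open.

```r
# Hypothetical helper, not part of koRpus: list the positions after which a
# word could be split if no split is allowed after the first or before the
# last letter.
candidate.splits <- function(word) {
  n <- nchar(word)
  # A split after letter p requires p >= 2 and p <= n - 2,
  # which is impossible for words shorter than four letters.
  if (n < 4) {
    return(integer(0))
  }
  seq.int(2L, n - 2L)
}

candidate.splits("cat")     # no positions: three letters leave no legal split
candidate.splits("word")    # only position 2, i.e. "wo-rd"
candidate.splits("hyphen")  # positions 2, 3 and 4
```

These are only the positions the rule permits; whether hyphen actually splits a word at any of them depends on the pattern set in use.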