hyphen(words, ...)
"hyphen"(words, hyph.pattern = NULL, min.length = 4, rm.hyph = TRUE, corp.rm.class = "nonpunct", corp.rm.tag = c(), quiet = FALSE, cache = TRUE)
words: An object of class kRp.tagged-class, kRp.txt.freq-class or kRp.analysis-class, or a character vector with words to be hyphenated.
hyph.pattern: An object of class kRp.hyph.pat-class, or a valid character string naming the language of the patterns to be used. See details.
min.length: hyphen will not split words after the first or before the last letter, so values smaller than 4 are not useful.
corp.rm.class: "nonpunct" has special meaning and will cause the result of kRp.POS.tags(lang, c("punct","sentc"), list.classes=TRUE) to be used. Relevant only if words is a valid koRpus object.
corp.rm.tag: Relevant only if words is a valid koRpus object.
quiet: If FALSE, short status messages will be shown.
cache: hyphen() can cache results to speed up the process. If this option is set to TRUE, the current cache will be queried and new tokens will also be added. Caches are language-specific and reside in an environment, i.e., they are cleared at the end of a session. If you want to save these for later use, see the option hyphen.cache.file in set.kRp.env.
Returns an object of class kRp.hyph-class.
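The remark that min.length values smaller than 4 are not useful follows from simple counting, and the base R sketch below illustrates it. A word of n letters has n - 1 gaps between letters, and the rule excludes the gap after the first letter and the gap before the last one. The helper candidate_splits is hypothetical and not part of koRpus; it only counts positions the rule permits, not the hyphenation points the patterns would actually select.

```r
# Hypothetical helper (not part of koRpus): count the gaps between letters
# where a split is permitted by the rule "never after the first letter,
# never before the last letter".
candidate_splits <- function(word) {
  n <- nchar(word)
  # n - 1 gaps in total, minus the first and the last gap; never negative.
  max(n - 3L, 0L)
}

# Words with fewer than 4 letters leave no permitted position at all.
sapply(c("at", "cat", "word", "hyphen"), candidate_splits)
# at: 0, cat: 0, word: 1, hyphen: 3
```

Because a three-letter word has no permitted position, lowering min.length below its default of 4 cannot produce additional hyphenations.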
If words is already a tagged object,
its language definition might be used. Otherwise, in addition to the words to
be processed, you must specify hyph.pattern. You have two options: If you
want to use one of the built-in language patterns, just set it to the corresponding
language abbreviation. As of this version, valid choices are:
"de" --- German (new spelling, since 1996)
"de.old" --- German (old spelling, 1901--1996)
"en" --- English (UK)
"en.us" --- English (US)
"es" --- Spanish
"fr" --- French
"it" --- Italian
"ru" --- Russian
If you would rather use your own pattern set, hyph.pattern can instead be an
object of class kRp.hyph.pat.
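As a rough sketch of the first option, the call below passes a plain character vector together with one of the built-in language abbreviations listed above. It assumes a koRpus version that ships these built-in patterns, as this page describes. The word list is made up for illustration, and the argument names are taken from the usage line at the top of this page.

```r
# Illustrative input: a plain character vector carries no language
# information, so hyph.pattern has to be given explicitly.
words <- c("hyphenation", "readability", "language")

# Only attempt the call if koRpus is installed.
if (requireNamespace("koRpus", quietly = TRUE)) {
  # "en" selects the built-in English (UK) patterns; this assumes a koRpus
  # version that provides the built-in pattern sets described on this page.
  hyph.en <- koRpus::hyphen(
    words,
    hyph.pattern = "en",
    min.length = 4,   # default; smaller values are not useful
    quiet = TRUE      # suppress the short status messages
  )
  print(hyph.en)
}
```

For the second option, the same call would take an object of class kRp.hyph.pat as hyph.pattern instead of the string "en".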
The built-in hyphenation patterns were derived from the patterns available on CTAN[1]
under the terms of the LaTeX Project Public License[2];
see hyph.XX for detailed information.
[1] http://tug.ctan.org/tex-archive/language/hyph-utf8/tex/generic/hyph-utf8/patterns/
[2] http://www.ctan.org/tex-archive/macros/latex/base/lppl.txt
See also: read.hyph.pat, manage.hyph.pat
## Not run:
# hyphen(tagged.text)
## End(Not run)