read.tagged(file, lang = "kRp.env", encoding = NULL,
tagger = "TreeTagger", apply.sentc.end = TRUE, sentc.end = c(".", "!",
"?", ";", ":"), stopwords = NULL, stemmer = NULL, rm.sgml = TRUE)
kRp.POS.tags for all supported languages. If set to "kRp.env", this is got from
"Latin1" or "UTF-8". If NULL, the encoding will either be taken from a preset (if defined in TT.options), or fall back
sentc.end should be searched and set to a sentence ending tag. You could call this a compatibility mode to make sure you get the results you would get if you called
stopwords=tm::stopwords("en") to use the English stopwords provided by the tm package.
stemmer=Snowball::SnowballStemmer if you have the Snowball package installed (or SnowballC::wordStem). As of now, you cannot provide further
kRp.tagged-class. If debug=TRUE, prints internal variable settings and attempts to return the original output of the TreeTagger system call in a matrix.
lang must match a valid language supported by kRp.POS.tags. It will also get stored in the resulting object and might be used by other functions at a later point.[1]
treetag, freq.analysis, get.kRp.env, kRp.tagged-class
tagged.results <- read.tagged("~/my.data/tagged_speech.txt", lang="en")
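The optional arguments can be combined in one call. The following is only a minimal sketch, not a definitive recipe. It assumes a koRpus version that still exports read.tagged with the signature shown in the usage section, that the tm and SnowballC packages are installed, and that the input file is UTF-8 encoded. The file path is the hypothetical one from the example above.

```r
# Hypothetical path, reused from the example above; replace it with your
# own file of TreeTagger output.
tagged.file <- "~/my.data/tagged_speech.txt"

# Check that the optional packages are installed and that the installed
# koRpus version actually exports read.tagged (an assumption of this sketch).
have.pkgs <- requireNamespace("koRpus", quietly = TRUE) &&
  requireNamespace("tm", quietly = TRUE) &&
  requireNamespace("SnowballC", quietly = TRUE) &&
  "read.tagged" %in% getNamespaceExports("koRpus")

if (have.pkgs && file.exists(tagged.file)) {
  tagged.results <- koRpus::read.tagged(
    tagged.file,
    lang = "en",
    encoding = "UTF-8",               # assumption about the input file
    stopwords = tm::stopwords("en"),  # English stopword list from tm
    stemmer = SnowballC::wordStem     # stemming function from SnowballC
  )
}
```

The guard around the call is only there so the sketch does nothing when a package or the file is missing. In an interactive session you would normally call read.tagged directly.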