read.tagged(file, lang = "kRp.env", encoding = NULL, tagger = "TreeTagger", apply.sentc.end = TRUE, sentc.end = c(".", "!", "?", ";", ":"), stopwords = NULL, stemmer = NULL, rm.sgml = TRUE)
lang: See kRp.POS.tags for all supported languages. If set to "kRp.env", this is taken from get.kRp.env.
encoding: The character encoding of the input file, such as "Latin1" or "UTF-8". If NULL, the encoding will either be taken from a preset (if defined in TT.options), or fall back to "". Hence you can overwrite the preset encoding with this parameter.
apply.sentc.end: Whether the tokens defined in sentc.end should be searched and set to a sentence ending tag. You could call this a compatibility mode to make sure you get the results you would get if you called treetag on the original file. If set to FALSE, the tags will be imported as they are.
stopwords: For instance, set stopwords=tm::stopwords("en") to use the English stopwords provided by the tm package.
stemmer: For instance, set stemmer=Snowball::SnowballStemmer if you have the Snowball package installed (or SnowballC::wordStem). As of now, you cannot provide further arguments to this function.
Returns an object of class kRp.tagged-class. If debug=TRUE, prints internal variable settings and attempts to return the original output of the TreeTagger system call in a matrix.
The value of lang must match a valid language supported by kRp.POS.tags. It is also stored in the resulting object and might be used by other functions at a later point.
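The precedence described for the encoding argument (explicit value first, then a preset, then "") can be sketched in base R. This is only an illustration of the documented fallback order, not the koRpus implementation: resolve.encoding and its preset.encoding parameter are hypothetical names introduced here, and how a preset is actually looked up in TT.options is not shown.

```r
# Hypothetical helper, NOT part of koRpus: mirrors the documented
# precedence "explicit encoding > preset > empty string".
resolve.encoding <- function(encoding = NULL, preset.encoding = NULL) {
  if (!is.null(encoding)) {
    return(encoding)         # an explicit value overrides any preset
  }
  if (!is.null(preset.encoding)) {
    return(preset.encoding)  # assumed to come from a preset (e.g. TT.options)
  }
  ""                         # documented fallback when nothing is defined
}

resolve.encoding("Latin1", preset.encoding = "UTF-8")  # explicit value is used
resolve.encoding(NULL, preset.encoding = "UTF-8")      # preset is used
resolve.encoding()                                     # falls back to ""
```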
[1] http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html
treetag, freq.analysis, get.kRp.env, kRp.tagged-class
## Not run:
# tagged.results <- read.tagged("~/my.data/tagged_speech.txt", lang="en")
# ## End(Not run)
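To illustrate what apply.sentc.end = TRUE is described as doing, here is a minimal base R sketch that re-tags every token found in sentc.end with a sentence ending tag. It is an assumption-laden illustration rather than the package's code: mark.sentc.end is a hypothetical helper, the two-column data frame is a simplified stand-in for tagged input, and "SENT" as well as the other tags in the toy data are assumed tag names used only for this example.

```r
# Hypothetical helper, NOT the koRpus implementation.
# "SENT" is an assumed name for the sentence ending tag.
mark.sentc.end <- function(tagged,
                           sentc.end = c(".", "!", "?", ";", ":"),
                           sentc.tag = "SENT") {
  # find tokens that count as sentence endings
  is.end <- tagged[["token"]] %in% sentc.end
  # overwrite their tag; all other rows are left as imported
  tagged[is.end, "tag"] <- sentc.tag
  tagged
}

# Toy stand-in for tagged input (tags are illustrative only)
tagged <- data.frame(
  token = c("Wait", ";", "stop", "!"),
  tag   = c("VV", ":", "VV", "SENT"),
  stringsAsFactors = FALSE
)

marked <- mark.sentc.end(tagged)
marked
```

Passing an empty sentc.end to the helper leaves all tags untouched, which corresponds to the documented behaviour of importing the tags as they are.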