stopwords
From tm v0.6-2
by Ingo Feinerer
Stopwords
Return various kinds of stopwords with support for different languages.
- Keywords
- file
Usage
stopwords(kind = "en")
Arguments
- kind
- A character string identifying the desired stopword list.
Details
Available stopword lists are:
catalan
- Catalan stopwords (obtained from http://latel.upf.edu/morgana/altres/pub/ca_stop.htm),
romanian
- Romanian stopwords (extracted from http://snowball.tartarus.org/otherapps/romanian/romanian1.tgz),
SMART
- English stopwords from the SMART information retrieval system (obtained from http://jmlr.csail.mit.edu/papers/volume5/lewis04a/a11-smart-stop-list/english.stop); this list coincides with the stopword list used by the MC toolkit (http://www.cs.utexas.edu/users/dml/software/mc/),
and a set of stopword lists from the Snowball stemmer project in different
languages (obtained from
http://svn.tartarus.org/snowball/trunk/website/algorithms/*/stop.txt).
Supported languages are danish, dutch, english, finnish, french, german,
hungarian, italian, norwegian, portuguese, russian, spanish, and swedish.
Language names are case sensitive. Alternatively, their IETF language tags
may be used.
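As a rough illustration of the two ways to select a Snowball list, the sketch below requests the German list once by language name and once by IETF tag, then checks whether the results match. It assumes the tm package is installed and that "de" is the tag corresponding to german; the variable names are only for illustration.

```r
library(tm)

# Select the German Snowball list by its (case-sensitive) language name
by_name <- stopwords("german")

# Select the same language by its IETF language tag (assumed here to be "de")
by_tag <- stopwords("de")

# Compare the two results and peek at the first few entries
print(identical(by_name, by_tag))
print(head(by_name))
```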
Value
- A character vector containing the requested stopwords. An error is raised if no stopwords are available for the requested kind.
Examples
stopwords("en")
stopwords("SMART")
stopwords("german")
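Because an error is raised for an unavailable kind, callers that accept the kind from user input may want to guard the call. The following is a minimal sketch, assuming the tm package is installed; `safe_stopwords` is a hypothetical helper that is not part of tm, and "klingon" is used only as an example of a kind with no stopword list.

```r
library(tm)

# Hypothetical helper (not part of tm): return an empty character vector
# instead of stopping when no list exists for the requested kind.
safe_stopwords <- function(kind = "en") {
  tryCatch(
    stopwords(kind),
    error = function(e) {
      # Report the problem as a warning and fall back to an empty vector
      warning("No stopwords for kind '", kind, "': ", conditionMessage(e))
      character(0)
    }
  )
}

# A supported kind returns its list; an unsupported one yields character(0)
en_words <- safe_stopwords("en")
missing_words <- suppressWarnings(safe_stopwords("klingon"))
print(length(en_words))
print(length(missing_words))
```

Returning an empty vector keeps downstream filtering code simple, at the cost of hiding the failure; re-raising the error may be preferable when a missing list should halt processing.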