kind argument and returns the stopword list as a character vector. The default is English.

stopwords(kind = "english", verbose = FALSE)
Values for kind include english, SMART, danish, french, hungarian, norwegian, russian, swedish.
verbose: if FALSE, suppress the annoying warning note

stopwords("english")[1:5]
stopwords("italian")[1:5]
stopwords("arabic")[1:5]
# adding to the built-in stopword list
toks <- tokenize("The judge will sentence Mr. Adams to nine years in prison", removePunct = TRUE)
removeFeatures(toks, c(stopwords("english"), "will", "mr", "nine"))
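
The idea of extending a stopword list and filtering tokens with it can also be sketched in base R, without quanteda. This is only an illustration under stated assumptions: the three-word base_stops vector is a hypothetical stand-in for stopwords("english"), not the package's actual list, and the tokenization is a simple whitespace split after stripping punctuation and lowercasing.

```r
# Hypothetical stand-in for stopwords("english"); the real list is much longer.
base_stops <- c("the", "to", "in")

# Extend the list with extra words, as in the example above.
custom_stops <- c(base_stops, "will", "mr", "nine")

txt <- "The judge will sentence Mr. Adams to nine years in prison"

# Strip punctuation, lowercase, and split on whitespace.
words <- strsplit(tolower(gsub("[[:punct:]]", "", txt)), "\\s+")[[1]]

# Keep only the words that are not in the extended stopword list.
kept <- words[!words %in% custom_stops]
kept
```

Because the stopword list is an ordinary character vector, c() is enough to add words to it before filtering.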