
The Stopwords ISO dataset is a comprehensive collection of stopwords for multiple languages. Each language is identified by its two-letter ISO 639-1 code (for example, en for English and de for German).
stopwords
A named list of character vectors, one per language, each holding the stopwords for that language. The languages covered are:
Afrikaans
Arabic
Armenian
Basque
Bengali
Breton
Bulgarian
Catalan
Chinese
Croatian
Czech
Danish
Dutch
English
Esperanto
Estonian
Finnish
French
Galician
German
Greek
Hausa
Hebrew
Hindi
Hungarian
Indonesian
Irish
Italian
Japanese
Korean
Kurdish
Latin
Lithuanian
Latvian
Malay
Marathi
Norwegian
Persian
Polish
Portuguese
Romanian
Russian
Slovak
Slovenian
Somali
Southern Sotho
Spanish
Swahili
Swedish
Thai
Tagalog
Turkish
Ukrainian
Urdu
Vietnamese
Yoruba
Zulu
# NOT RUN {
stopwords$en
# [1] "'ll" "'tis" "'twas" ...
stopwords$de
# [1] "a" "ab" "aber" "ach" ...
# }
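Because the dataset is a list of character vectors keyed by language code, filtering tokens comes down to a set-membership test. The following is a minimal sketch, not part of the package: `sw` is a small hypothetical stand-in for the real `stopwords` list so the code runs on its own, and `remove_stopwords` is an illustrative helper name.

```r
# Hypothetical stand-in for the dataset so this sketch runs without the package.
# With the package loaded, pass the real `stopwords` list as `words` instead.
sw <- list(
  en = c("the", "a", "of", "is"),
  de = c("der", "die", "und")
)

# Drop the stopwords of one language from a character vector of tokens.
# Matching is done on lower-cased tokens; the original casing is kept.
remove_stopwords <- function(tokens, lang, words = sw) {
  stopifnot(lang %in% names(words))
  tokens[!tolower(tokens) %in% words[[lang]]]
}

tokens <- c("The", "quick", "fox", "is", "a", "friend", "of", "mine")
filtered <- remove_stopwords(tokens, "en")
print(filtered)
```

Lower-casing before the comparison is an assumption about how the lists are stored; the excerpts shown above suggest lower-case entries, but check the language you need.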