
arabicStemR (version 1.3)

removeStopWords: Remove Arabic stopwords.

Description

Defines a list of Arabic-language stopwords and removes them from a string.

Usage

removeStopWords(texts, defaultStopwordList=TRUE, customStopwordList=NULL)

Value

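Returns a list with two components: text, the input string with Arabic stopwords removed, and arabicStopwordList, the list of stopwords that was used.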

Arguments

texts

A string from which Arabic stopwords should be removed.

defaultStopwordList

If TRUE, remove the words in the default stopword list. If FALSE, do not use the default stopword list. Default is TRUE.

customStopwordList

Optional user-specified stopword list of words to be removed, supplied as a vector of strings in either Arabic UTF-8 or Latin characters following the stemmer's transliteration scheme (words without Arabic UTF-8 characters are processed with reverse.transliterate()). Default is NULL.

Author

Rich Nielsen

Examples

## Create string with Arabic characters

x <- '\u0627\u0647\u0644\u0627 \u0648\u0633\u0647\u0644\u0627 \u064a\u0627 \u0635\u062f\u064a\u0642\u064a'

## Remove stop words
removeStopWords(x)$text

## Not run
## To see the full list of stop words 
removeStopWords(x)$arabicStopwordList
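
The customStopwordList argument described above is not exercised by the examples, so the following is a minimal sketch of how it might be used. It assumes the arabicStemR package is installed, and the single-word custom list is a hypothetical choice made only for illustration. No expected output is shown because it depends on the package version.

```r
## Sketch only: assumes arabicStemR is installed.
library(arabicStemR)

## Same sample string as above, written on one line
x <- '\u0627\u0647\u0644\u0627 \u0648\u0633\u0647\u0644\u0627 \u064a\u0627 \u0635\u062f\u064a\u0642\u064a'

## Hypothetical custom list: one word from x, given in Arabic UTF-8
custom <- c('\u0635\u062f\u064a\u0642\u064a')

## Turn off the default list and pass only the custom one
res <- removeStopWords(x, defaultStopwordList = FALSE,
                       customStopwordList = custom)

## The processed string
res$text

## The stopword list that was applied
res$arabicStopwordList
```

Setting defaultStopwordList to FALSE keeps the effect of the custom list easy to see on its own.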
