
quanteda (version 0.7.2-1)

removeFeatures: remove features from an object

Description

This function removes features from a variety of objects, such as text, a dfm, or a list of collocations. The most common usage for removeFeatures will be to eliminate stop words from a text or text-based object. Some commonly used built-in stop words can be accessed through stopwords.

Usage

removeFeatures(x, stopwords = NULL, verbose = TRUE, ...)

## S3 method for class 'character'
removeFeatures(x, stopwords = NULL, verbose = TRUE, ...)

## S3 method for class 'dfm'
removeFeatures(x, stopwords = NULL, verbose = TRUE, ...)

## S3 method for class 'collocations'
removeFeatures(x, stopwords = NULL, verbose = TRUE, pos = c(1, 2, 3), ...)

stopwordsRemove(x, stopwords = NULL, verbose = TRUE)

Arguments

x
object from which stopwords will be removed
stopwords
character vector of features to remove. An explicit list must be supplied, for instance stopwords("english").
verbose
if TRUE, print a message reporting how many features were removed
...
additional arguments for some methods (such as pos for collocations)
pos
indexes of the word positions to check when called on collocations: a collocation is removed if the word at position pos is a stopword

Value

  • an object with stopwords removed

Details

Because we believe the user should take full responsibility for any features that are removed, removeFeatures does not provide a default list. Supply one explicitly, for instance with stopwords("english").
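To make the idea of an explicit removal list concrete, here is a minimal base R sketch of dropping listed words from a single string. It is not quanteda's implementation: remove_words is a hypothetical helper, and the whitespace tokenization and case-insensitive matching are assumptions made for illustration only.

```r
# Hypothetical helper, not part of quanteda: drop listed words from one string.
remove_words <- function(text, words, verbose = TRUE) {
  # Split on whitespace; punctuation stays attached to its token.
  tokens <- strsplit(text, "\\s+")[[1]]
  # Match case-insensitively, ignoring leading and trailing punctuation.
  bare <- tolower(gsub("^[[:punct:]]+|[[:punct:]]+$", "", tokens))
  drop <- bare %in% tolower(words)
  if (verbose) message("removed ", sum(drop), " of ", length(tokens), " tokens")
  # Reassemble the tokens that were kept.
  paste(tokens[!drop], collapse = " ")
}

someText <- "Here's some text containing words we want to remove."
result <- remove_words(someText, c("some", "want"), verbose = FALSE)
print(result)
# [1] "Here's text containing words we to remove."
```

Nothing is removed unless the caller names it, which mirrors the reasoning above: the list of words is always the caller's choice.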

See Also

stopwords

Examples

## examples for character objects
someText <- "Here's some text containing words we want to remove."
removeFeatures(someText, stopwords("english", verbose=FALSE))
removeFeatures(someText, stopwords("SMART", verbose=FALSE))
removeFeatures(someText, c("some", "want"))
itText <- "Ecco alcuni di testo contenente le parole che vogliamo rimuovere."
removeFeatures(itText, stopwords("italian", verbose=FALSE))

## example for dfm objects
mydfm <- dfm(ukimmigTexts, verbose=FALSE)
removeFeatures(mydfm, stopwords("english", verbose=FALSE))

## example for collocations
(myCollocs <- collocations(inaugTexts[1:3], top=20))
removeFeatures(myCollocs, stopwords("english", verbose=FALSE))
