rtweet (version 0.2.0)

clean_tweets: clean_tweets

Description

Returns a list of cleaned words for each observation (tweet). Useful when analyzing tweet text.

Usage

clean_tweets(tweets, min = 0, stopwords = NULL, exclude_words = NULL)

Arguments

tweets
Character vector of tweet text. May also be a data frame or list object with a named "text" element containing the tweet text.
min
Numeric, minimum number of occurrences a word needs to be included in the returned object. By default, min = 0, so all words (except those excluded by stopwords and exclude_words) are returned. For example, to return only words mentioned at least 3 times, set min = 3.
stopwords
Character, words to exclude. By default, stopwords = NULL, in which case a generic list of stopwords is used.
exclude_words
Character, other words to exclude in addition to generic search terms.

Value

List object of top words for each observation.
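
To make the roles of min, stopwords, and exclude_words concrete, here is a minimal sketch of this kind of cleaning step in base R. It is not rtweet's implementation: the function name clean_tweets_sketch, the tiny default stopword list, and the exact tokenization rules (lowercasing, stripping URLs and punctuation, counting occurrences across all tweets) are assumptions made for illustration only.

```r
# Hypothetical helper for illustration; NOT the rtweet implementation.
clean_tweets_sketch <- function(tweets, min = 0, stopwords = NULL,
                                exclude_words = NULL) {
  # Accept a data frame or list with a "text" element, or a character vector
  if (is.list(tweets)) tweets <- tweets[["text"]]
  # Assumed small default stopword list (the real generic list is not shown here)
  if (is.null(stopwords)) {
    stopwords <- c("a", "an", "and", "the", "is", "of", "to", "in", "for")
  }
  drop <- tolower(c(stopwords, exclude_words))

  text <- tolower(tweets)
  text <- gsub("https?://[^[:space:]]+", " ", text)  # strip URLs
  text <- gsub("[^a-z0-9#@_[:space:]]", " ", text)   # strip punctuation
  words <- strsplit(trimws(text), "[[:space:]]+")
  # Drop empty tokens, stopwords, and explicitly excluded words
  words <- lapply(words, function(w) w[nzchar(w) & !w %in% drop])

  # Count occurrences across all tweets; keep words seen at least `min` times
  counts <- table(unlist(words))
  keep <- names(counts)[counts >= min]
  lapply(words, function(w) w[w %in% keep])
}

tweets <- c("Hello world, hello R!", "hello again https://example.com")
clean_tweets_sketch(tweets, min = 2)
clean_tweets_sketch(tweets, exclude_words = "hello")
```

In this sketch the min threshold is applied to counts pooled over all tweets, which is one plausible reading of the min argument; the package may count differently.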

Examples

## Not run: 
# # search for 1000 tweets mentioning Hillary Clinton
# hrc <- search_tweets(q = "hillaryclinton", count = 1000)
# 
# # lookup returned user_id values
# users <- lookup_users(hrc$user_id)
# users
# 
# # merge data objects
# dat <- dplyr::left_join(hrc, users, by = "user_id")
# dat
# 
# # clean tweet text for each user
# dat$words <- clean_tweets(dat, exclude_words = "hillaryclinton")
# dat$words
# ## End(Not run)
