Usage
extract.tweets(set, string = NULL, size = 0, fields = c("created_at", "user.screen_name", "text"), retweets = NULL, hashtags = NULL, from = NULL, to = NULL, user_id = NULL, screen_name = NULL, verbose = TRUE)
Arguments
set
string, name of the collection of tweets in the Mongo database to query.
string
string or vector of strings, set to NULL by default (will
return all tweets). If it is a string, it will return tweets that contain
that string. If it is a vector of strings, it will
return all tweets that contain at least one of them.
size
numeric, set to 0 by default (will return all tweets that match
other conditions). If it is between 0 and 1 (not included), it will return that
proportion of the tweets that match the other conditions (e.g. 0.5 implies 50%
of all matching tweets will be returned). If it is 1 or greater, it will return
a random sample of that size from the tweets that match the specified conditions.
fields
vector of strings, indicates fields from tweets that will be
returned. Default is the date and time of the tweet, its text, and the screen
name of the user that published it. See details for full list of possible fields.
retweets
logical, set to NULL by default (will return all tweets).
If TRUE, will return only tweets that are retweets (i.e. contain an embedded
retweeted status; manual retweets are not included). If FALSE, will return
only tweets that are not retweets (manual retweets are included in this case).
hashtags
logical, set to NULL by default (will return all tweets).
If TRUE, will return only tweets that use a hashtag. If FALSE, will
return only tweets that do not contain a hashtag.
from
date, in string format. If different from NULL, will
consider only tweets after that date. Note that using this field requires that
the tweets have a field in ISODate format called timestamp. All times are GMT.
to
date, in string format. If different from NULL, will
consider only tweets before that date. Note that using this field requires that
the tweets have a field in ISODate format called timestamp. All times are GMT.
user_id
vector of numeric IDs for users. If different from NULL, will return
only tweets sent by that set of Twitter users (if there are any in the collection).
screen_name
screen name of a user. If different from NULL, will return
only tweets sent by that Twitter user (if there are any in the collection).
verbose
logical, default is TRUE, which generates some output to the
R console with information about the count of tweets.
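
The arguments above can be combined in a single call. The following is a minimal sketch, not taken from the package itself. The helper expected_sample_size is hypothetical and only restates the documented rules for size; how a proportion is rounded, and what happens when size exceeds the number of matching tweets, are assumptions. The collection name "tweets" and the search strings are also made up for illustration, and the call at the end assumes that the package providing extract.tweets is loaded and that its Mongo connection has already been set up.

```r
# Hypothetical helper (not part of the package): restates the documented
# rules for `size` as the number of tweets one would expect back, given
# `n_matching` tweets that satisfy the other conditions.
expected_sample_size <- function(size, n_matching) {
  if (size == 0) {
    return(n_matching)                # 0: every matching tweet
  }
  if (size < 1) {
    return(round(size * n_matching))  # proportion; rounding is an assumption
  }
  min(size, n_matching)               # fixed-size sample; the cap is an assumption
}

# Arguments for a query that asks for a 10% sample of the non-retweets
# mentioning either search string, with the default fields.
query_args <- list(
  set = "tweets",                     # hypothetical collection name
  string = c("obama", "romney"),      # hypothetical search strings
  size = 0.10,
  fields = c("created_at", "user.screen_name", "text"),
  retweets = FALSE,
  verbose = TRUE
)

# Run the query only if extract.tweets is available in this session.
if (exists("extract.tweets", mode = "function")) {
  tweets <- do.call(extract.tweets, query_args)
}
```

Building the argument list first keeps the query easy to inspect and reuse; passing the same arguments directly to extract.tweets is equivalent.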