streamR (version 0.2.1)

parseTweets: Converts tweets in JSON format to a data frame.

Description

This function parses tweets downloaded using filterStream, sampleStream or userStream and returns a data frame.

Usage

parseTweets(tweets, simplify = FALSE, verbose = TRUE)

Arguments

tweets

A character string naming the file where tweets are stored or the name of the object in memory where the tweets were saved as strings.

simplify

logical, default is FALSE. If TRUE, the function returns a data frame with only tweet and user fields (i.e., no geographic information or URL entities).

verbose

logical, default is TRUE. If TRUE, the number of tweets that have been parsed is printed to the console.

Details

parseTweets parses tweets downloaded using the filterStream, sampleStream or userStream functions and returns a data frame where each row corresponds to one tweet and each column represents a different field for each tweet (id, text, created_at, etc.).

The total number of tweets that are parsed might be lower than the number of lines in the file or object that contains the tweets because blank lines, deletion notices, and incomplete tweets are ignored.
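As a rough illustration of why the row count can fall short of the line count, the base R sketch below classifies a few raw lines before parsing. The sample lines are hypothetical, and the three checks (blank, deletion notice, truncated object) are assumptions about what streaming output typically looks like; they are not a description of how parseTweets works internally.

```r
# Hypothetical raw streaming output: two tweets, a blank line,
# a deletion notice, and a line cut off mid-object.
raw_lines <- c(
  '{"id_str":"1","text":"first tweet"}',
  '',
  '{"delete":{"status":{"id_str":"2","user_id_str":"9"}}}',
  '{"id_str":"3","text":"second tweet"}',
  '{"id_str":"4","text":"cut off'
)

# Lines that are empty or whitespace only.
is_blank <- !nzchar(trimws(raw_lines))

# Deletion notices are assumed to start with a "delete" key.
is_deletion <- startsWith(raw_lines, '{"delete"')

# Crude completeness heuristic: a full JSON object ends with "}".
is_truncated <- !is_blank & !endsWith(trimws(raw_lines), "}")

skipped <- is_blank | is_deletion | is_truncated

# Number of lines that look like complete tweets.
n_expected <- sum(!skipped)
n_expected
```

Comparing a count like n_expected with nrow() of the parsed data frame can help confirm that a shortfall comes from skipped lines rather than from a problem with the file.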

To parse JSON into a list instead of a data frame, see readTweets. That function can be significantly faster for large files when only a few fields are required.
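A minimal sketch of the readTweets route, assuming the streamR package is installed and using the bundled example_tweets object. It extracts a single field (text) from each parsed tweet; the guard with requireNamespace is only there so the snippet does nothing when the package is missing.

```r
# Assumes streamR is installed; the block is skipped otherwise.
if (requireNamespace("streamR", quietly = TRUE)) {
  library(streamR)
  data(example_tweets)

  # Parse the JSON strings into a list with one element per tweet.
  tweets.list <- readTweets(example_tweets, verbose = FALSE)

  # Pull out only the text field from each tweet.
  only.text <- unlist(lapply(tweets.list, `[[`, "text"))
  head(only.text, 3)
}
```

When only one or two fields are needed, working from the list this way avoids building every column of the full data frame.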

See Also

filterStream, sampleStream, userStream

Examples

# NOT RUN {
## The dataset example_tweets contains 10 public statuses published
## by @twitterapi in plain text format. The code below converts the object
## into a data frame that can be manipulated by other functions.

data(example_tweets)
tweets.df <- parseTweets(example_tweets, simplify=TRUE)

# }
# NOT RUN {
## A more complete example that shows how to capture a user's home timeline
## for one hour, authenticating via OAuth, and then parse the tweets
## into a data frame.

 library(ROAuth)
 reqURL <- "https://api.twitter.com/oauth/request_token"
 accessURL <- "https://api.twitter.com/oauth/access_token"
 authURL <- "https://api.twitter.com/oauth/authorize"
 consumerKey <- "xxxxxyyyyyzzzzzz"
 consumerSecret <- "xxxxxxyyyyyzzzzzzz111111222222"
 my_oauth <- OAuthFactory$new(consumerKey=consumerKey,
                              consumerSecret=consumerSecret,
                              requestURL=reqURL,
                              accessURL=accessURL,
                              authURL=authURL)
 my_oauth$handshake()
 userStream( file="my_timeline.json", with="followings",
         timeout=3600, oauth=my_oauth )
 tweets.df <- parseTweets("my_timeline.json")
# }
