kotzias_reviews: Kotzias Reviews
Description
A dataset containing a list of 3 review data sets. Each data set contains
sentences with a positive (1) or negative (-1) review taken from reviews of
products, movies, & restaurants. The data, compiled by Kotzias, Denil, De Freitas,
& Smyth (2015), was originally taken from amazon.com, imdb.com, & yelp.com.
Kotzias et al. (2015) provide the following description in the README:
"For each website, there exist 500 positive and 500 negative sentences. Those
were selected randomly for larger datasets of reviews. We attempted to select
sentences that have a clearly positive or negative connotaton [sic], the goal
was for no neutral sentences to be selected."
This data set has been modified from the original: each text element has been
split apart into sentences (sentence split), and the original 0/1 rating has
been converted to -1/1. Please cite Kotzias et al. (2015) if you reuse the data here.
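The 0/1 to -1/1 conversion mentioned above can be illustrated with a short sketch. The input vector is hypothetical, and the linear rescaling is only one way to do the mapping; it is not necessarily the code that was used to build the data set.

```r
# Hypothetical labels in the original 0/1 coding (not taken from the data).
original <- c(0, 1, 1, 0)

# Linear rescaling: 0 maps to -1 and 1 maps to 1.
rating <- 2 * original - 1
rating
```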
Usage
data(kotzias_reviews)
Format
A list with 3 elements
Details
Each data set is a data frame with the following columns:
- text. The sentences from the review.
- rating. A human scoring of the text.
- element_id. An index for the original text element (row number).
- sentence_id. A sentence number from 1-n within each element_id.
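As a minimal sketch of how element_id and sentence_id relate to the original text elements, the example below rebuilds each element from its sentences. The `toy` data frame is a hypothetical stand-in with the same four columns, not actual rows from kotzias_reviews, and the code assumes the rating is repeated for every sentence of an element.

```r
# Toy stand-in for one data set in the list (hypothetical rows).
toy <- data.frame(
  text = c("Great phone.", "Battery lasts all day.", "Do not buy."),
  rating = c(1, 1, -1),
  element_id = c(1L, 1L, 2L),
  sentence_id = c(1L, 2L, 1L),
  stringsAsFactors = FALSE
)

# Sort by element and sentence so sentences are pasted in reading order.
toy <- toy[order(toy$element_id, toy$sentence_id), ]

# Rebuild each original text element from its sentences.
rebuilt <- tapply(toy$text, toy$element_id, paste, collapse = " ")

# Take one rating per element (assumed constant within an element).
element_rating <- tapply(toy$rating, toy$element_id, function(x) x[1])

rebuilt
element_rating
```

The same steps should apply to any one of the data frames in the list once the data has been loaded with data(kotzias_reviews).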