textir (version 1.8-6)

we8there: On-Line Restaurant Reviews

Description

Counts for 2804 bigrams in 6175 restaurant reviews from the site www.we8there.com.

Value

  • we8thereCounts: A simple_triplet_matrix of phrase counts indexed by review-rows and bigram-columns.
  • we8thereRatings: A matrix containing the associated review ratings.
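
As a rough illustration of the review-by-bigram layout described above, the sketch below builds a small simple_triplet_matrix by hand with the slam package and summarizes it. The toy counts and the row and column labels are hypothetical and are not taken from we8there; the sketch only assumes that slam is installed.

```r
library(slam)  # provides the simple_triplet_matrix class

# Hypothetical toy data: 3 reviews (rows) by 4 bigrams (columns).
# Only non-zero counts are stored, as (row, column, value) triplets.
toy <- simple_triplet_matrix(
  i = c(1, 1, 2, 3, 3),   # review (row) index of each non-zero count
  j = c(1, 3, 2, 1, 4),   # bigram (column) index of each non-zero count
  v = c(2, 1, 1, 3, 1),   # the counts themselves
  nrow = 3, ncol = 4,
  dimnames = list(
    c("review1", "review2", "review3"),
    c("great food", "poor service", "good value", "nice atmosphere")
  )
)

dim(toy)                         # number of reviews, number of bigrams
review_totals <- row_sums(toy)   # total bigram count per review
bigram_totals <- col_sums(toy)   # total occurrences of each bigram
dense <- as.matrix(toy)          # dense copy; only sensible for small matrices
```

The same calls (dim, row_sums, col_sums) should apply to we8thereCounts itself; converting the full matrix with as.matrix is possible but gives up the memory savings of the sparse format.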

Details

Each short user-submitted review is accompanied by five-star ratings on four specific aspects of restaurant quality (food, service, value, and atmosphere) as well as on the overall experience. The reviews originally appeared in Maua and Cozman (2009), and the parsing details behind these specific counts are in Taddy (2011).

References

Maua, D.D. and Cozman, F.G. (2009), Representing and classifying user reviews. In ENIA '09: VIII Encontro Nacional de Inteligencia Artificial, Brazil.

Taddy (2012), Multinomial Inverse Regression for Text Analysis. http://arxiv.org/abs/1012.2098

Taddy (2012), On Estimation and Selection for Topic Models. http://arxiv.org/abs/1109.4518

See Also

pls, mnlm, congress109

Examples

data(we8there)

## use bins to estimate with counts collapsed across equal ratings 1...5
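## index the column by name, which works whether we8thereRatings is a matrix or a data frame
overall <- we8thereRatings[, "Overall"]
fitwe8 <- mnlm(we8thereCounts, overall, bins=5)
summary(fitwe8)
plot(fitwe8, type="reduction", v=as.factor(overall), col=c(2,2,2,3,3))

## Fit a topic model (use lower tol for true convergence)
tpx <- topics(we8thereCounts, K=10, tol=100)
## group reviews into "Bad" (overall rating 3 or below) and "Good" (above 3)
plot(tpx, group=we8thereRatings[, "Overall"]>3, col=c(2,3), labels=c("Bad","Good"))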
summary(tpx)
