knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The boxoffice() function scrapes daily box office results for movies from either http://www.boxofficemojo.com or https://www.the-numbers.com/. It returns the following data:
- Movie name
- The studio that produced that movie
- The daily gross
- Daily percent change in gross
- Number of theaters it is playing in
- Average gross per theater (the daily gross divided by the number of theaters)
- How many days the movie has been playing
- The date of the data
In essence, it shows how well each movie performed on a given day.
movies <- boxoffice::boxoffice(dates = as.Date("2015-10-31"))
dim(movies)
movies[1:5, ]
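The average gross per theater in the list above is simply the daily gross divided by the number of theaters. Here is a minimal sketch of that arithmetic on a made-up data frame. The column names `gross` and `theaters` and all values are placeholders chosen for illustration, and may not match what boxoffice() actually returns.

```r
# Made-up values; the column names are hypothetical stand-ins.
example <- data.frame(
  movie    = c("Movie A", "Movie B"),
  gross    = c(1200000, 450000),
  theaters = c(3000, 1500)
)

# Average gross per theater = daily gross / number of theaters
example$avg_per_theater <- example$gross / example$theaters
example
```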
There are three parameters for boxoffice(): dates, site, and top_n.
dates is the date or dates (in Date format) that you want information on. It accepts either a single date or a vector of dates.
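A vector of dates can be built with base R before it is passed in. The sketch below creates three consecutive days. The variable name `day_range` is only illustrative, and the scraping call is left commented out because it needs network access.

```r
# Three consecutive days starting on 2015-10-30
day_range <- seq(as.Date("2015-10-30"), by = "day", length.out = 3)
day_range
class(day_range)  # the vector keeps the Date class

# Not run here because it requires network access:
# movies <- boxoffice::boxoffice(dates = day_range)
```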
site indicates which site you want to scrape: the-numbers.com or boxofficemojo.com. The accepted inputs are "numbers", which is the default, or "mojo". The two sites are very similar and provide nearly identical results. All results are sorted in descending order of how much each movie made that day: the top-selling movie of the day is the first row and the worst-selling movie is the last.
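To illustrate the descending order described above, here is a small sketch on made-up data. The `gross` column name and the values are assumptions for illustration only. It checks whether rows are already sorted from highest to lowest gross, and shows how the same ordering could be restored if they were not.

```r
# Made-up data standing in for one day's results; `gross` is a hypothetical column name.
results <- data.frame(
  movie = c("Movie A", "Movie B", "Movie C"),
  gross = c(900000, 400000, 150000)
)

# TRUE when gross never increases from one row to the next
is_descending <- !is.unsorted(rev(results$gross))
is_descending

# The same ordering can be imposed on rows that arrive out of order
shuffled <- results[c(2, 3, 1), ]
resorted <- shuffled[order(shuffled$gross, decreasing = TRUE), ]
resorted
```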
Here are the first 10 movie names from both sites. We will use the top_n parameter to return only the 10 top-selling movies.
mojo <- boxoffice::boxoffice(dates = as.Date("2015-10-31"), site = "mojo", top_n = 10)
numbers <- boxoffice::boxoffice(dates = as.Date("2015-10-31"), site = "numbers", top_n = 10)
cbind(mojo[, c(1, 3)], numbers[, c(1, 3)])
The results are close. Some movie name spellings and numbers are slightly different. In this case, the 10th-ranked movie also differs between the sites. Situations like this are rare. For more recent releases (e.g. within the last two weeks), there will be more differences, most of which disappear as time goes on.
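One way to see which titles appear on only one site is to compare the name columns with setdiff(). The sketch below uses placeholder names rather than real results. With actual data, the two vectors would come from the movie-name column of each data frame.

```r
# Placeholder names; real titles would come from the movie-name column of each result.
mojo_names    <- c("Movie A", "Movie B", "Movie C")
numbers_names <- c("Movie A", "Movie B", "Movie D")

# Titles listed by one site but not the other
only_mojo    <- setdiff(mojo_names, numbers_names)
only_numbers <- setdiff(numbers_names, mojo_names)

only_mojo
only_numbers
```

Because setdiff() matches strings exactly, a title that is merely spelled differently on the two sites will also show up as a difference.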