evtree (version 1.0-8)

BBBClub: Bookbinder's Book Club

Description

Marketing case study about a (fictitious) American book club to whose customers a book about “The Art History of Florence” was advertised.

Usage

data("BBBClub")

Format

A data frame containing 1,300 observations on 11 variables.

choice

factor. Did the customer buy the advertised book?

gender

factor indicating gender.

amount

total amount of money spent at the BBB Club.

freq

number of books purchased at the BBB Club.

last

number of months since the last purchase.

first

number of months since the first purchase.

child

number of children's books purchased.

youth

number of youth books purchased.

cook

number of cookbooks purchased.

diy

number of do-it-yourself books purchased.

art

number of art books purchased.

Details

The data come from a marketing case study about a (fictitious) American book club, taken from the Marketing Engineering textbook of Lilien and Rangaswamy (2004). In this case study, a brochure for the book “The Art History of Florence” was sent to 20,000 customers, 1,806 of whom bought the book. A subsample of 1,300 customers is provided in BBBClub for building a predictive model for choice.

The use of a cost matrix is suggested for this dataset: classifying a customer who purchased the book as a non-buyer is worse (cost = 5) than classifying a customer who did not purchase the book as a buyer (cost = 1).
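
As a minimal sketch of how the suggested costs could be applied, the base R code below builds the cost matrix and computes a cost-weighted error for a handful of made-up labels. The level names "no" and "yes", the toy vectors observed and predicted, and the object names are assumptions for illustration only; they are not part of the dataset documentation or of the evtree interface.

```r
## cost matrix: rows = observed class, columns = predicted class
## (level names "no"/"yes" are assumed for illustration)
costs <- matrix(c(0, 5, 1, 0), nrow = 2,
  dimnames = list(observed = c("no", "yes"), predicted = c("no", "yes")))

## hypothetical observed and predicted labels (not taken from BBBClub)
observed  <- factor(c("yes", "yes", "no", "no",  "no", "yes"), levels = c("no", "yes"))
predicted <- factor(c("yes", "no",  "no", "yes", "no", "no"),  levels = c("no", "yes"))

## confusion matrix with the same layout as the cost matrix
conf <- table(observed = observed, predicted = predicted)

## total and average misclassification cost
total_cost <- sum(conf * costs)
mean_cost  <- total_cost / sum(conf)
```

With real predictions, observed would be BBBClub$choice and predicted the output of predict() for a fitted tree, provided the factor levels are in the same order as the rows and columns of costs.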

References

Lilien GL, Rangaswamy A (2004). Marketing Engineering: Computer-Assisted Marketing Analysis and Planning, 2nd edition. Victoria, BC: Trafford Publishing.

Examples

# NOT RUN {
## data, packages, random seed
data("BBBClub", package = "evtree")
library("evtree")  # provides evtree(); attaches partykit for as.party(), ctree(), width()
library("rpart")
suppressWarnings(RNGversion("3.5.0"))
set.seed(1090)

## learn trees
ev <- evtree(choice ~ ., data = BBBClub, minbucket = 10, maxdepth = 2)
rp <- as.party(rpart(choice ~ ., data = BBBClub, minbucket = 10, model = TRUE))
ct <- ctree(choice ~ ., data = BBBClub, minbucket = 10, mincrit = 0.99)

## visualization
plot(ev)
plot(rp)
plot(ct)

## accuracy: misclassification rate
mc <- function(obj) 1 - mean(predict(obj) == BBBClub$choice)
c("evtree" = mc(ev), "rpart" = mc(rp), "ctree" = mc(ct))

## complexity: number of terminal nodes
c("evtree" = width(ev), "rpart" = width(rp), "ctree" = width(ct))

## compare structure of predictions
ftable(tab <- table(evtree = predict(ev), rpart  = predict(rp),
  ctree  = predict(ct), observed = BBBClub$choice))

## compare predicted buyers only (absolute number, percentage correct)
sapply(c("evtree", "rpart", "ctree"), function(nam) {
  mt <- margin.table(tab, c(match(nam, names(dimnames(tab))), 4))
  c(abs = as.vector(rowSums(mt))[2],
    rel = round(100 * prop.table(mt, 1)[2, 2], digits = 3))
})
# }