liver (version 1.29)

purchase_intention: Online Shopper Purchase Intention Data

Description

A dataset containing session-level information from an e-commerce website, including page visit counts, time spent in different page categories, Google Analytics metrics, visitor characteristics, and a binary outcome indicating whether the session ended in a purchase. The dataset can be used to illustrate binary classification, exploratory data analysis, model comparison, and supervised learning methods in R.

Usage

data(purchase_intention)

Format

A data frame with 12330 observations and 18 variables:

administrative

Number of administrative pages visited during the session.

administrative_duration

Total time spent on administrative pages during the session.

informational

Number of informational pages visited during the session.

informational_duration

Total time spent on informational pages during the session.

product_related

Number of product-related pages visited during the session.

product_related_duration

Total time spent on product-related pages during the session.

bounce_rates

Average bounce rate associated with the visited pages.

exit_rates

Average exit rate associated with the visited pages.

page_values

Average page value for pages visited before a completed transaction.

special_day

Closeness of the session date to a special shopping day, scaled between 0 and 1.

month

Month of the session.

operating_systems

Visitor operating system, recorded as a categorical factor.

browser

Visitor browser, recorded as a categorical factor.

region

Visitor region, recorded as a categorical factor.

traffic_type

Traffic source type, recorded as a categorical factor.

visitor_type

Visitor type: "New_Visitor", "Returning_Visitor", or "Other".

weekend

Whether the session occurred on a weekend: "no" or "yes".

revenue

Whether the session ended in a purchase: "no" or "yes".

Details

This dataset was obtained from the UCI Machine Learning Repository and renamed purchase_intention for inclusion in the liver package. It contains session-level records from an online shopping website and is well suited to illustrating binary classification, where the goal is to predict whether a browsing session will end in a purchase.

The predictors combine behavioral measures such as page visit counts and time spent on different types of pages with summary metrics such as bounce_rates, exit_rates, and page_values, as well as visitor and session characteristics including month, visitor_type, traffic_type, and weekend. The outcome variable revenue indicates whether the session resulted in a completed transaction.
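As a quick look at how the outcome relates to a visitor characteristic, the sketch below tabulates the class balance of revenue and the share of purchasing sessions per visitor_type. It assumes the liver package is installed and that revenue takes the values "no" and "yes" as listed under Format; the object name purchase_rate is illustrative only.

```r
# Sketch only: assumes the liver package is installed.
library(liver)
data(purchase_intention)

# Class balance of the outcome (proportion of "no" vs "yes" sessions)
prop.table(table(purchase_intention$revenue))

# Share of sessions that ended in a purchase, by visitor type.
# The comparison revenue == "yes" works for both factor and character columns.
purchase_rate <- tapply(purchase_intention$revenue == "yes",
                        purchase_intention$visitor_type,
                        mean)
round(purchase_rate, 3)
```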

The dataset is particularly useful for demonstrating classification workflows such as partitioning data into training and test sets, fitting logistic regression, Naive Bayes, k-nearest neighbors, and tree-based models, and evaluating predictive performance using confusion matrices, ROC curves, and AUC.
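The workflow described above can be sketched in base R as follows. This is a minimal illustration, not a recommended model: the 80/20 split, the seed, the three predictors, and the 0.5 cutoff are arbitrary choices, and the helper column purchased is a hypothetical name introduced here. It assumes the liver package is installed.

```r
# Sketch only: assumes the liver package is installed.
library(liver)
data(purchase_intention)

set.seed(42)  # arbitrary seed, used only to make the split reproducible

# Hypothetical helper column: 1 if the session ended in a purchase, else 0
purchase_intention$purchased <- as.integer(purchase_intention$revenue == "yes")

# Partition into training (80%) and test (20%) sets
n <- nrow(purchase_intention)
train_idx <- sample(n, size = round(0.8 * n))
train <- purchase_intention[train_idx, ]
test  <- purchase_intention[-train_idx, ]

# Logistic regression on a few numeric predictors (illustrative selection)
fit <- glm(purchased ~ page_values + exit_rates + product_related_duration,
           data = train, family = binomial)

# Predicted purchase probabilities on the test set, cut at 0.5
prob <- predict(fit, newdata = test, type = "response")
pred <- factor(ifelse(prob > 0.5, "yes", "no"), levels = c("no", "yes"))

# Confusion matrix and overall accuracy
actual <- factor(test$revenue, levels = c("no", "yes"))
conf_mat <- table(actual = actual, predicted = pred)
conf_mat
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
accuracy
```

Because purchases are the minority class, overall accuracy alone can be misleading; the confusion matrix shows how many purchasing sessions the model actually recovers.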

References

Sakar, C. O., Polat, S. O., Katircioglu, M., and Kastro, Y. (2019). Real-time prediction of online shoppers' purchasing intention using multilayer perceptron and LSTM recurrent neural networks. Neural Computing and Applications, 31, 6893-6908. doi:10.1007/s00521-018-3523-0

Mohammadi, R. (2025). Data Science Foundations and Machine Learning with R: From Data to Decisions. https://book-data-science-r.netlify.app

See Also

mortgage, bank, churn_mlc, churn, churn_tel, adult, cereal, advertising, marketing, drug, house, house_price, red_wines, white_wines, insurance, caravan, loan

Examples

data(purchase_intention)

str(purchase_intention)

summary(purchase_intention)