
liver (version 1.29)

creditcard_fraud: Credit Card Transactions for Fraud Detection

Description

A dataset containing credit card transactions for illustrating fraud detection and class imbalance in binary classification. The data include anonymized predictors derived from a principal component analysis, together with transaction time, transaction amount, and a binary fraud indicator.

Usage

data(creditcard_fraud)

Format

A data frame with 10000 observations and 31 variables:

Time

Seconds elapsed between each transaction and the first transaction in the dataset.

V1

Anonymized predictor obtained from a PCA transformation of the original variables.

V2

Anonymized predictor obtained from a PCA transformation of the original variables.

V3

Anonymized predictor obtained from a PCA transformation of the original variables.

V4

Anonymized predictor obtained from a PCA transformation of the original variables.

V5

Anonymized predictor obtained from a PCA transformation of the original variables.

V6

Anonymized predictor obtained from a PCA transformation of the original variables.

V7

Anonymized predictor obtained from a PCA transformation of the original variables.

V8

Anonymized predictor obtained from a PCA transformation of the original variables.

V9

Anonymized predictor obtained from a PCA transformation of the original variables.

V10

Anonymized predictor obtained from a PCA transformation of the original variables.

V11

Anonymized predictor obtained from a PCA transformation of the original variables.

V12

Anonymized predictor obtained from a PCA transformation of the original variables.

V13

Anonymized predictor obtained from a PCA transformation of the original variables.

V14

Anonymized predictor obtained from a PCA transformation of the original variables.

V15

Anonymized predictor obtained from a PCA transformation of the original variables.

V16

Anonymized predictor obtained from a PCA transformation of the original variables.

V17

Anonymized predictor obtained from a PCA transformation of the original variables.

V18

Anonymized predictor obtained from a PCA transformation of the original variables.

V19

Anonymized predictor obtained from a PCA transformation of the original variables.

V20

Anonymized predictor obtained from a PCA transformation of the original variables.

V21

Anonymized predictor obtained from a PCA transformation of the original variables.

V22

Anonymized predictor obtained from a PCA transformation of the original variables.

V23

Anonymized predictor obtained from a PCA transformation of the original variables.

V24

Anonymized predictor obtained from a PCA transformation of the original variables.

V25

Anonymized predictor obtained from a PCA transformation of the original variables.

V26

Anonymized predictor obtained from a PCA transformation of the original variables.

V27

Anonymized predictor obtained from a PCA transformation of the original variables.

V28

Anonymized predictor obtained from a PCA transformation of the original variables.

Amount

Transaction amount.

Class

Fraud indicator: 0 for non-fraudulent transactions and 1 for fraudulent transactions.

Details

This dataset is a teaching subset derived from the original Credit Card Fraud Detection dataset available on Kaggle. The original dataset is highly imbalanced: fraudulent transactions make up only a small fraction of all records. For inclusion in the liver package, we created a smaller subset of 10000 observations that retains all fraud cases together with a random sample of non-fraud cases, so the fraud rate in this subset is higher than in the original data. This version is intended for illustrating class imbalance, resampling strategies, and model evaluation in binary classification.
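
As a rough sketch of the imbalance check and resampling this subset is meant to illustrate, the base R code below measures the fraud rate and then randomly undersamples the majority class. It uses a small simulated data frame (toy, a hypothetical stand-in with invented class counts) so that it runs without the package; with liver installed, creditcard_fraud could be substituted for toy. The helper name undersample_majority is an assumption for illustration and is not part of liver.

```r
# Simulated stand-in for creditcard_fraud: class counts are invented for illustration.
set.seed(42)
n_legit <- 950
n_fraud <- 50
toy <- data.frame(
  Amount = round(rexp(n_legit + n_fraud, rate = 1 / 80), 2),
  Class  = c(rep(0L, n_legit), rep(1L, n_fraud))
)

# Share of each class: shows how rare the positive (fraud) class is.
class_share <- prop.table(table(toy$Class))
print(class_share)

# Hypothetical helper: keep every minority row plus an equal-sized random
# sample of majority rows (random undersampling).
undersample_majority <- function(df, target = "Class") {
  counts   <- table(df[[target]])
  minority <- names(counts)[which.min(counts)]
  n_min    <- min(counts)
  idx_min  <- which(df[[target]] == minority)
  idx_maj  <- which(df[[target]] != minority)
  idx_keep <- c(idx_min, sample(idx_maj, n_min))
  df[sort(idx_keep), , drop = FALSE]
}

balanced <- undersample_majority(toy)
print(table(balanced$Class))
```

Undersampling discards majority-class rows, so it is usually applied to the training split only, leaving the test split at its original class proportions.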

References

Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson, and Gianluca Bontempi (2015). Calibrating Probability with Undersampling for Unbalanced Classification. In 2015 IEEE Symposium Series on Computational Intelligence.

Reza Mohammadi (2025). Data Science Foundations and Machine Learning with R: From Data to Decisions. https://book-data-science-r.netlify.app.

See Also

mortgage, bank, churn_mlc, churn, churn_tel, adult, cereal, advertising, marketing, drug, house, house_price, red_wines, white_wines, insurance, caravan, loan

Examples

data(creditcard_fraud)
str(creditcard_fraud)

table(creditcard_fraud$Class)
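
The class table above is a natural starting point for model evaluation. The following sketch, which uses simulated labels rather than the packaged data (the counts are invented), suggests why accuracy alone can mislead on imbalanced classes: a rule that always predicts non-fraud scores high on accuracy while detecting no fraud at all.

```r
# Simulated labels standing in for creditcard_fraud$Class (counts are invented).
actual <- factor(c(rep(0, 950), rep(1, 50)), levels = c(0, 1))

# A trivial "classifier" that always predicts the majority class (non-fraud).
predicted <- factor(rep(0, length(actual)), levels = c(0, 1))

# Confusion matrix: rows are predictions, columns are true labels.
conf_mat <- table(Predicted = predicted, Actual = actual)
print(conf_mat)

# Accuracy: share of all predictions that are correct.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)

# Recall for the fraud class: share of true fraud cases that were flagged.
recall <- conf_mat["1", "1"] / sum(conf_mat[, "1"])

cat("Accuracy:", accuracy, "\n")
cat("Recall (fraud):", recall, "\n")
```

Metrics that focus on the minority class, such as recall and precision, are generally more informative than accuracy in this setting.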
