scorecard

This R package makes the development of credit risk scorecard easily and efficiently by providing functions as follows:

information value (iv),
variable filter (var_filter),
optimal woe binning (woebin, woebin_ply, woebin_plot),
scorecard scaling (scorecard, scorecard_ply)
and performace evaluation (perf_plot, perf_psi).

Installation

Install the release version of scorecard from CRAN with:

install.packages("scorecard")

Install the latest version of scorecard from github with:

# install.packages("devtools")
devtools::install_github("shichenxie/scorecard")

Example

This is a basic example which shows you how to develop a common credit risk scorecard:

# Traditional Credit Scoring Using Logistic Regression
library(data.table)
library(scorecard)

# load germancredit data
data("germancredit")

# rename creditability as y
dt <- setDT(germancredit)[, `:=`(
  y = ifelse(creditability == "bad", 1, 0),
  creditability = NULL
)]

# breaking dt into train and test ------
set.seed(125)
dt <- dt[sample(nrow(dt))]
# rowname of train
set.seed(345)
rn <- sample(nrow(dt), nrow(dt)*0.6)
# train and test dt
dt_train <- dt[rn]; dt_test <- dt[-rn];

# woe binning ------
bins <- woebin(dt_train, "y")
# woebin_plot(bins)

# converting train and test into woe values
train <- woebin_ply(dt_train, bins)
test <- woebin_ply(dt_test, bins)

# glm ------
m1 <- glm( y ~ ., family = "binomial", data = train)
# summary(m1)

# Select a formula-based model by AIC
m_step <- step(m1, direction="both")
m2 <- eval(m_step$call)
# summary(m2)

# performance ------
# predicted proability
train_pred <- predict(m2, type='response', train)
test_pred <- predict(m2, type='response', test)

# ks & roc plot
perf_plot(train$y, train_pred, title = "train")
perf_plot(train$y, train_pred, title = "test")

# score
card <- scorecard(bins, m2)

# credit score, only_total_score = TRUE
train_score <- scorecard_ply(dt_train, card)
test_score <- scorecard_ply(dt_test, card)

# psi
perf_psi(
  score = list(train = train_score, test = test_score),
  label = list(train = train[,"y"], test = test[, "y"]),
  x_limits = c(150, 750),
  x_tick_break = 50
  )


# Session info ------
#  setting  value                       
#  version  R version 3.4.1 (2017-06-30)
#  system   x86_64, darwin15.6.0        
#  ui       X11                         
#  language (EN)                        
#  collate  C                           
#  tz       Asia/Shanghai               
# 
# Packages ------
#  package    * version date       source        
#  base       * 3.4.1   2017-07-07 local         
#  colorspace   1.3-2   2016-12-14 CRAN (R 3.3.2)
#  compiler     3.4.1   2017-07-07 local         
#  data.table * 1.10.4  2017-02-01 CRAN (R 3.4.0)
#  datasets   * 3.4.1   2017-07-07 local         
#  devtools     1.13.3  2017-08-02 CRAN (R 3.4.1)
#  digest       0.6.12  2017-01-27 CRAN (R 3.3.2)
#  ggplot2      2.2.1   2016-12-30 CRAN (R 3.4.0)
#  graphics   * 3.4.1   2017-07-07 local         
#  grDevices  * 3.4.1   2017-07-07 local         
#  grid         3.4.1   2017-07-07 local         
#  gridExtra    2.3     2017-09-09 CRAN (R 3.4.1)
#  gtable       0.2.0   2016-02-26 CRAN (R 3.2.3)
#  lazyeval     0.2.0   2016-06-12 cran (@0.2.0) 
#  memoise      1.1.0   2017-04-21 CRAN (R 3.3.2)
#  methods    * 3.4.1   2017-07-07 local         
#  munsell      0.4.3   2016-02-13 CRAN (R 3.2.3)
#  plyr         1.8.4   2016-06-08 cran (@1.8.4) 
#  Rcpp         0.12.12 2017-07-15 CRAN (R 3.4.1)
#  rlang        0.1.2   2017-08-09 CRAN (R 3.4.1)
#  scales       0.5.0   2017-08-24 CRAN (R 3.4.1)
#  scorecard  * 0.1.0   2017-09-30 local         
#  stats      * 3.4.1   2017-07-07 local         
#  tibble       1.3.4   2017-08-22 CRAN (R 3.4.1)
#  utils      * 3.4.1   2017-07-07 local         
#  withr        2.0.0   2017-07-28 CRAN (R 3.4.1)

scorecard

Installation

Example

Copy Link

Version

Install

Monthly Downloads

Version

License

Issues

Pull Requests

Stars

Forks

Repository

Maintainer

Last Published

Functions in scorecard (0.1.2)