A classification task for the German credit data set. The aim is to predict creditworthiness, labeled as "good" and "bad". Positive class is set to label "good".
See the example for how to create a MeasureClassifCosts with the misclassification costs described for this data set.
R6::R6Class inheriting from TaskClassif.
This Task can be instantiated via the dictionary mlr_tasks or with the associated sugar function tsk():
mlr_tasks$get("german_credit")
tsk("german_credit")
Task type: “classif”
Dimensions: 1000x21
Properties: “twoclass”
Has Missings: FALSE
Target: “credit_risk”
Features: “age”, “amount”, “credit_history”, “duration”, “employment_duration”, “foreign_worker”, “housing”, “installment_rate”, “job”, “number_credits”, “other_debtors”, “other_installment_plans”, “people_liable”, “personal_status_sex”, “present_residence”, “property”, “purpose”, “savings”, “status”, “telephone”
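The properties listed above can be checked directly on the task object. The following is a minimal sketch, assuming the mlr3 package is installed; the variable names (dims, n_features, n_missing) are chosen here for illustration only.

```r
library(mlr3)

task = tsk("german_credit")

# Dimensions: number of rows, and number of columns (features plus target)
dims = c(task$nrow, task$ncol)

# Target column and the class treated as positive
target = task$target_names
positive = task$positive

# Number of features, and total count of missing values over all columns
n_features = length(task$feature_names)
n_missing = sum(task$missings())

print(dims)
print(target)
print(positive)
```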
Grömping U (2019). “South German Credit Data: Correcting a Widely Used Data Set.” Reports in Mathematics, Physics and Chemistry 4, Department II, Beuth University of Applied Sciences Berlin.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html
Package mlr3data for more toy tasks.
Package mlr3oml for downloading tasks from https://www.openml.org.
Package mlr3viz for some generic visualizations.
Dictionary of Tasks: mlr_tasks
as.data.table(mlr_tasks) for a table of available Tasks in the running session (depending on the loaded packages).
mlr3fselect and mlr3filters for feature selection and feature filtering.
Extension packages for additional task types:
Unsupervised clustering: mlr3cluster
Probabilistic supervised regression and survival analysis: https://mlr3proba.mlr-org.com/.
Other Task: Task, TaskClassif, TaskRegr, TaskSupervised, TaskUnsupervised, california_housing, mlr_tasks, mlr_tasks_breast_cancer, mlr_tasks_iris, mlr_tasks_mtcars, mlr_tasks_penguins, mlr_tasks_pima, mlr_tasks_sonar, mlr_tasks_spam, mlr_tasks_wine, mlr_tasks_zoo
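library(mlr3)
task = tsk("german_credit")
# Cost matrix: rows are predicted classes, columns are true classes.
# Assuming task$class_names is ordered c("good", "bad"): predicting "good"
# for a truly "bad" customer costs 5, the reverse error costs 1.
costs = matrix(c(0, 1, 5, 0), nrow = 2)
dimnames(costs) = list(predicted = task$class_names, truth = task$class_names)
measure = msr("classif.costs", id = "german_credit_costs", costs = costs)
print(measure)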
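To make the role of the cost matrix concrete, the sketch below weights a confusion matrix by the same costs in base R, without mlr3. The truth and response vectors are invented for illustration and are not drawn from the data set. Whether the library reports the total or the per-observation average depends on the measure's settings, so both are computed here.

```r
# Hypothetical labels, made up for illustration only
classes = c("good", "bad")
truth = factor(c("good", "good", "bad", "bad", "good", "bad"), levels = classes)
response = factor(c("good", "bad", "good", "bad", "good", "good"), levels = classes)

# Same cost values as in the example: rows = predicted, columns = truth
costs = matrix(c(0, 1, 5, 0), nrow = 2,
  dimnames = list(predicted = classes, truth = classes))

# Confusion matrix with the same orientation as the cost matrix
confusion = table(predicted = response, truth = truth)

# Element-wise product: each cell count multiplied by its cost
total_cost = sum(confusion * costs)

# Average cost per observation
avg_cost = total_cost / length(truth)

print(confusion)
print(total_cost)
print(avg_cost)
```

In this toy data, one "good" customer is predicted "bad" (cost 1) and two "bad" customers are predicted "good" (cost 5 each), so the total cost is 11.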