shatteringdt

Provide SLT Tools for 'rpart' and 'tree' to Study Decision Trees

Description

In Machine Learning (ML), learning is one of the most important steps in building algorithms that carry out a predictive task, whether that is classifying objects, forecasting demand for a specific product, or diagnosing malignant diseases. ML covers supervised algorithms, which learn from labeled data (e.g., a class), and unsupervised algorithms, which are used for tasks such as pattern detection and grouping that do not depend directly on a label. This work studies supervised classification algorithms, specifically Decision Trees, and analyzes the steps that make up their learning process. It draws on concepts from Statistical Learning Theory (SLT), which provide tools for such studies and make it possible to prove properties such as the learning guarantee of a given algorithm. Reference: Rodrigo Fernandes de Mello, Chaitanya Manapragada, Albert Bifet: "Measuring the Shattering coefficient of Decision Tree models". Expert Syst. Appl. 137: 443-452 (2019)

About this work

This project is the result of a Master's course at the Institute of Mathematics and Computer Sciences (ICMC) of the University of Sao Paulo (USP).

Install

install.packages('shatteringdt')

Monthly Downloads

5

Version

0.1.0

License

GPL-3

Maintainer

Igor Martinelli

Last Published

March 3rd, 2021

Functions in shatteringdt (0.1.0)

compute_shattering

Calculates the shattering coefficient for a decision tree.

chernoff_bound

Calculates the Chernoff bound simulations.

search_n_samples

Executes a binary search to find the best number of samples.

g

Applies the G(n) function.

search_delta_n_samples

Searches for the number of samples needed to ensure learning given an epsilon.

shattering_simulations

Calculates the shattering coefficient simulations.

compute_delta

Calculates the delta for a given number of samples and value of epsilon.

confidence_interval

Calculates the confidence interval for the dataset.

recurse

Calculates the shattering coefficient for a decision tree.
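
Several of the functions above relate a number of samples n, a tolerance epsilon, and a probability delta. As a rough illustration of that relationship, the sketch below uses the standard SLT generalization bound delta = 2 * N(F, 2n) * exp(-n * epsilon^2 / 4), where N(F, n) is the shattering coefficient. It does not call the package: the names shattering_poly, delta_bound, and search_min_n are hypothetical, the polynomial shattering coefficient is a placeholder chosen only for the example, and whether the package's own functions use exactly this bound or these arguments is an assumption.

```r
# Hypothetical placeholder for a shattering coefficient N(F, n).
# A polynomial growth function is assumed purely for illustration;
# it is not taken from the package or from the referenced paper.
shattering_poly <- function(n) n^2 + 1

# Upper bound on P(sup_f |R_emp(f) - R(f)| > epsilon):
#   delta = 2 * N(F, 2n) * exp(-n * epsilon^2 / 4)
# Evaluated in log space to avoid overflow, then capped at 1 because
# a probability bound above 1 carries no information.
delta_bound <- function(n, epsilon, shattering = shattering_poly) {
  log_delta <- log(2) + log(shattering(2 * n)) - n * epsilon^2 / 4
  min(1, exp(log_delta))
}

# Binary search for the smallest n in [lo, hi] with delta <= target_delta.
# Assumes target_delta < 1 and that the capped bound is non-increasing
# in n, which holds for the placeholder above when epsilon <= 1.
search_min_n <- function(epsilon, target_delta, lo = 1, hi = 1e9,
                         shattering = shattering_poly) {
  if (delta_bound(hi, epsilon, shattering) > target_delta) {
    stop("target delta is not reached within the search range")
  }
  while (lo < hi) {
    mid <- floor((lo + hi) / 2)
    if (delta_bound(mid, epsilon, shattering) <= target_delta) {
      hi <- mid       # mid is enough samples; try fewer
    } else {
      lo <- mid + 1   # mid is too few samples; try more
    }
  }
  lo
}

# Example: samples needed for epsilon = 0.1 with delta at most 0.05.
n_needed <- search_min_n(epsilon = 0.1, target_delta = 0.05)
print(n_needed)
print(delta_bound(n_needed, epsilon = 0.1))
```

Swapping shattering_poly for a shattering coefficient estimated from an actual decision tree changes only the shattering argument. The search stays valid as long as the bound eventually decreases in n, which requires the shattering coefficient to grow slower than exponentially.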