arulesCBA (version 1.2.6)

discretizeDF.supervised: Supervised Methods to Convert Continuous Variables into Categorical Variables

Description

This function implements several supervised methods to convert continuous variables into categorical variables (factors) suitable for association rule mining and building associative classifiers. A whole data.frame is discretized (i.e., all numeric columns are discretized).

Usage

discretizeDF.supervised(formula, data, method = "mdlp", dig.lab = 3, ...)

Value

discretizeDF.supervised() returns a discretized data.frame. Discretized columns have an attribute "discretized:breaks" indicating the breaks used and an attribute "discretized:method" giving the method used.

Arguments

formula

a formula object to specify the class variable for supervised discretization and the predictors to be discretized in the form class ~ . or class ~ predictor1 + predictor2.

data

a data.frame containing continuous variables to be discretized

method

discretization method. Available are: `"mdlp"`, `"caim"`, `"cacc"`, `"ameva"`, `"chi2"`, `"chimerge"`, `"extendedchi2"`, and `"modchi2"`.
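
As a minimal sketch of selecting a method, the following compares the default `"mdlp"` with `"caim"` on iris by counting the intervals each method creates per predictor. The object names (`iris.mdlp`, `iris.caim`, `predictors`) are illustrative only, and the interval counts depend on the installed versions of arulesCBA and discretization.

```r
library(arulesCBA)
data("iris")

# default method (MDLP)
iris.mdlp <- discretizeDF.supervised(Species ~ ., iris)

# CAIM, selected through the method argument
iris.caim <- discretizeDF.supervised(Species ~ ., iris, method = "caim")

# the numeric predictors; Species is the class variable
predictors <- setdiff(colnames(iris), "Species")

# number of intervals (factor levels) per predictor for each method
sapply(iris.mdlp[predictors], nlevels)
sapply(iris.caim[predictors], nlevels)
```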

dig.lab

integer; number of digits used to create labels.

...

Additional parameters are passed on to the implementation of the chosen discretization method.
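
A hedged sketch of passing an extra parameter: `alpha` is the significance level argument of `discretization::chiM`, and the example assumes that it is forwarded through `...` when `method = "chimerge"` is chosen, as described above. The object name `iris.chim` is illustrative.

```r
library(arulesCBA)
data("iris")

# ChiMerge with a non-default significance level.
# Assumption: alpha is handed on to discretization::chiM() via `...`.
iris.chim <- discretizeDF.supervised(Species ~ ., iris,
  method = "chimerge", alpha = 0.1)

summary(iris.chim)

# breaks chosen for one of the predictors
attr(iris.chim$Petal.Length, "discretized:breaks")
```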

Author

Michael Hahsler

Details

discretizeDF.supervised() only implements supervised discretization. See discretizeDF() in package arules for unsupervised discretization.

See Also

Unsupervised discretization from arules: discretize(), discretizeDF().

Details about the available supervised discretization methods from discretization: discretization::mdlp, discretization::caim, discretization::cacc, discretization::ameva, discretization::chi2, discretization::chiM, discretization::extendChi2, discretization::modChi2.

Other preparation: CBA_ruleset(), mineCARs(), prepareTransactions(), transactions2DF()

Examples

data("iris")
summary(iris)

# supervised discretization using Species
iris.disc <- discretizeDF.supervised(Species ~ ., iris)
summary(iris.disc)

attributes(iris.disc$Sepal.Length)

# discretize the first few instances of iris using the same breaks as iris.disc
discretizeDF(head(iris), methods = iris.disc)

# only discretize predictors Sepal.Length and Petal.Length
iris.disc2 <- discretizeDF.supervised(Species ~ Sepal.Length + Petal.Length, iris)
head(iris.disc2)
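
The reuse of breaks shown above with `head(iris)` extends to a train/test split. The following is a sketch under that assumption: breaks are learned on the training rows only and then applied to the held-out rows. The split size, the seed, and the names `train`, `test`, `train.disc` and `test.disc` are illustrative.

```r
library(arulesCBA)
data("iris")

# illustrative random split into 100 training and 50 test rows
set.seed(42)
train_idx <- sample(nrow(iris), 100)
train <- iris[train_idx, ]
test <- iris[-train_idx, ]

# learn the breaks using the class information in the training data only
train.disc <- discretizeDF.supervised(Species ~ ., train)

# apply the same breaks to the test data
test.disc <- discretizeDF(test, methods = train.disc)

# both data.frames now use the same intervals for each predictor
levels(train.disc$Petal.Length)
levels(test.disc$Petal.Length)
```

Learning the breaks on the training data alone keeps class information from the test rows out of the preprocessing step.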
