CORElearn-package: R port of CORElearn

Description

The package CORElearn is an R port of the CORElearn data mining system. It provides various classification and regression models as well as algorithms for feature selection and evaluation. Several algorithms support parallel multithreaded execution via OpenMP, but this feature is not currently supported on all platforms (e.g., Win32 prior to 2.12, Win64 until full support for gcc 4.4).

Details

The main functions are:

- CoreModel, which constructs a classification or regression model.
  - Classification models available:
    - random forests with optional local weighting of basic models,
    - decision trees with optional constructive induction in the inner nodes and/or models in the leaves,
    - kNN and kNN with Gaussian kernel,
    - naive Bayes.
  - Regression models available:
    - regression trees with optional constructive induction in the inner nodes and/or models in the leaves,
    - linear models with pruning techniques,
    - locally weighted regression,
    - kNN and kNN with Gaussian kernel.
- predict.CoreModel, which for classification models predicts the class labels and class probabilities of new instances. For regression models it returns the predicted function value.
- modelEval, which computes some statistics from predictions.
- attrEval, which evaluates the quality of the attributes (independent variables) with the selected heuristic method. The feature evaluation algorithms include several variants of the Relief algorithm (ReliefF, RReliefF, cost-sensitive ReliefF, ...), gain ratio, Gini index, MDL, DKM, information gain, MSE, MAE, and others.
- ordEval, which evaluates ordinal attributes with the ordEval algorithm and visualizes them with plot.ordEval.
- infoCore, which outputs certain information about CORElearn methods.
- helpCore, which prints a short description of a given parameter.
- paramCoreIO, which reads/writes parameters for a given model from/to a file.
- versionCore, which outputs the version of the package from the underlying C++ library.
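
The examples at the end of this page cover classification only. As a minimal sketch of the regression side, the following fits a regression tree and checks that the prediction is a plain numeric vector. It assumes that "regTree" is the model identifier for regression trees and uses the built-in mtcars data set purely for illustration; the variable names regModel, regPred and rmse are hypothetical, and the error measure is computed in base R rather than with a package function.

```r
# Hedged sketch: regression with CoreModel (model name "regTree" is an assumption)
library(CORElearn)

# fit a regression tree predicting fuel consumption (mpg) from all other columns
regModel <- CoreModel(mpg ~ ., mtcars, model = "regTree")

# for regression models the prediction is the predicted function value
regPred <- predict(regModel, mtcars)
print(head(regPred))

# root mean squared error on the training data, computed in base R
rmse <- sqrt(mean((mtcars[["mpg"]] - regPred)^2))
print(rmse)
```

Because the error is measured on the same data the model was trained on, it is optimistic; a held-out test set would give a fairer estimate.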

Some of the internal structures of the C++ part are described in CORElearn-internal.

For an automatically generated list of functions use help(package=CORElearn) or library(help=CORElearn).

On certain platforms multithreaded execution is not supported, because the current set of compilers at CRAN does not support OpenMP. It is, however, possible to recompile the package with appropriate tools and compilers (modify Makefile or Makefile.win in the src folder, or consult the authors).

References

Marko Robnik-Sikonja, Igor Kononenko: Theoretical and Empirical Analysis of ReliefF and RReliefF. Machine Learning Journal, 53:23-69, 2003

Marko Robnik-Sikonja: Improving Random Forests. In J.-F. Boulicaut et al. (Eds): ECML 2004, LNAI 3210, Springer, Berlin, 2004, pp. 359-370

Marko Robnik-Sikonja, Koen Vanhoof: Evaluation of ordinal attributes at value level. Data Mining and Knowledge Discovery, 14:225-243, 2007

Marko Robnik-Sikonja: Experiments with Cost-sensitive Feature Evaluation. In Lavrac et al. (Eds): Machine Learning, Proceedings of ECML 2003, Springer, Berlin, 2003, pp. 325-336

The majority of these references are also available from http://lkm.fri.uni-lj.si/rmarko/papers/

Examples

# load the package
library(CORElearn) 
cat(versionCore(),"\n")

# use iris data set

# build random forests model with certain parameters
model <- CoreModel(Species ~ ., iris, model="rf", 
              selectionEstimator="MDL",minNodeWeight=5,rfNoTrees=100)
print(model)

# prediction with node distribution
pred <- predict(model, iris, rfPredictClass=FALSE)
print(pred)

# Model evaluation
mEval <- modelEval(model, iris[["Species"]], pred$class, pred$prob)
print(mEval)
 
# evaluate features in given data set with selected method
estReliefF <- attrEval(Species ~ ., iris, 
                            estimator="ReliefFexpRank", ReliefIterations=30)
print(estReliefF)
    
# evaluate ordered features with ordEval
profiles <- ordDataGen(200)
est <- ordEval(class ~ ., profiles, ordEvalNoRandomNormalizers=100)
print(est)
