Michael Hahsler

17 packages on CRAN

1 package on Bioconductor

TSP

cran
97th

Percentile

Basic infrastructure and some algorithms for the traveling salesperson problem (also traveling salesman problem; TSP). The package provides some simple algorithms and an interface to the Concorde TSP solver and its implementation of the Chained-Lin-Kernighan heuristic. The code for Concorde itself is not included in the package and has to be obtained separately.
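To give a feel for what a simple TSP construction heuristic looks like, here is a minimal Python sketch of the nearest-neighbor rule. It is an illustration only, not the package's code; the function names `tour_length` and `nearest_neighbor_tour` and the example cities are assumptions made for this sketch.

```python
import math

def tour_length(points, tour):
    """Total length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

def nearest_neighbor_tour(points, start=0):
    """Greedy construction: always move to the closest unvisited city.

    Ties are broken by the smaller city index so the result is deterministic.
    """
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        last = points[tour[-1]]
        nxt = min(unvisited, key=lambda j: (math.dist(last, points[j]), j))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Hypothetical example data: five cities on a small grid.
cities = [(0, 0), (0, 1), (1, 1), (1, 0), (2, 0)]
tour = nearest_neighbor_tour(cities)
print(tour, tour_length(cities, tour))
```

A greedy tour like this is usually not optimal; it is typically used as a starting point that an improvement heuristic or an exact solver such as Concorde can refine.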

seriation

cran
97th

Percentile

Infrastructure for seriation with an implementation of several seriation/sequencing techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT).

arules

cran
97th

Percentile

Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat.
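The two basic measures behind frequent itemsets and association rules, support and confidence, can be sketched in a few lines of Python. This is a hedged illustration of the definitions, not the package's implementation; the helper names `support` and `confidence` and the toy transactions are assumptions made for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, lhs, rhs):
    """Support of (lhs together with rhs) divided by support of lhs."""
    both = set(lhs) | set(rhs)
    return support(transactions, both) / support(transactions, lhs)

# Hypothetical transaction data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(transactions, {"bread", "milk"}))       # itemset support
print(confidence(transactions, {"bread"}, {"milk"}))  # rule bread -> milk
```

Algorithms such as Apriori and Eclat avoid scanning every candidate itemset this naively; they prune the search using the fact that a superset of an infrequent itemset cannot be frequent.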

qap

cran
97th

Percentile

Implements heuristics for the Quadratic Assignment Problem (QAP). Currently only a simulated annealing heuristic is available.
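As a rough illustration of how simulated annealing can be applied to the QAP, the Python sketch below swaps two assignments per step and accepts worse solutions with a temperature-dependent probability. It is a generic textbook-style variant, not the package's heuristic; the cooling schedule, parameter names, and the example matrices are all assumptions.

```python
import math
import random

def qap_cost(flow, dist, perm):
    """QAP objective: facility i is placed at location perm[i]."""
    n = len(perm)
    return sum(flow[i][j] * dist[perm[i]][perm[j]]
               for i in range(n) for j in range(n))

def anneal_qap(flow, dist, steps=2000, t_start=10.0, t_end=0.01, seed=0):
    """Simulated annealing with pairwise swaps and geometric cooling (assumed schedule)."""
    rng = random.Random(seed)
    n = len(flow)
    perm = list(range(n))
    cost = qap_cost(flow, dist, perm)
    best, best_cost = perm[:], cost
    for step in range(steps):
        temp = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(n), 2)
        perm[i], perm[j] = perm[j], perm[i]          # propose a swap
        new_cost = qap_cost(flow, dist, perm)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost                          # accept the move
            if cost < best_cost:
                best, best_cost = perm[:], cost
        else:
            perm[i], perm[j] = perm[j], perm[i]      # reject: undo the swap
    return best, best_cost

# Hypothetical 4x4 instance: symmetric flow and distance matrices.
flow = [[0, 5, 2, 0], [5, 0, 3, 1], [2, 3, 0, 4], [0, 1, 4, 0]]
dist = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]
best, best_cost = anneal_qap(flow, dist)
print(best, best_cost)
```

Recomputing the full objective after every swap keeps the sketch short; a practical implementation would update the cost incrementally.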

arulesViz

cran
97th

Percentile

Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration.

dbscan

cran
94th

Percentile

A fast reimplementation of several density-based algorithms of the DBSCAN family for spatial data. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise), OPTICS (ordering points to identify the clustering structure), and HDBSCAN (hierarchical DBSCAN), as well as the LOF (local outlier factor) algorithm. The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided.
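The core DBSCAN idea can be sketched in Python with a brute-force neighbor search. This is only an illustration of the algorithm: it does not use a kd-tree, it is not the package's code, and the names `region_query`, `dbscan`, and `NOISE`, along with the sample points, are assumptions made for this example.

```python
import math

NOISE = 0  # label used in this sketch for points that belong to no cluster

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 1, 2, ... for clusters, NOISE otherwise."""
    labels = [None] * len(points)      # None means "not yet visited"
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core point: keep growing
    return labels

# Hypothetical data: two tight groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Each `region_query` call here scans all points, so the sketch runs in quadratic time; replacing it with a spatial index such as a kd-tree is what makes fixed-radius searches fast on larger data sets.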

91st

Percentile

Provides a research infrastructure to test and develop recommender algorithms including UBCF, IBCF, FunkSVD and association rule-based algorithms.

stream

cran
80th

Percentile

A framework for data stream modeling and associated data mining tasks such as clustering and classification. The development of this package was supported in part by NSF IIS-0948893 and NIH R21HG005912.

79th

Percentile

NBMiner is an implementation of the model-based mining algorithm for mining NB-frequent itemsets presented in "Michael Hahsler. A model-based frequency constraint for mining associations from transaction data. Data Mining and Knowledge Discovery, 13(2):137-166, September 2006." In addition, an extension for NB-precise rules is implemented.

streamMOA

cran
60th

Percentile

Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework.

46th

Percentile

Provides the Jester Dataset for package recommenderlab.

43rd

Percentile

Provides the Book-Crossing Dataset for the package recommenderlab.

rEMM

cran
39th

Percentile

Implements TRACDS (Temporal Relationships between Clusters for Data Streams), a generalization of Extensible Markov Model (EMM). TRACDS adds a temporal or order model to data stream clustering by superimposing a dynamically adapting Markov Chain. Also provides an implementation of EMM (TRACDS on top of tNN data stream clustering). Development of this package was supported in part by NSF IIS-0948893 and R21HG005912 from the National Human Genome Research Institute.

rRDP

bioconductor
17th

Percentile

Seamlessly interfaces with the RDP classifier (version 2.9).

pmml

cran
95th

Percentile

The Predictive Model Markup Language (PMML) is an XML-based language which provides a way for applications to define statistical and data mining models and to share models between PMML-compliant applications. More information about PMML and the Data Mining Group can be found at <http://www.dmg.org>. The generated PMML can be imported into any PMML-consuming application, such as the Software AG Zementis scoring engine, which allows predictive models built in R to be deployed and executed on site, in the cloud (Amazon, IBM, and FICO), in-database (IBM Netezza, Pivotal, Sybase IQ, Teradata and Teradata Aster) or on Hadoop (Datameer and Hive).

cba

cran
93rd

Percentile

Implements clustering techniques such as Proximus and Rock, as well as utility functions for efficient computation of cross distances and data manipulation.

90th

Percentile

Add-on for arules to handle and mine frequent sequences. Provides interfaces to the C++ implementation of cSPADE by Mohammed J. Zaki.

arulesCBA

cran
86th

Percentile

Provides a function to build an association rule-based classifier for data frames, and to classify incoming data frames using such a classifier.