stream - Infrastructure for Data Stream Mining - R package

The package provides support for modeling and simulating data streams as well as an extensible framework for implementing, interfacing and experimenting with algorithms for various data stream mining tasks. The main advantage of stream is that it seamlessly integrates with the large existing infrastructure provided by R. The package currently focuses on data stream clustering and provides implementations of BICO, BIRCH, D-Stream, DBSTREAM, and evoStream.

Additional packages in the stream family are:

  • streamMOA: Interface to clustering algorithms implemented in the MOA framework. Includes implementations of DenStream, ClusTree and CluStream.

The development of the stream package was supported in part by NSF IIS-0948893 and NIH R21HG005912.

Installation

Stable CRAN version: install from within R with

install.packages("stream")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

devtools::install_github("mhahsler/stream")

Usage

Load the package and create micro-clusters via sampling.

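library("stream")

# Data stream generator: mixture of 3 Gaussian clusters, no noise points
stream <- DSD_Gaussians(k=3, noise=0)

# Reservoir sample of 20 points; each sampled point is stored as a micro-cluster
sample <- DSC_Sample(k=20)
update(sample, stream, 500)   # read 500 points from the stream
sample
## Reservoir sampling
## Class: DSC_Sample, DSC_Micro, DSC_R, DSC
## Number of micro-clusters: 20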
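
The printed summary above names reservoir sampling as the technique behind DSC_Sample. As a rough illustration of the idea only (this is the textbook "Algorithm R" written in base R, not the package's own implementation, and reservoir_sample is a hypothetical helper name), a fixed-size uniform sample can be kept from a stream of unknown length like this:

```r
# Textbook reservoir sampling (Algorithm R), base R only.
# reservoir_sample() is an illustrative helper, not part of the stream package.
reservoir_sample <- function(stream_values, k) {
  reservoir <- stream_values[0]   # empty vector of the same type
  seen <- 0
  for (x in stream_values) {
    seen <- seen + 1
    if (seen <= k) {
      # Fill the reservoir with the first k items
      reservoir[seen] <- x
    } else {
      # Keep the new item with probability k / seen,
      # overwriting a uniformly chosen slot
      j <- sample.int(seen, 1)
      if (j <= k) reservoir[j] <- x
    }
  }
  reservoir
}

set.seed(42)
res <- reservoir_sample(1:500, k = 20)
length(res)   # 20 items kept out of 500 seen
```

The sketch uses a plain vector for brevity; the data points handled by the package have several dimensions, so a real sample would hold rows rather than single values.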

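Recluster the micro-clusters using k-means and plot the results.

# Macro-clusterer: group the micro-clusters into k=3 final clusters
kmeans <- DSC_Kmeans(k=3)
recluster(kmeans, sample)
# type="both" plots micro- and macro-clusters together
plot(kmeans, stream, type="both")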
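
To make the two-stage idea concrete, the following base-R sketch mimics the reclustering step without the package: it runs ordinary k-means from the stats package on a set of micro-cluster centers. The centers are synthetic and the variable names are made up for illustration. The sketch also ignores micro-cluster weights, so it is a simplification and should not be read as what DSC_Kmeans does internally.

```r
set.seed(1)

# Hypothetical micro-cluster centers: 20 points scattered around
# 3 well-separated locations (synthetic data for illustration)
true_centers <- rbind(c(0, 0), c(5, 5), c(0, 5))
idx <- rep(1:3, length.out = 20)
micro <- true_centers[idx, ] + matrix(rnorm(40, sd = 0.3), ncol = 2)

# Macro-clustering step: plain (unweighted) k-means on the centers
macro <- stats::kmeans(micro, centers = 3, nstart = 10)

macro$centers   # one row per macro-cluster
macro$cluster   # macro-cluster ID assigned to each micro-cluster
```

The macro$cluster vector plays the role of a micro-to-macro assignment: entry i gives the macro-cluster that micro-cluster i was merged into.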

A list of all available clustering methods can be obtained with

DSC_registry$get_entries()

Version: 1.5-1
License: GPL-3
Maintainer: Michael Hahsler
Last Published: May 9th, 2022
Monthly Downloads: 1,097

Functions in stream (1.5-1)

  • DSC_Macro: Abstract Class for Macro Clusterers
  • DSC_BIRCH: Balanced Iterative Reducing Clustering using Hierarchies
  • DSC_DBSTREAM: DBSTREAM clustering algorithm
  • DSC_BICO: BICO - Fast computation of k-means coresets in a data stream
  • DSC_EA: Reclustering using an Evolutionary Algorithm
  • DSC_DStream: D-Stream Data Stream Clustering Algorithm
  • DSC_DBSCAN: DBSCAN Macro-clusterer
  • DSC_Kmeans: Kmeans Macro-clusterer
  • DSC: Data Stream Clusterer Base Classes
  • DSC_Sample: Extract a Fixed-size Sample from a Data Stream
  • DSC_evoStream: evoStream - Evolutionary Stream Clustering
  • DSD_BarsAndGaussians: Data Stream Generator for Bars and Gaussians
  • DSD: Data Stream Data Generator Base Classes
  • DSD_Gaussians: Mixture of Gaussians Data Stream Generator
  • DSO_Sample: Sampling from a Data Stream (Data Stream Operator)
  • DSD_MG: DSD Moving Generator
  • DSClassify: Abstract Class for Data Stream Classifiers
  • DSD_ScaleStream: Scale a Stream from a DSD
  • DSD_ReadDB: Read a Data Stream from an open DB Query
  • DSFP: Abstract Class for Frequent Pattern Mining Algorithms for Data Streams
  • DSC_Hierarchical: Hierarchical Micro-Cluster Reclusterer
  • DSD_Benchmark: Data Stream Generator for Benchmark Data
  • DSO_Window: Sliding Window (Data Stream Operator)
  • DSD_Cubes: Static Cubes Data Stream Generator
  • DSC_Static: Create as Static Copy of a Clustering
  • DSC_Micro: Abstract Class for Micro Clusterers
  • DSC_Reachability: Reachability Micro-Cluster Reclusterer
  • DSD_Target: Target Data Stream Generator
  • saveDSC: Save and Read DSC Objects
  • reset_stream: Reset a Data Stream to its Beginning
  • DSO: Data Stream Operator Base Classes
  • get_points: Get Points from a Data Stream Generator
  • get_assignment: Assignment Data Points to Clusters
  • MGC: Moving Generator Cluster
  • DSD_UniformNoise: Uniform Noise Data Stream Generator
  • DefaultEvalCallback-class: Default Class for Evaluation Callbacks
  • DSD_mlbenchData: Stream Interface for Data Sets From mlbench
  • EvalCallback-class: Abstract Class for Evaluation Callbacks
  • SampleDSO-class: Update a Data Stream Clustering Model
  • DSC_TwoStage: TwoStage Clustering Process
  • evaluate: Evaluate Clusterings
  • animate: Animates the plotting of a DSD and the clustering process
  • DSD_ReadCSV: Read a Data Stream from File
  • DSC_Window: A sliding window from a Data Stream
  • plot.DSC: Plotting Data Stream Data and Clusterings
  • DST: Conceptual Base Class for All Data Stream Mining Tasks
  • microToMacro: Translate Micro-cluster IDs to Macro-cluster IDs
  • DSOutlier: Abstract Class for Outlier Detection Clusterers
  • DSD_Memory: A Data Stream Interface for Data Stored in Memory
  • DSD_mlbenchGenerator: mlbench Data Stream Generator
  • prune_clusters: Prune Clusters from a Clustering
  • recluster: Re-clustering micro-clusters
  • write_stream: Write a Data Stream to a File