Note: a newer version (2.0-3) of this package is available.

stream - Infrastructure for Data Stream Mining - R package

The package provides support for modeling and simulating data streams as well as an extensible framework for implementing, interfacing and experimenting with algorithms for various data stream mining tasks. The main advantage of stream is that it seamlessly integrates with the large existing infrastructure provided by R. The package currently focuses on data stream clustering and provides implementations of BICO, BIRCH, D-Stream, DBSTREAM, and evoStream.

Additional packages in the stream family are:

  • streamMOA: Interface to clustering algorithms implemented in the MOA framework. Includes implementations of DenStream, ClusTree and CluStream.
  • subspaceMOA: Interface to Subspace MOA and its implementations of HDDStream and PreDeConStream.

The development of the stream package was supported in part by NSF IIS-0948893 and NIH R21HG005912.

Installation

Stable CRAN version: install from within R with

install.packages("stream")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

devtools::install_github("mhahsler/stream")

Usage

Load the package and create micro-clusters via sampling.

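library("stream")

# Simulate a data stream drawn from 3 Gaussian clusters, without noise
stream <- DSD_Gaussians(k=3, noise=0)

# Maintain a sample of 20 points as micro-clusters and feed it 500 points
sample <- DSC_Sample(k=20)
update(sample, stream, 500)
sample
## Reservoir sampling
## Class: DSC_Sample, DSC_Micro, DSC_R, DSC 
## Number of micro-clusters: 20 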
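
The printed summary above names reservoir sampling as the technique behind DSC_Sample. To make the idea concrete, here is a minimal base-R sketch of classic unbiased reservoir sampling (Algorithm R). It is an illustration only and not the package's implementation; the names reservoir_sample and next_point are hypothetical, and the stream is simulated with rnorm.

```r
# Algorithm R: keep a uniform random sample of k items from a stream
# whose length is not known in advance.
reservoir_sample <- function(next_point, n, k) {
  reservoir <- vector("list", k)
  for (i in seq_len(n)) {
    x <- next_point()
    if (i <= k) {
      reservoir[[i]] <- x                # fill the reservoir first
    } else {
      j <- sample.int(i, 1)              # uniform draw from 1..i
      if (j <= k) reservoir[[j]] <- x    # keep x with probability k / i
    }
  }
  do.call(rbind, reservoir)              # one row per retained point
}

set.seed(1)
# Hypothetical stream: one 2-d standard normal point per call
next_point <- function() rnorm(2)

micro <- reservoir_sample(next_point, n = 500, k = 20)
dim(micro)  # 20 rows (retained points), 2 columns (dimensions)
```

Every point seen so far has the same chance of being in the reservoir, which is why a fixed-size sample can stand in for the whole stream when it is reclustered later.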

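Recluster the micro-clusters using k-means and plot the results.

# Group the 20 micro-clusters into 3 macro-clusters
kmeans <- DSC_Kmeans(k=3)
recluster(kmeans, sample)
# Plot micro- and macro-clusters over points drawn from the stream
plot(kmeans, stream, type="both")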
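
Besides plotting, the reclustered result can be inspected numerically. The following sketch repeats the steps above and then queries the macro-clusterer with nclusters, get_centers and get_weights, all of which appear in the function index of this page. The seed is an arbitrary choice for reproducibility, and the calls are written against version 1.3-2; later releases may differ.

```r
library("stream")
set.seed(1000)  # arbitrary seed, only for reproducibility

# Same pipeline as above: stream -> sample (micro) -> k-means (macro)
stream <- DSD_Gaussians(k = 3, noise = 0)
sample <- DSC_Sample(k = 20)
update(sample, stream, 500)

kmeans <- DSC_Kmeans(k = 3)
recluster(kmeans, sample)

nclusters(kmeans)     # number of macro-clusters
get_centers(kmeans)   # one row per macro-cluster center
get_weights(kmeans)   # one weight per macro-cluster
```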

Monthly Downloads

1,097

Version

1.3-2

License

GPL-3

Maintainer

Michael Hahsler

Last Published

May 4th, 2020

Functions in stream (1.3-2)

  • DSC_BICO: BICO - Fast computation of k-means coresets in a data stream
  • DSC_Macro: Abstract Class for Macro Clusterers
  • DSC_DBSCAN: DBSCAN Macro-clusterer
  • DSC_EA: Evolutionary Algorithm
  • DSC_Hierarchical: Hierarchical Micro-Cluster Reclusterer
  • DSC: Data Stream Clusterer Base Classes
  • DSC_BIRCH: Balanced Iterative Reducing Clustering using Hierarchies
  • DSC_Kmeans: Kmeans Macro-clusterer
  • DSC_DBSTREAM: DBSTREAM clustering algorithm
  • DSC_DStream: D-Stream Data Stream Clustering Algorithm
  • DSD: Data Stream Data Generator Base Classes
  • DSClassify: Abstract Class for Data Stream Classifiers
  • DSD_Benchmark: Data Stream Generator for Benchmark Data
  • DSD_Cubes: Static Cubes Data Stream Generator
  • DSC_Micro: Abstract Class for Micro Clusterers
  • DSC_Reachability: Reachability Micro-Cluster Reclusterer
  • DSC_Sample: Extract a Fixed-size Sample from a Data Stream
  • DSD_ReadDB: Read a Data Stream from an open DB Query
  • DSC_evoStream: evoStream - Evolutionary Stream Clustering
  • DSD_ScaleStream: Scale a Stream from a DSD
  • DSC_TwoStage: TwoStage Clustering Process
  • DSD_ReadCSV: Read a Data Stream from File
  • DSD_BarsAndGaussians: Data Stream Generator for Bars and Gaussians
  • DSD_Gaussians: Mixture of Gaussians Data Stream Generator
  • DSD_MG: DSD Moving Generator
  • DSC_Window: A sliding window from a Data Stream
  • DSO_Sample: Sampling from a Data Stream (Data Stream Operator)
  • DSFP: Abstract Class for Frequent Pattern Mining Algorithms for Data Streams
  • DST: Abstract Base Class for All Data Stream Mining Tasks
  • get_copy: Create a Deep Copy of a DSC Object
  • get_points: Get Points from a Data Stream Generator
  • MGC: Moving Generator Cluster
  • DSD_Target: Target Data Stream Generator
  • DSO: Data Stream Operator Base Classes
  • DSD_Memory: A Data Stream Interface for Data Stored in Memory
  • DSD_mlbenchData: Stream Interface for Data Sets From mlbench
  • DSC_Static: Create a Static Copy of a Clustering
  • get_assignment: Assign Data Points to Clusters
  • prune_clusters: Prune Clusters from a Clustering
  • update: Update a Data Stream Clustering Model
  • DSD_UniformNoise: Uniform Noise Data Stream Generator
  • animation: Animates the plotting of a DSD and the clustering process
  • recluster: Re-clustering micro-clusters
  • evaluate: Evaluate Clusterings
  • DSD_mlbenchGenerator: mlbench Data Stream Generator
  • get_weights: Get Cluster Weights
  • reset_stream: Reset a Data Stream to its Beginning
  • DSO_Window: Sliding Window (Data Stream Operator)
  • microToMacro: Translate Micro-cluster IDs to Macro-cluster IDs
  • write_stream: Write a Data Stream to a File
  • save: Save and Read DSC Objects
  • nclusters: Get the Number of Clusters
  • plot: Plotting Data Stream Data and Clusterings
  • get_centers: Get Cluster Centers from a DSC
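
As a rough illustration of how several of the functions listed above fit together, the sketch below clusters a simulated stream with D-Stream and scores the result with evaluate. The parameter values (gridsize, Cm, noise level, number of points) and the seed are assumptions chosen for this example, not recommendations, and the calls target version 1.3-2; later releases may differ.

```r
library("stream")
set.seed(1000)  # arbitrary seed, only for reproducibility

# Simulated stream: 3 Gaussian clusters in 2 dimensions with 5% noise
stream <- DSD_Gaussians(k = 3, d = 2, noise = 0.05)

# D-Stream clusterer; gridsize and Cm are illustrative values
dstream <- DSC_DStream(gridsize = 0.1, Cm = 1.2)
update(dstream, stream, n = 2000)

nclusters(dstream)           # number of clusters found
head(get_centers(dstream))   # first few cluster centers

# Score the clustering on 500 fresh points from the same stream
ev <- evaluate(dstream, stream, measure = "purity", n = 500)
ev
```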