
Note: a newer version (2.0-3) of this package is available.

stream - Infrastructure for Data Stream Mining - R package

The package provides support for modeling and simulating data streams as well as an extensible framework for implementing, interfacing and experimenting with algorithms for various data stream mining tasks. The main advantage of stream is that it seamlessly integrates with the large existing infrastructure provided by R. The package currently focuses on data stream clustering and provides implementations of BICO, BIRCH, D-Stream and DBSTREAM.

Additional packages in the stream family are:

  • streamMOA: Interface to clustering algorithms implemented in the MOA framework. Includes implementations of DenStream, ClusTree and CluStream.
  • subspaceMOA: Interface to Subspace MOA and its implementations of HDDStream and PreDeConStream.

The development of the stream package was supported in part by NSF IIS-0948893 and NIH R21HG005912.

Installation

Stable CRAN version: install from within R with

install.packages("stream")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

devtools::install_github("mhahsler/stream")

Usage

Load the package and create micro-clusters via sampling.

library("stream")
stream <- DSD_Gaussians(k=3, noise=0)

sample <- DSC_Sample(k=20)
update(sample, stream, 500)
sample
## Reservoir sampling
## Class: DSC_Sample, DSC_Micro, DSC_R, DSC
## Number of micro-clusters: 20
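
The sampled micro-clusters can also be inspected directly. The following is a minimal sketch, assuming that get_centers, get_weights and nclusters (all listed in the function index of this package) accept a DSC_Sample object as shown; the fixed seed and the variable names centers and weights are illustrative assumptions, not part of the original example.

```r
library("stream")

set.seed(1000)  # assumption: fixed seed only so the sketch is repeatable

# Same setup as above: a 3-cluster Gaussian stream without noise
stream <- DSD_Gaussians(k = 3, noise = 0)
sample <- DSC_Sample(k = 20)
update(sample, stream, 500)

# One row per micro-cluster center, plus the weight of each micro-cluster
centers <- get_centers(sample)
weights <- get_weights(sample)

nclusters(sample)  # number of micro-clusters held by the sample
head(centers)      # first few centers
```

Because the reservoir holds k = 20 points, the centers are simply the sampled data points.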

Recluster the micro-clusters using k-means and plot the results.

kmeans <- DSC_Kmeans(k=3)
recluster(kmeans, sample)
plot(kmeans, stream, type="both")
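
Beyond plotting, the reclustered result can be scored against fresh points from the stream. This is a hedged sketch that assumes evaluate in this version of the package accepts the measure names "purity" and "SSQ" and an n argument for the number of evaluation points; check the evaluate help page of the installed version before relying on it. The seed and the name macro_centers are illustrative.

```r
library("stream")

set.seed(1000)  # assumption: fixed seed only so the sketch is repeatable

# Rebuild the micro-clusters and recluster them as in the usage example
stream <- DSD_Gaussians(k = 3, noise = 0)
sample <- DSC_Sample(k = 20)
update(sample, stream, 500)

kmeans <- DSC_Kmeans(k = 3)
recluster(kmeans, sample)

# Centers of the macro-clusters found by k-means
macro_centers <- get_centers(kmeans)
macro_centers

# Score the clustering on 100 new points drawn from the stream
evaluate(kmeans, stream, measure = c("purity", "SSQ"), n = 100)
```

The reported values depend on the random stream, so they will differ between runs unless a seed is set.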

Monthly Downloads: 712
Version: 1.3-0
License: GPL-3
Maintainer: Michael Hahsler
Last Published: June 2nd, 2018

Functions in stream (1.3-0)

  • DSC_Static: Create a Static Copy of a Clustering
  • DSD_ReadDB: Read a Data Stream from an Open DB Query
  • DSD_Cubes: Static Cubes Data Stream Generator
  • DSD_ScaleStream: Scale a Stream from a DSD
  • DSD_Benchmark: Data Stream Generator for Benchmark Data
  • DSC_TwoStage: TwoStage Clustering Process
  • DSD_Target: Target Data Stream Generator
  • DSC_Reachability: Reachability Micro-Cluster Reclusterer
  • DSD_BarsAndGaussians: Data Stream Generator for Bars and Gaussians
  • DSC_Sample: Extract a Fixed-size Sample from a Data Stream
  • DSD_MG: DSD Moving Generator
  • DSO_Sample: Sampling from a Data Stream (Data Stream Operator)
  • get_centers: Get Cluster Centers from a DSC
  • DSD_UniformNoise: Uniform Noise Data Stream Generator
  • animation: Animates the Plotting of a DSD and the Clustering Process
  • get_assignment: Assign Data Points to Clusters
  • DSD_Gaussians: Mixture of Gaussians Data Stream Generator
  • nclusters: nclusters
  • prune_clusters: Prune Clusters from a Clustering
  • plot: Plotting Data Stream Data and Clusterings
  • evaluate: Evaluate Clusterings
  • DSC_Window: A Sliding Window from a Data Stream
  • DSO_Window: Sliding Window (Data Stream Operator)
  • reset_stream: Reset a Data Stream to its Beginning
  • DSClassify: Abstract Class for Data Stream Classifiers
  • DSD_mlbenchData: Stream Interface for Data Sets from mlbench
  • save: Save and Read DSC Objects
  • recluster: Re-clustering Micro-clusters
  • DSD_mlbenchGenerator: mlbench Data Stream Generator
  • DST: Abstract Base Class for All Data Stream Mining Tasks
  • DSFP: Abstract Class for Frequent Pattern Mining Algorithms for Data Streams
  • DSO: Data Stream Operator Base Classes
  • MGC: Moving Generator Cluster
  • DSD_Memory: A Data Stream Interface for Data Stored in Memory
  • DSD_ReadCSV: Read a Data Stream from File
  • get_weights: Get Cluster Weights
  • get_copy: Create a Deep Copy of a DSC Object
  • microToMacro: Translate Micro-cluster IDs to Macro-cluster IDs
  • write_stream: Write a Data Stream to a File
  • update: Update a Data Stream Clustering Model
  • get_points: Get Points from a Data Stream Generator
  • DSC: Data Stream Clusterer Base Classes
  • DSC_BIRCH: Balanced Iterative Reducing Clustering using Hierarchies
  • DSC_BICO: BICO - Fast Computation of k-means Coresets in a Data Stream
  • DSC_DBSCAN: DBSCAN Macro-clusterer
  • DSC_Kmeans: Kmeans Macro-clusterer
  • DSC_DStream: D-Stream Data Stream Clustering Algorithm
  • DSC_Micro: Abstract Class for Micro Clusterers
  • DSC_Hierarchical: Hierarchical Micro-Cluster Reclusterer
  • DSD: Data Stream Data Generator Base Classes
  • DSC_DBSTREAM: DBSTREAM Clustering Algorithm
  • DSC_Macro: Abstract Class for Macro Clusterers