
Note: a newer version (2.0-3) of this package is available.

stream - Infrastructure for Data Stream Mining - R package

A framework for data stream modeling and associated data mining tasks such as clustering and classification. The development of this package was supported in part by NSF IIS-0948893 and NIH R21HG005912.

Additional packages in the stream family are:

Installation

Stable CRAN version: install from within R with

install.packages("stream")

Current development version: Download package from AppVeyor or install from GitHub (needs devtools).

library("devtools")
install_github("mhahsler/stream")

Usage

Load the package and create micro-clusters via sampling.

library("stream")
stream <- DSD_Gaussians(k=3, noise=0)

sample <- DSC_Sample(k=20)
update(sample, stream, 500)
sample
Reservoir sampling
Class: DSC_Sample, DSC_Micro, DSC_R, DSC 
Number of micro-clusters: 20 
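The micro-clusters held by the sampler can also be inspected directly. The following is a minimal sketch that rebuilds the objects from the snippet above so that it runs on its own; it uses nclusters, get_centers and get_weights from the function index, and the seed value is an arbitrary choice made only for reproducibility.

```r
library("stream")
set.seed(1000)  # arbitrary seed, only for reproducibility

# Same setup as above: 3 Gaussian clusters, no noise, reservoir sample of size 20
stream <- DSD_Gaussians(k = 3, noise = 0)
sample <- DSC_Sample(k = 20)
update(sample, stream, 500)

# Number of micro-clusters currently held by the sampler
nclusters(sample)

# Centers (one row per micro-cluster) and the corresponding weights
centers <- get_centers(sample)
weights <- get_weights(sample)
head(centers)
head(weights)
```

Each sampled point acts as one micro-cluster, so the number of rows in centers should match the value reported by nclusters.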

Recluster the micro-clusters using k-means and plot the results.

kmeans <- DSC_Kmeans(k=3)
recluster(kmeans, sample)
plot(kmeans, stream, type="both")
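To check how well the reclustered model matches the generator's ground truth, evaluate from the function index can be applied to the same pipeline. This is a minimal sketch that rebuilds the objects so it runs on its own; the measures "purity" and "SSQ", the seed, and n = 500 are illustrative choices, and the set of supported measures may differ between package versions.

```r
library("stream")
set.seed(1000)  # arbitrary seed, only for reproducibility

# Micro-clustering step: reservoir sample of 20 points from 500 stream points
stream <- DSD_Gaussians(k = 3, noise = 0)
sample <- DSC_Sample(k = 20)
update(sample, stream, 500)

# Macro-clustering step: k-means on the micro-clusters
kmeans <- DSC_Kmeans(k = 3)
recluster(kmeans, sample)

# Evaluate the macro-clustering on 500 fresh points drawn from the stream
res <- evaluate(kmeans, stream, measure = c("purity", "SSQ"), n = 500)
res
```

Purity lies between 0 and 1, with higher values indicating that each found cluster mostly contains points from a single true cluster.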


Version: 1.2-4
License: GPL-3
Maintainer: Michael Hahsler
Last published: February 26th, 2017
Monthly downloads: 1,070

Functions in stream (1.2-4)

DSD_Gaussians - Mixture of Gaussians Data Stream Generator
DSD_mlbenchData - Stream Interface for Data Sets From mlbench
DSD_MG - DSD Moving Generator
DSD_ReadDB - Read a Data Stream from an Open DB Query
DSFP - Abstract Class for Frequent Pattern Mining Algorithms for Data Streams
DSO - Data Stream Operator Base Classes
get_copy - Create a Deep Copy of a DSC Object
plot - Plotting Data Stream Data and Clusterings
nclusters - nclusters
get_points - Get Points from a Data Stream Generator
animation - Animates the Plotting of a DSD and the Clustering Process
update - Update a Data Stream Clustering Model
write_stream - Write a Data Stream to a File
DSD_ScaleStream - Scale a Stream from a DSD
evaluate - Evaluate Clusterings
DSD_mlbenchGenerator - mlbench Data Stream Generator
DSD_Target - Target Data Stream Generator
DSD_Memory - A Data Stream Interface for Data Stored in Memory
DSD_UniformNoise - Uniform Noise Data Stream Generator
get_assignment - Assign Data Points to Clusters
DSO_Sample - Sampling from a Data Stream (Data Stream Operator)
DSD_ReadCSV - Read a Data Stream from a File
DSO_Window - Sliding Window (Data Stream Operator)
reset_stream - Reset a Data Stream to its Beginning
DST - Abstract Base Class for All Data Stream Mining Tasks
save - Save and Read DSC Objects
prune_clusters - Prune Clusters from a Clustering
MGC - Moving Generator Cluster
recluster - Re-clustering Micro-clusters
get_centers - Get Cluster Centers from a DSC
get_weights - Get Cluster Weights
microToMacro - Translate Micro-cluster IDs to Macro-cluster IDs
DSC - Data Stream Clusterer Base Classes
DSC_Macro - Abstract Class for Macro Clusterers
DSC_DBSCAN - DBSCAN Macro-clusterer
DSC_Sample - Extract a Fixed-size Sample from a Data Stream
DSC_Static - Create a Static Copy of a Clustering
DSC_Reachability - Reachability Micro-Cluster Reclusterer
DSC_TwoStage - TwoStage Clustering Process
DSD_BarsAndGaussians - Data Stream Generator for Bars and Gaussians
DSC_Micro - Abstract Class for Micro Clusterers
DSD - Data Stream Data Generator Base Classes
DSC_DBSTREAM - DBSTREAM Clustering Algorithm
DSC_Kmeans - Kmeans Macro-clusterer
DSD_Cubes - Static Cubes Data Stream Generator
DSC_Hierarchical - Hierarchical Micro-Cluster Reclusterer
DSD_Benchmark - Data Stream Generator for Benchmark Data
DSC_DStream - D-Stream Data Stream Clustering Algorithm
DSC_Window - A Sliding Window from a Data Stream
DSClassify - Abstract Class for Data Stream Classifiers
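
Several of the data stream functions listed here are meant to be combined. As a minimal sketch, assuming the signatures documented for this package version, the following stores points from a generator with DSD_Memory, reads some of them with get_points, and rewinds the stream with reset_stream. The variable names and the seed are illustrative choices.

```r
library("stream")
set.seed(1000)  # arbitrary seed, only for reproducibility

# Store 100 points from a generator in memory so they can be replayed
generator <- DSD_Gaussians(k = 3, noise = 0)
replayer <- DSD_Memory(generator, n = 100)

# Read the first 5 points, then rewind and read from the beginning again
first <- get_points(replayer, n = 5)
reset_stream(replayer)
again <- get_points(replayer, n = 5)

# Compare the two reads
isTRUE(all.equal(first, again))
```

Replaying a fixed set of points this way makes it possible to run several clustering algorithms on exactly the same data.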