stream (version 1.0-3)

DSD_Wrapper: A Data Stream Wrapper for Data.Frames or Matrix-like Objects

Description

This class wraps data.frame or matrix-like objects and provides access to the data as a stream. The stream can loop over the data, or it can be reset manually so that exactly the same data is replayed several times. The wrapper can also be used to record part of another data stream and replay it.
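
The manual replay described above can be sketched as follows. This is a minimal illustration, not part of the original help page: the toy data frame `df` is made up, and the sketch assumes that `get_points(dsd, n)` is the generic for reading points from a DSD in this version of the package.

```r
library(stream)

# Hypothetical toy data: six two-dimensional points
df <- data.frame(x = 1:6, y = (1:6)^2)
dsd <- DSD_Wrapper(df)

# Read the first three points, rewind, and read them again
first <- get_points(dsd, n = 3)
reset_stream(dsd)                # move back to the first position
again <- get_points(dsd, n = 3)

# Both reads should contain the same three points
identical(first, again)
```

Without the call to `reset_stream`, the second read would continue with the next three points instead of repeating the first three.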

Usage

DSD_Wrapper(x, n, k=NA, loop=FALSE, assignment = NULL, description=NULL)

Arguments

x
A DSD object, a data frame, or another matrix-like object holding the data to be used in the stream. If x is a DSD object, a wrapper for n data points drawn from this DSD is created.
n
Number of points to draw from x if x is a DSD object. Ignored if x is a data frame or matrix.
k
Optional: the number of clusters in the data.
loop
Logical flag indicating whether the stream loops over the data, that is, starts again at the first point once the end is reached.
assignment
Index of the column containing the group assignment (ground truth), or a vector with the assignment of each data point.
description
Character string with a description of the stream.
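
As a hedged illustration of the loop and assignment arguments, the sketch below wraps a small made-up data frame, passes the ground truth as a separate vector, and requests more points than the data contains. The names `df`, `labels`, and `looped` are hypothetical, and the sketch assumes that `get_points` wraps around to the first point when loop=TRUE.

```r
library(stream)

# Hypothetical toy data: four points in two groups
df <- data.frame(x = c(0.1, 0.2, 0.8, 0.9), y = c(0.1, 0.3, 0.7, 0.9))
labels <- c(1, 1, 2, 2)   # ground truth supplied as a vector, not a column

looped <- DSD_Wrapper(df, k = 2, loop = TRUE, assignment = labels)

# Request six points from a stream that holds only four;
# with loop = TRUE the stream is expected to start over after the fourth
pts <- get_points(looped, n = 6)
nrow(pts)
```

With the default loop=FALSE, requesting more points than remain would be expected to fail instead, so `reset_stream` is the way to reuse the data in that case.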

Value

  • Returns a DSD_Wrapper object (subclass of DSD_R, DSD).

Details

In addition to regular data.frames, other matrix-like objects can be used. These include ffdf (large data.frames stored on disk) from package ff and big.matrix from package bigmemory.

See Also

DSD, reset_stream

Examples

library(stream)

### wrap 1000 points from a DSD
dsd <- DSD_Gaussians(k=3, d=2)
replayer <- DSD_Wrapper(dsd, k=3, n=1000)
replayer
plot(replayer)  
  
# creating 2 clusterers of different algorithms
dsc1 <- DSC_tNN(r=0.1)
dsc2 <- DSC_DStream(gridsize=0.1)
  
# clustering the same data in 2 DSC objects
reset_stream(replayer) # resetting the replayer to the first position
cluster(dsc1, replayer, 500)
reset_stream(replayer)
cluster(dsc2, replayer, 500)
  
# plot the resulting clusterings
reset_stream(replayer) 
plot(dsc1, replayer, main="tNN")
reset_stream(replayer) 
plot(dsc2, replayer, main="D-Stream")   
  
### use a data.frame to create a stream (3rd col. contains the assignment)
df <- data.frame(x=runif(100), y=runif(100), 
  assignment=sample(1:3, 100, replace=TRUE))
head(df)  

dsd <- DSD_Wrapper(df, assignment=3)  
dsd
plot(dsd, n=100)