R package rEMM: Extensible Markov Model for Modelling Temporal Relationships Between Clusters

Implements TRACDS (Temporal Relationships between Clusters for Data Streams), a generalization of the Extensible Markov Model (EMM), to model transition probabilities in sequence data. TRACDS adds a temporal or order model to data stream clustering by superimposing a dynamically adapting Markov chain on the clusters. The package also provides an implementation of EMM (TRACDS on top of tNN data stream clustering).
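
To make the idea of a Markov chain superimposed on a stream clustering concrete, here is a minimal base R sketch. It assumes Euclidean distance, a fixed distance threshold, and a running-mean update of cluster centers; `tiny_emm()` is a hypothetical helper written for illustration and is not part of rEMM, whose own implementation offers more options and may differ in detail.

```r
# Sketch only: threshold-based nearest-neighbor clustering plus transition counting.
# tiny_emm() is a hypothetical helper, not an rEMM function.
tiny_emm <- function(x, threshold) {
  x <- as.matrix(x)
  centers <- x[1, , drop = FALSE]   # the first point opens the first cluster
  sizes <- 1
  assignment <- integer(nrow(x))
  assignment[1] <- 1L
  for (i in seq_len(nrow(x))[-1]) {
    # Euclidean distance from point i to every current cluster center
    point <- matrix(x[i, ], nrow(centers), ncol(x), byrow = TRUE)
    d <- sqrt(rowSums((centers - point)^2))
    j <- which.min(d)
    if (d[j] <= threshold) {
      # Close enough: assign to cluster j and move its center (running mean)
      sizes[j] <- sizes[j] + 1
      centers[j, ] <- centers[j, ] + (x[i, ] - centers[j, ]) / sizes[j]
    } else {
      # Too far from every center: open a new cluster
      centers <- rbind(centers, x[i, ])
      sizes <- c(sizes, 1)
      j <- nrow(centers)
    }
    assignment[i] <- j
  }
  # Temporal layer: count transitions between consecutive cluster assignments
  k <- nrow(centers)
  counts <- table(from = factor(head(assignment, -1), levels = seq_len(k)),
                  to   = factor(tail(assignment, -1), levels = seq_len(k)))
  list(centers = centers, assignment = assignment, counts = counts)
}

# Toy stream that alternates between two well-separated locations
stream <- rbind(c(0, 0), c(5, 5), c(0.1, 0), c(5, 5.1), c(0, 0.1))
m <- tiny_emm(stream, threshold = 1)
m$assignment
m$counts
```

In this toy stream the points alternate between two locations, so the sketch ends up with two clusters and transition counts only between them.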

Installation

Stable CRAN version: install from within R with

install.packages("rEMM")

Current development version: install from r-universe.

Usage

We use an artificial dataset with a mixture of four cluster components. Points are generated by repeatedly following the fixed sequence <1,2,1,3,4> through the four clusters. The gray lines in the plot connect consecutive points and so indicate the sequence.

library(rEMM)

data("EMMsim")

plot(EMMsim_train, pch = EMMsim_sequence_train)
lines(EMMsim_train, col = "gray")
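
The fixed sequence <1,2,1,3,4> described above implies a particular transition structure between the four components. As a rough illustration in base R (independent of rEMM, and assuming the cluster labels are known, which is not the case for the real data), the transition probabilities can be estimated by counting consecutive pairs of labels. The number of repetitions (20) is an arbitrary choice.

```r
# Repeat the generating sequence <1,2,1,3,4>; 20 repetitions is arbitrary.
state_seq <- rep(c(1, 2, 1, 3, 4), times = 20)

# Count transitions between consecutive states.
from <- factor(head(state_seq, -1), levels = 1:4)
to   <- factor(tail(state_seq, -1), levels = 1:4)
counts <- table(from, to)

# Row-normalize the counts into estimated transition probabilities.
probs <- prop.table(counts, margin = 1)
round(probs, 2)
```

Component 1 is followed by component 2 or component 3 equally often, while components 2, 3, and 4 each have a single successor. This is the structure a model of the sequence should recover.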

EMM recovers the components and the sequence information. We build an EMM and then recluster the states it found, assuming we know that there are four components. The resulting plot shows the Markov model of the recovered sequence as a graph.

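# Create an empty EMM that uses Euclidean distance and a clustering threshold of 0.1
emm <- EMM(threshold = 0.1, measure = "euclidean")
# Add the training data (build() updates emm in place)
build(emm, EMMsim_train)
# Recluster the states into 4 groups using average-linkage hierarchical clustering
emmc <- recluster_hclust(emm, k = 4, method = "average")
plot(emmc)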
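
Besides plotting, the model can be inspected numerically. The following sketch repeats the steps above so that it runs on its own, and then uses the accessors `nstates()` and `transition_matrix()` from rEMM to compare the number of states before and after reclustering and to print the transition probabilities. The exact values depend on the data and the package version, so no output is shown.

```r
library(rEMM)
data("EMMsim")

# Same model as above, rebuilt so this example is self-contained
emm <- EMM(threshold = 0.1, measure = "euclidean")
build(emm, EMMsim_train)
emmc <- recluster_hclust(emm, k = 4, method = "average")

nstates(emm)    # number of states found with the 0.1 threshold
nstates(emmc)   # number of states after reclustering with k = 4

# Transition probabilities between the reclustered states
round(transition_matrix(emmc), 2)
```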

We can now score new sequences by calculating the product of the transition probabilities in the model. Here we use a test sequence created in the same way as the training data, and the high score indicates that it is consistent with the learned model.

score(emmc, EMMsim_test)
## [1] 0.71
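
To illustrate what a product of transition probabilities means, here is a small base R sketch that works on a sequence of state labels. It is not how `score()` is implemented in rEMM: the transition matrix `P`, the helper `score_product()`, and the length normalization via the geometric mean are assumptions made for this example, and the sketch skips the step of matching observations to states.

```r
# Hypothetical transition matrix for the cycle <1,2,1,3,4> (rows: from, columns: to)
P <- matrix(0, nrow = 4, ncol = 4, dimnames = list(1:4, 1:4))
P[1, 2] <- 0.5
P[1, 3] <- 0.5
P[2, 1] <- 1
P[3, 4] <- 1
P[4, 1] <- 1

# Hypothetical helper: product of transition probabilities along a state sequence.
# With normalize = TRUE the geometric mean is returned, so that sequences of
# different lengths stay comparable.
score_product <- function(P, states, normalize = TRUE) {
  from <- head(states, -1)
  to   <- tail(states, -1)
  p <- P[cbind(from, to)]   # probability of each observed transition
  s <- prod(p)
  if (normalize) s^(1 / length(p)) else s
}

good <- c(1, 2, 1, 3, 4, 1)   # follows the generating cycle
bad  <- c(1, 4, 2, 3, 1, 2)   # contains transitions with probability 0 in P

score_product(P, good, normalize = FALSE)   # 0.5 * 1 * 0.5 * 1 * 1 = 0.25
score_product(P, good)                      # geometric mean of the five probabilities
score_product(P, bad)                       # 0, because P[1, 4] is 0
```

A sequence that follows the modeled transitions gets a score well above zero, while a single transition the model considers impossible drives the product to zero.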

Acknowledgements

Development of this package was supported in part by NSF IIS-0948893 and R21HG005912 from the National Human Genome Research Institute.

Version

1.1.1

License

GPL-2

Maintainer

Michael Hahsler

Last Published

May 31st, 2022

Functions in rEMM (1.1.1)

Derwent: Derwent Catchment Data
EMMTraffic: Hypothetical Traffic Data Set for EMM
16S: Count Data for 16S rRNA Sequences
EMM-class: Class "EMM"
build: Building an EMM using New Data
cluster: Data stream clustering with tNN
EMM: Creator for Class "EMM"
TRAC: Creating a Markov Model from a Regular Clustering
EMMsim: Synthetic Data to Demonstrate EMMs
remove: Remove States/Clusters or Transitions from an EMM
find_clusters: Find the EMM State/Cluster for an Observation
TRACDS-class: Class "TRACDS"
prune: Prune States and/or Transitions
recluster: Reclustering EMM states
smooth_transitions: Smooths transition counts between neighboring states/clusters
update: Update a TRACDS temporal structure with new state assignments
transition_table: Extract a Transition Table for a New Sequence Given an EMM
score: Score a New Sequence Given an EMM
synthetic_stream: Create a Synthetic Data Stream
merge_clusters: Merge States of an EMM
tNN-class: Class "tNN"
plot: Visualize EMM Objects
predict: Predict a Future State
fade: Fading Cluster Structure and EMM Layer
transition: Access Transition Probabilities/Counts in an EMM
combine: Combining EMM Objects