Learn R Programming

MEDseq R Package

Mixtures of Exponential-Distance Models

for Clustering Longitudinal Life-Course Sequences

with Gating Covariates and Sampling Weights

Written by Keefe Murphy

Description

Fits MEDseq models introduced by Murphy et al. (2021) <doi:10.1111/rssa.12712>, i.e. fits mixtures of exponential-distance models for clustering longitudinal/categorical life-course sequence data via the EM/CEM algorithm. A family of parsimonious precision parameter constraints are accommodated. So too are sampling weights. Gating covariates can be supplied via formula interfaces. Visualisation of the results of such models is also facilitated.

The most important function in the MEDseq package is: MEDseq_fit, for fitting the models via EM/CEM. This function requires the data to be in "stslist" format; the function seqdef is conveniently reexported from the TraMineR package for this purpose.

MEDseq_control allows supplying additional arguments which govern, among other things, controls on the initialisation of the allocations for the EM/CEM algorithm and the various model selection options. MEDseq_compare is provided for conducting model selection between different results from using different covariate combinations &/or initialisation strategies, etc. MEDseq_stderr is provided for computing the standard errors of the coefficients for the covariates in the gating network.

A dedicated plotting function exists for visualising various aspects of the results, using new methods as well as some existing methods adapted from the TraMineR package. Finally, the package also contains two data sets: biofam and mvad.

Installation

You can install the latest stable official release of the MEDseq package from CRAN:

install.packages("MEDseq")

or the development version from GitHub:

# If required install devtools:  
# install.packages('devtools')  
devtools::install_github('Keefe-Murphy/MEDseq')

In either case, you can then explore the package with:

library(MEDseq)  
help(MEDseq_fit) # Help on the main modelling function

For a more thorough intro, the vignette document is available as follows:

vignette("MEDseq", package="MEDseq")

However, if the package is installed from GitHub, the vignette is not automatically created. It can be accessed when installing from GitHub with the code:

devtools::install_github('Keefe-Murphy/MEDseq', build_vignettes = TRUE)

Alternatively, the vignette is available on the package's CRAN page.

References

Murphy, K., Murphy, T. B., Piccarreta, R., and Gormley, I. C. (2021). Clustering longitudinal life-course sequences using mixtures of exponential-distance models. Journal of the Royal Statistical Society: Series A (Statistics in Society), 184(4): 1414--1451. <doi:10.1111/rssa.12712>.

Copy Link

Version

Install

install.packages('MEDseq')

Monthly Downloads

342

Version

1.4.2

License

GPL (>= 3)

Maintainer

Keefe Murphy

Last Published

March 10th, 2025

Functions in MEDseq (1.4.2)

MEDseq_fit

MEDseq: Mixtures of Exponential-Distance Models with Covariates
MEDseq_clustnames

Automatic labelling of clusters using central sequences
MEDseq_meantime

Compute the mean time spent in each sequence category
MEDseq_news

Show the NEWS file
MEDseq_control

Set control values for use with MEDseq_fit
MEDseq-package

MEDseq: Mixtures of Exponential-Distance Models with Covariates
MEDseq_stderr

MEDseq gating network standard errors
MEDseq_entropy

Entropy of a fitted MEDseq model
MEDseq_compare

Choose the best MEDseq model
MEDseq_AvePP

Average posterior probabilities of a fitted MEDseq model
predict.MEDgating

Predictions from MEDseq gating networks
get_MEDseq_results

Extract results from a MEDseq model
dbs

Compute the Density-based Silhouette
reexports

Objects exported from other packages
mvad

MVAD: Transition from school to work
biofam

Family life states from the Swiss Household Panel biographical survey
dist_freqwH

Pairwise frequency-Weighted Hamming distance matrix for categorical data
wKModes

Weighted K-Modes Clustering with Tie-Breaking
plot.MEDseq

Plot MEDseq results