clust_cp: Clustering time-dependent observations with common change points.

Description

The clust_cp function clusters observations that share common change points. Data can be time series or epidemic diffusions.

Usage

clust_cp(
  data,
  n_iterations,
  n_burnin = 0,
  params = list(),
  alpha_SM = 1,
  B = 1000,
  L = 1,
  q = 0.5,
  kernel,
  print_progress = TRUE,
  user_seed = 1234
)

Value

A ClustCpObj class object containing

$data Vector or matrix containing the data.
$n_iterations Total number of MCMC iterations.
$n_burnin Number of burn-in iterations.
$clust A matrix where each row corresponds to the cluster assignment from each iteration.
$orders A multidimensional array where each slice is a matrix representing the latent order at each iteration.
$time Total computational time (in seconds).
$entropy_MCMC A coda::mcmc object containing the MCMC samples of the entropy.
$lkl A coda::mcmc object containing the log-likelihood evaluated at each iteration.
$norm_vec A vector containing the normalization constants computed at the beginning of the algorithm.
$I0_MCMC A coda::mcmc object containing the MCMC trace of the initial infection proportion $I_0$.
$kernel_ts TRUE if the kernel used corresponds to time series data.
$kernel_epi TRUE if the kernel used corresponds to epidemic diffusion data.
$univariate_ts TRUE if the data represent a univariate time series, FALSE if multivariate.
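
As an illustration of how the $clust matrix might be post-processed, the sketch below drops burn-in rows and computes a posterior similarity matrix in base R. It uses a small stand-in matrix rather than a fitted object, and it assumes that rows index iterations, columns index observations, and burn-in iterations are still stored in the matrix; none of this is confirmed by the description above, so check the returned object before relying on it.

```r
# Stand-in for out$clust: 6 iterations (rows) x 4 observations (columns).
# With a fitted object this would be taken from out$clust instead.
clust_draws <- rbind(
  c(0, 0, 1, 1),
  c(0, 1, 1, 1),
  c(0, 0, 1, 1),
  c(0, 0, 1, 1),
  c(0, 0, 1, 2),
  c(0, 0, 1, 1)
)
n_burnin <- 2

# Keep only post-burn-in rows (assumes burn-in rows are stored in the matrix).
# A logical index is used so that n_burnin = 0 keeps every row.
keep <- seq_len(nrow(clust_draws)) > n_burnin
kept <- clust_draws[keep, , drop = FALSE]

# Posterior similarity matrix: share of kept iterations in which
# observations i and j are assigned to the same cluster.
n_obs <- ncol(kept)
psm <- matrix(0, n_obs, n_obs)
for (s in seq_len(nrow(kept))) {
  psm <- psm + outer(kept[s, ], kept[s, ], "==")
}
psm <- psm / nrow(kept)

print(psm)
```

The package may provide its own summary methods for ClustCpObj objects; this manual computation is only meant to show what the stored draws represent.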

Arguments

data

a matrix or an array. If a matrix, the algorithm for univariate time series is used, and each row is a time series. If an array, the algorithm for multivariate time series is used; each slice of the array is a matrix whose rows are the dimensions of the time series.
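
A minimal sketch of the two layouts, using simulated values. The names n_obs, n_times, n_dims, data_uni and data_multi are made up for illustration. Placing time along the columns and observations along the third array index is an assumption, inferred from the stock_multi[,,1:5] call in the examples.

```r
set.seed(1)
n_obs   <- 3   # number of observations (time series) to cluster
n_times <- 50  # number of time points
n_dims  <- 2   # dimension of each multivariate series

# Univariate case: a matrix with one time series per row.
data_uni <- matrix(rnorm(n_obs * n_times), nrow = n_obs, ncol = n_times)

# Multivariate case: a 3-d array; slice [, , i] is the
# (dimensions x time points) matrix of observation i.
data_multi <- array(rnorm(n_dims * n_times * n_obs),
                    dim = c(n_dims, n_times, n_obs))

dim(data_uni)
dim(data_multi)
```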

n_iterations

number of MCMC iterations.

n_burnin

number of iterations that must be excluded when computing the posterior estimate.

params

a list of parameters:

If the time series is univariate the following must be specified:

a, b, c parameters of the integrated likelihood.
phi correlation parameter in the likelihood.

If the time series is multivariate the following must be specified:

k_0, nu_0, S_0, m_0 parameters of the integrated likelihood.
phi correlation parameter in the likelihood.

If data are epidemic diffusions:

M number of Monte Carlo iterations when computing the likelihood of the epidemic diffusion.
xi recovery rate, a fixed constant for each population at each time.
a0, b0 parameters for the computation of the integrated likelihood of the epidemic diffusions.
I0_var variance for the Metropolis-Hastings estimation of the proportion of infected at time 0.
avg_blk prior average number of change points for each order.
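
Because the expected entries of params depend on the kind of data, it can help to check a list before starting a long run. The helpers below, required_params and missing_params, are hypothetical and not part of the package. They encode the names listed above and treat the epidemic entries as required, which is an assumption.

```r
# Hypothetical helper (not part of the package): the params entries
# listed in this section for each kind of data.
required_params <- function(kernel, univariate = TRUE) {
  if (kernel == "epi") {
    c("M", "xi", "a0", "b0", "I0_var", "avg_blk")
  } else if (kernel == "ts" && univariate) {
    c("a", "b", "c", "phi")
  } else if (kernel == "ts") {
    c("k_0", "nu_0", "S_0", "m_0", "phi")
  } else {
    stop("kernel must be \"ts\" or \"epi\"")
  }
}

# Names that are expected but absent from a params list.
missing_params <- function(params, kernel, univariate = TRUE) {
  setdiff(required_params(kernel, univariate), names(params))
}

params_uni <- list(a = 1, b = 1, c = 1, phi = 0.1)
missing_uni <- missing_params(params_uni, "ts")              # nothing missing
missing_epi <- missing_params(list(M = 100, xi = 1/8), "epi")

print(missing_uni)
print(missing_epi)
```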

alpha_SM

$\alpha$ for the split-merge main algorithm.

B

number of orders for the normalization constant.

L

number of split-merge steps for the proposal step.

q

probability of a split in the split-merge proposal and acceleration step.

kernel

can be "ts" if data are time series or "epi" if data are epidemic diffusions.

print_progress

if TRUE (default), print the progress bar.

user_seed

seed for random number generation.

References

Corradin, R., Danese, L., KhudaBukhsh, W. R., & Ongaro, A. (2026). Model-based clustering of time-dependent observations with common structural changes. Statistics and Computing. doi:10.1007/s11222-025-10756-x

Examples

# \donttest{
## Univariate time series

data("stock_uni")

params_uni <- list(a = 1,
                   b = 1,
                   c = 1,
                   phi = 0.1)

out <- clust_cp(data = stock_uni[1:5,], n_iterations = 2000, n_burnin = 500,
                L = 1, q = 0.5, B = 1000, params = params_uni, kernel = "ts")

print(out)

## Multivariate time series

data("stock_multi")

params_multi <- list(m_0 = rep(0,2),
                     k_0 = 1,
                     nu_0 = 10,
                     S_0 = diag(1,2,2),
                     phi = 0.1)

out <- clust_cp(data = stock_multi[,,1:5], n_iterations = 2000, n_burnin = 500,
                L = 1, B = 1000, params = params_multi, kernel = "ts")

print(out)

## Epidemic diffusions

data("epi_synthetic_multi")

params_epi <- list(M = 100, xi = 1/8,
                   alpha_SM = 1,
                   a0 = 4,
                   b0 = 10,
                   I0_var = 0.1,
                   avg_blk = 2)

out <- clust_cp(epi_synthetic_multi, n_iterations = 2000, n_burnin = 500,
                L = 1, B = 1000, params = params_epi, kernel = "epi")

print(out)

# }
