
BayesChange (version 2.3.0)

clust_cp: Clustering time-dependent observations with common change points.

Description

The clust_cp function clusters observations that share common change points. Data can be time series or epidemic diffusions.

Usage

clust_cp(
  data,
  n_iterations,
  n_burnin = 0,
  params = list(),
  alpha_SM = 1,
  B = 1000,
  L = 1,
  q = 0.5,
  kernel,
  print_progress = TRUE,
  user_seed = 1234
)

Value

A ClustCpObj class object containing

  • $data Vector or matrix containing the data.

  • $n_iterations Total number of MCMC iterations.

  • $n_burnin Number of burn-in iterations.

  • $clust A matrix where each row contains the cluster assignments at one iteration.

  • $orders A multidimensional array where each slice is a matrix representing the latent order at each iteration.

  • $time Total computational time (in seconds).

  • $entropy_MCMC A coda::mcmc object containing the MCMC samples of the entropy.

  • $lkl A coda::mcmc object containing the log-likelihood evaluated at each iteration.

  • $norm_vec A vector containing the normalization constants computed at the beginning of the algorithm.

  • $I0_MCMC A coda::mcmc object containing the MCMC trace of the initial infection proportion \(I_0\).

  • $kernel_ts TRUE if the kernel used corresponds to time series data.

  • $kernel_epi TRUE if the kernel used corresponds to epidemic diffusion data.

  • $univariate_ts TRUE if the data represent a univariate time series, FALSE if multivariate.
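
One common way to summarise a matrix of cluster labels such as $clust is a posterior co-clustering matrix: for each pair of observations, the share of retained iterations in which they fall in the same cluster. The sketch below is only an illustration in base R. It uses a small made-up matrix, clust_draws, in place of out$clust, and it assumes that rows are iterations, columns are observations, and burn-in rows are still present. Check those assumptions against your own output before reusing it.

```r
# Made-up cluster labels standing in for out$clust:
# 6 iterations (rows) by 4 observations (columns)
clust_draws <- rbind(
  c(0, 0, 0, 0),
  c(0, 0, 1, 1),
  c(0, 0, 1, 1),
  c(0, 1, 1, 1),
  c(0, 0, 1, 1),
  c(0, 0, 1, 2)
)
n_burnin <- 2  # placeholder burn-in length

# Keep only the rows after the burn-in (also works when n_burnin is 0)
kept <- clust_draws[seq_len(nrow(clust_draws)) > n_burnin, , drop = FALSE]

# psm[i, j] = share of retained iterations in which i and j share a label
n_obs <- ncol(kept)
psm <- matrix(0, n_obs, n_obs)
for (it in seq_len(nrow(kept))) {
  psm <- psm + outer(kept[it, ], kept[it, ], "==")
}
psm <- psm / nrow(kept)

round(psm, 2)
```

Entries near 1 indicate pairs that are almost always clustered together; entries near 0 indicate pairs that are almost never clustered together.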

Arguments

data

a matrix or an array. If a matrix, the algorithm for univariate time series is used, and each row is a time series. If an array, the algorithm for multivariate time series is used, and each slice of the array is a matrix whose rows are the dimensions of the time series.
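
As a rough illustration of the two accepted layouts, the base R sketch below builds a matrix of univariate series and a three-dimensional array of multivariate series. The simulated values, the object names (y_uni, y_multi) and the sizes are placeholders. The array layout (dimension by time by observation) is inferred from the description above and from the stock_multi[,,1:5] call in the Examples, so treat it as an assumption.

```r
set.seed(1)

n_obs  <- 5   # number of time series to cluster (placeholder)
n_time <- 50  # length of each series (placeholder)
n_dim  <- 2   # dimension of each multivariate series (placeholder)

# Univariate case: one row per time series, one column per time point
y_uni <- matrix(rnorm(n_obs * n_time), nrow = n_obs, ncol = n_time)

# Multivariate case (assumed layout): slice k is an n_dim x n_time matrix
# holding the k-th series, with one row per dimension
y_multi <- array(rnorm(n_dim * n_time * n_obs), dim = c(n_dim, n_time, n_obs))

dim(y_uni)           # rows = series, columns = time points
dim(y_multi[, , 1])  # one slice: rows = dimensions, columns = time points
```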

n_iterations

number of MCMC iterations.

n_burnin

number of iterations that must be excluded when computing the posterior estimate.

params

a list of parameters:

If the time series is univariate the following must be specified:

  • a,b,c parameters of the integrated likelihood.

  • phi correlation parameter in the likelihood.

If the time series is multivariate the following must be specified:

  • k_0, nu_0, S_0, m_0 parameters of the integrated likelihood.

  • phi correlation parameter in the likelihood.

If data are epidemic diffusions:

  • M number of Monte Carlo iterations when computing the likelihood of the epidemic diffusion.

  • xi recovery rate, held constant across populations and over time.

  • a0, b0 parameters for the computation of the integrated likelihood of the epidemic diffusions.

  • I0_var variance for the Metropolis-Hastings estimation of the proportion of infected at time 0.

  • avg_blk prior average number of change points for each order.
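
Because the expected entries of params differ by data type, it can help to check a list before starting a long run. The helper below, missing_params, is hypothetical and not part of BayesChange; it only compares the names of a list against the entries listed above. The documentation states that the univariate and multivariate entries must be specified but does not say so for the epidemic entries, so treating all of them as required is an assumption.

```r
# Hypothetical helper (not part of BayesChange): report which documented
# entries are missing from a params list for a given kind of data
missing_params <- function(params, type = c("uni", "multi", "epi")) {
  type <- match.arg(type)
  required <- switch(type,
    uni   = c("a", "b", "c", "phi"),
    multi = c("k_0", "nu_0", "S_0", "m_0", "phi"),
    epi   = c("M", "xi", "a0", "b0", "I0_var", "avg_blk")
  )
  # Names that are documented for this data type but absent from the list
  setdiff(required, names(params))
}

missing_params(list(a = 1, b = 1, c = 1, phi = 0.1), "uni")  # nothing missing
missing_params(list(k_0 = 1, nu_0 = 10), "multi")            # S_0, m_0, phi
```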

alpha_SM

\(\alpha\) for the split-merge main algorithm.

B

number of orders for the normalization constant.

L

number of split-merge steps for the proposal step.

q

probability of a split in the split-merge proposal and acceleration step.

kernel

can be "ts" if data are time series or "epi" if data are epidemic diffusions.

print_progress

If TRUE (default), prints a progress bar.

user_seed

seed for random number generation.

References

Corradin, R., Danese, L., KhudaBukhsh, W. R., & Ongaro, A. (2026). Model-based clustering of time-dependent observations with common structural changes. Statistics and Computing. doi:10.1007/s11222-025-10756-x

Examples


# \donttest{
## Univariate time series

data("stock_uni")

params_uni <- list(a = 1,
                   b = 1,
                   c = 1,
                   phi = 0.1)

out <- clust_cp(data = stock_uni[1:5,], n_iterations = 2000, n_burnin = 500,
                L = 1, q = 0.5, B = 1000, params = params_uni, kernel = "ts")

print(out)

## Multivariate time series

data("stock_multi")

params_multi <- list(m_0 = rep(0,2),
                     k_0 = 1,
                     nu_0 = 10,
                     S_0 = diag(1,2,2),
                     phi = 0.1)

out <- clust_cp(data = stock_multi[,,1:5], n_iterations = 2000, n_burnin = 500,
                L = 1, B = 1000, params = params_multi, kernel = "ts")

print(out)

## Epidemic diffusions

data("epi_synthetic_multi")

params_epi <- list(M = 100, xi = 1/8,
                   alpha_SM = 1,
                   a0 = 4,
                   b0 = 10,
                   I0_var = 0.1,
                   avg_blk = 2)

out <- clust_cp(epi_synthetic_multi, n_iterations = 2000, n_burnin = 500,
                L = 1, B = 1000, params = params_epi, kernel = "epi")

print(out)

# }
