generate_data: Generate multivariate time series from the proposed model

Description

Simulates a multivariate time series from the proposed model, in which the mean component evolves as a random walk with abrupt shifts and is overlaid with a stationary VAR(1) process that captures temporal and cross-sectional correlation.

Specifically, at each time point $t = 1, \ldots, n$, the data are generated as $$\mathbf{y}_t = \boldsymbol{\mu}_t + \boldsymbol{\epsilon}_t,$$ where, for $t = 2, \ldots, n$, $$\boldsymbol{\mu}_t = \boldsymbol{\mu}_{t-1} + \boldsymbol{\eta}_t + \boldsymbol{\delta}_t,$$ and $$\boldsymbol{\epsilon}_t = \Phi \boldsymbol{\epsilon}_{t-1} + \boldsymbol{\nu}_t.$$

Here, $\boldsymbol{\eta}_t$ denotes the random walk innovation with covariance $\Sigma_{\boldsymbol{\eta}}$, and $\boldsymbol{\nu}_t$ is the VAR(1) innovation with covariance $\Sigma_{\boldsymbol{\nu}}$. The vector $\boldsymbol{\delta}_t$ is nonzero only at change points.
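The recursion above can be sketched in base R as follows. This is an illustration only, not the package's implementation: `simulate_sketch` and its inner helper `draw` are hypothetical names, only Gaussian innovations are covered, and the sketch assumes that $\boldsymbol{\mu}_1$ equals `mu0` and that the $k$-th jump is applied at the first observation of segment $k + 1$.

```r
# Illustrative sketch only; NOT the package's generate_data() implementation.
simulate_sketch <- function(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
                            number_cps, lengthofeachpart) {
  p <- length(mu0)
  n <- (number_cps + 1) * lengthofeachpart
  # Draw one N(0, Sigma) vector; an all-zero Sigma yields the zero vector.
  draw <- function(Sigma) {
    if (all(Sigma == 0)) return(rep(0, p))
    drop(t(chol(Sigma)) %*% rnorm(p))
  }
  # Assumption: jump k occurs at the first observation of segment k + 1.
  cps <- seq_len(number_cps) * lengthofeachpart + 1
  mu  <- mu0            # assumption: mu_1 = mu0
  eps <- draw(Sig_e1)   # initial VAR(1) state
  Y <- matrix(NA_real_, n, p)
  Y[1, ] <- mu + eps
  for (i in 2:n) {
    k <- match(i, cps)
    delta <- if (is.na(k)) rep(0, p) else deltas[[k]]
    mu  <- mu + draw(Sig_eta) + delta        # random walk plus jump
    eps <- drop(Phi %*% eps) + draw(Sig_nu)  # VAR(1) update
    Y[i, ] <- mu + eps
  }
  Y
}

set.seed(1)
Y_sketch <- simulate_sketch(mu0 = rep(0, 2), deltas = list(c(5, -5)),
                            Sig_eta = diag(0.01, 2), Sig_nu = diag(1, 2),
                            Phi = diag(0.3, 2), Sig_e1 = diag(1, 2),
                            number_cps = 1, lengthofeachpart = 50)
dim(Y_sketch)  # 100 rows (two segments of 50) and 2 columns
```

Keeping the mean and the VAR(1) state as two separate running vectors mirrors the two recursions in the model. The zero-matrix shortcut in `draw` is there only so that the degenerate case $\Sigma_{\boldsymbol{\eta}} = 0$ does not fail in `chol`.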

Usage

generate_data(
  mu0,
  deltas,
  Sig_eta,
  Sig_nu,
  Phi,
  Sig_e1,
  errortype,
  df = 10,
  number_cps,
  lengthofeachpart
)

Value

A numeric matrix of dimension $n \times p$, with $n = (number\_cps+1)\,lengthofeachpart$, containing the simulated observations $\{\mathbf{y}_t\}_{t=1}^n$.

Arguments

mu0: Numeric vector of length $p$. The initial mean vector $\boldsymbol{\mu}_0$.
deltas: A list of numeric vectors, each representing the jump magnitude $\boldsymbol{\delta}_t$ at a change point.
Sig_eta: Numeric $p \times p$ covariance matrix $\Sigma_{\boldsymbol{\eta}}$ of the random walk innovation.
Sig_nu: Numeric $p \times p$ covariance matrix $\Sigma_{\boldsymbol{\nu}}$ of the VAR(1) innovation.
Phi: Numeric $p \times p$ autoregressive coefficient matrix $\Phi$.
Sig_e1: Numeric $p \times p$ initial-state covariance matrix of the VAR(1) process.
errortype: Character; either "n" (Gaussian) or "t" (Student's t) specifying the distribution of the innovations.
df: Degrees of freedom for the t-distribution (used only when `errortype = "t"`). Default is 10.
number_cps: Integer; number of change points ($m$).
lengthofeachpart: Integer; number of observations between consecutive change points ($\tau_{k+1} - \tau_k$).

Details

The total length of the time series is given by $n = (number\_cps + 1) \times lengthofeachpart$, so that the specified change points partition the data into equally sized segments. When $\Sigma_{\boldsymbol{\eta}} = 0$, the model reduces to a piecewise constant mean process with no random walk component. When $\Phi = 0$, the process reduces to a random walk model without vector autoregressive dependence. If both $\Sigma_{\boldsymbol{\eta}} = 0$ and $\Phi = 0$, the model simplifies to the classical piecewise constant setting commonly used in multiple change point analysis. The two innovation components are generated independently.
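Written in the notation of the Description, these special cases are as follows. With $\Sigma_{\boldsymbol{\eta}} = 0$, the random walk innovation vanishes and $$\boldsymbol{\mu}_t = \boldsymbol{\mu}_{t-1} + \boldsymbol{\delta}_t.$$ With $\Phi = 0$, the noise has no serial dependence, so $$\boldsymbol{\epsilon}_t = \boldsymbol{\nu}_t, \qquad \mathbf{y}_t = \boldsymbol{\mu}_t + \boldsymbol{\nu}_t.$$ With both, $$\mathbf{y}_t = \boldsymbol{\mu}_t + \boldsymbol{\nu}_t, \qquad \boldsymbol{\mu}_t = \boldsymbol{\mu}_{t-1} + \boldsymbol{\delta}_t,$$ so the mean is constant between change points. As a numerical check of the length formula, `number_cps = 2` and `lengthofeachpart = 100` give $n = (2 + 1) \times 100 = 300$.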

The innovations $\boldsymbol{\eta}_t$ and $\boldsymbol{\nu}_t$ are drawn either from a multivariate normal distribution (when `errortype = "n"`) using `mvrnorm`, or from a multivariate Student's t distribution (when `errortype = "t"`) using `rmvt`.
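One point to keep in mind with Student's t innovations: in the usual construction, a multivariate t vector with scale matrix $\Sigma$ and `df` degrees of freedom has covariance $\frac{df}{df - 2}\,\Sigma$ for $df > 2$, not $\Sigma$ itself. The base-R sketch below illustrates this with a hypothetical helper, `rmvt_sketch`, which is not part of the package. This page does not say whether `generate_data` rescales the supplied matrices to compensate, so check before interpreting `Sig_eta` and `Sig_nu` as exact covariances in the t case.

```r
# Hypothetical helper (not from the package): n draws from a p-variate
# Student's t with scale matrix Sigma, built as normal / sqrt(chi-square / df).
rmvt_sketch <- function(n, Sigma, df) {
  p <- nrow(Sigma)
  Z <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)  # rows are N(0, Sigma)
  w <- sqrt(rchisq(n, df) / df)                    # one mixing weight per row
  Z / w                                            # row i is divided by w[i]
}

set.seed(42)
X_t <- rmvt_sketch(20000, diag(2), df = 10)
cov(X_t)  # diagonal entries should be near 10 / (10 - 2) = 1.25, not 1
```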

Examples

set.seed(123)
p <- 3
mu0 <- rep(0, p)
deltas <- list(c(3, 0, -3), c(-2, 4, 0))
Sig_eta <- diag(0.01, p)
Sig_nu  <- random_Signu(p, 0)
Phi <- random_Phi(p, p)
Sig_e1 <- get_Sig_e1_approx(Sig_nu, Phi)

Y <- generate_data(mu0, deltas, Sig_eta, Sig_nu, Phi, Sig_e1,
                   errortype = "n", number_cps = 2, lengthofeachpart = 100)
dim(Y)