
funcharts (version 1.8.0)

simulate_data_fmrcc: Simulate Data for Functional Mixture Regression Control Chart (FMRCC)

Description

Generates synthetic in-control and out-of-control functional data for testing the Functional Mixture Regression Control Chart (FMRCC) framework. The function simulates a functional response Y influenced by a functional covariate X through a mixture of functional linear models (FLMs) with three distinct regression structures, as described in Section 3.1 of Capezza et al. (2025).

Usage

simulate_data_fmrcc(
  n_obs = 3000,
  mixing_prop = c(1/3, 1/3, 1/3),
  len_grid = 500,
  SNR = 4,
  shift_coef = c(0, 0, 0, 0),
  severity = 0,
  ncompx = 20,
  delta_1,
  delta_2,
  measurement_noise_sigma = 0,
  fun_noise = "normal",
  df = 3,
  alphasn = 4
)

Value

A list containing:

X

Matrix (len_grid \(\times\) n_obs) of functional covariate observations.

Y

Matrix (len_grid \(\times\) n_obs) of shifted functional response observations.

Eps_1, Eps_2, Eps_3

Matrices of functional error terms for each cluster.

beta_matrix_1, beta_matrix_2, beta_matrix_3

Matrices (len_grid \(\times\) len_grid) containing the bivariate regression coefficient functions \(\beta^X_k(s,t)\) for k=1,2,3.
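
Because observations are stored column-wise, curve i is column i of X or Y. The following is a minimal sketch of inspecting the returned list. It assumes an installed funcharts version that provides this function (1.8.0 is the version documented here) and does nothing otherwise. The object names `out` and `grid` are illustrative.

```r
# Inspect the returned list; skipped when a suitable funcharts is not installed.
has_fmrcc <- requireNamespace("funcharts", quietly = TRUE) &&
  utils::packageVersion("funcharts") >= "1.8.0"

if (has_fmrcc) {
  out <- funcharts::simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 0.5)

  str(out, max.level = 1)        # components listed in this section
  print(dim(out$X))              # documented as len_grid x n_obs
  print(dim(out$beta_matrix_1))  # documented as len_grid x len_grid

  # Plot the first ten response curves over the [0, 1] domain
  grid <- seq(0, 1, length.out = nrow(out$Y))
  matplot(grid, out$Y[, 1:10], type = "l", lty = 1,
          xlab = "t", ylab = "Y(t)")
}
```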

Arguments

n_obs

Integer. Total number of observations to generate. Default is 3000.

mixing_prop

Numeric vector of length 3. Mixing proportions for the three clusters (must sum to 1). Default is c(1/3, 1/3, 1/3).
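
How the proportions translate into cluster sizes is not spelled out on this page. One plausible reading, offered only as a sketch and not as the package's internal code, is that each observation is assigned to a cluster independently with these probabilities. The names `labels` and `shares` are illustrative.

```r
# Illustrative only: draw cluster labels with the given mixing proportions.
# This is an assumption about the mechanism, not funcharts' internal code.
set.seed(1)
n_obs <- 3000
mixing_prop <- c(1/3, 1/3, 1/3)

# The proportions must be non-negative and sum to 1
stopifnot(all(mixing_prop >= 0), isTRUE(all.equal(sum(mixing_prop), 1)))

labels <- sample(1:3, size = n_obs, replace = TRUE, prob = mixing_prop)

# Empirical cluster shares should be close to mixing_prop
shares <- tabulate(labels, nbins = 3) / n_obs
round(shares, 3)
```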

len_grid

Integer. Number of grid points for evaluating functional data on domain [0,1]. Default is 500.

SNR

Numeric. Signal-to-noise ratio controlling the variance of the error term. Default is 4.

shift_coef

Numeric vector of length 4 or character string. Controls the type and shape of the mean shift:

  • Numeric vector: Coefficients c(a3, a2, a1, a0) for polynomial shift: \(Shift(t) = severity \times (a_3 t^3 + a_2 t^2 + a_1 t + a_0)\)

  • 'low': Applies a "low" shift pattern based on resistance spot welding (RSW) dynamic resistance curves

  • 'high': Applies a "high" shift pattern based on RSW dynamic resistance curves

Default is c(0,0,0,0) (no shift).

severity

Numeric. Multiplier controlling the magnitude of the shift. Higher values produce larger shifts. This corresponds to the "Severity Level (SL)" in the simulation study. Default is 0 (no shift).

ncompx

Integer. Number of functional principal components used to generate the functional covariate X. Default is 20.

delta_1

Numeric in [0,1]. Controls dissimilarity between clusters in regression coefficient functions and functional intercepts: 0 corresponds to the single-cluster case and 1 to maximum dissimilarity. Required parameter with no default.

delta_2

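Numeric in [0,1]. Controls the relative contribution of the functional intercept versus the regression coefficient function (\(\Delta_2\) in the equation under Details): the intercept is weighted by \(1 - \Delta_2\) and the regression term by \(\Delta_2\), so delta_2 = 1 leaves clusters differing only in their regression coefficients. Required parameter with no default.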

measurement_noise_sigma

Numeric. Standard deviation of Gaussian measurement error added to both X and Y. Default is 0 (no measurement error).

fun_noise

Character. Distribution for functional error term. Options:

  • 'normal': Gaussian errors (default)

  • 't': Student's t-distribution errors with df degrees of freedom

  • 'skewnormal': Skew-normal distribution with skewness parameter alphasn
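
To give a feel for the three options, the sketch below draws scalar samples from each distribution in base R, using the default values of df and alphasn. It does not show how funcharts builds functional errors from such draws, which this page does not document. The skew-normal draw uses the standard stochastic representation rather than any package routine, and all variable names are illustrative.

```r
# Illustrative scalar draws for the three error distributions.
# How funcharts turns such draws into functional errors is not shown here.
set.seed(1)
n <- 10000
df <- 3
alphasn <- 4

eps_normal <- rnorm(n)       # 'normal'
eps_t <- rt(n, df = df)      # 't'

# 'skewnormal': SN(0, 1, alpha) via the representation
# Z = delta * |U0| + sqrt(1 - delta^2) * U1, delta = alpha / sqrt(1 + alpha^2)
delta <- alphasn / sqrt(1 + alphasn^2)
eps_sn <- delta * abs(rnorm(n)) + sqrt(1 - delta^2) * rnorm(n)

# The skew-normal mean is delta * sqrt(2 / pi), so a positive alphasn skews right
c(sample_mean = mean(eps_sn), theory_mean = delta * sqrt(2 / pi))
```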

df

Numeric. Degrees of freedom for Student's t-distribution when fun_noise = 't'. Default is 3.

alphasn

Numeric. Skewness parameter for skew-normal distribution when fun_noise = 'skewnormal'. Default is 4.

Details

The data generation follows Equation (18) in the paper, where \(\Delta_2\) is the argument delta_2 and k indexes the cluster:

$$Y(t) = (1 - \Delta_2)\beta^0_k(t) + \int_S \Delta_2(\beta^X_k(s,t))^T X(s)\,ds + \varepsilon(t)$$

The three clusters are characterized by:

  • Different functional intercepts \(\beta^0_k(t)\) (inspired by dynamic resistance curves in RSW processes)

  • Different bivariate regression coefficient functions \(\beta^X_k(s,t)\)

  • Functional errors with variance adjusted to achieve the specified SNR
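
The model above can be sketched on a discrete grid for a single cluster, with the integral replaced by a Riemann sum. Everything below is a toy stand-in: the intercept, the coefficient surface, and the covariate curves are made up, and the SNR is assumed to be the ratio of signal variance to noise variance, which may differ from the definition used in the package.

```r
# Toy discretisation of the model for one cluster k (all shapes are made up).
set.seed(1)
len_grid <- 50
n_obs <- 20
delta_2 <- 0.5
SNR <- 4
grid <- seq(0, 1, length.out = len_grid)
ds <- 1 / (len_grid - 1)  # grid spacing, used as the Riemann-sum weight

beta0 <- sin(2 * pi * grid)                             # hypothetical beta^0_k(t)
beta_x <- outer(grid, grid, function(s, t) s * t)       # hypothetical beta^X_k(s, t), rows = s
X <- matrix(rnorm(len_grid * n_obs), len_grid, n_obs)   # rough placeholder curves, one per column

# Integral over s of beta(s, t) X_i(s) ds, approximated by a Riemann sum:
# t(beta_x) %*% X has rows indexed by t and columns by observation i
integral_term <- (t(beta_x) %*% X) * ds

# beta0 is recycled down each column, i.e. added to every observation
signal <- (1 - delta_2) * beta0 + delta_2 * integral_term

# Assumption: SNR = var(signal) / var(noise); the package may define it differently
noise_sd <- sqrt(var(as.vector(signal)) / SNR)
eps <- matrix(rnorm(len_grid * n_obs, sd = noise_sd), len_grid, n_obs)

Y <- signal + eps
dim(Y)  # len_grid x n_obs
```

Setting delta_2 to 1 in this sketch removes the intercept term entirely, which matches the description of delta_2 as the balance between intercept and regression term.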

Moreover, when severity != 0, the function applies a controlled shift to the functional response Y to simulate out-of-control conditions. The shift types include:

Polynomial shifts: When shift_coef is numeric, a polynomial of degree at most 3 is applied: \(Shift(t) = severity \times (a_3 t^3 + a_2 t^2 + a_1 t + a_0)\)

Linear shift example: shift_coef = c(0, 0, 1, 0) produces a linear shift

Quadratic shift example: shift_coef = c(0, 1, 0, 0) produces a quadratic shift
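
The polynomial shift formula can be evaluated directly on the [0,1] grid. In the sketch below, `poly_shift()` is a hypothetical helper written for illustration and is not part of funcharts. It only reproduces the formula with the coefficient order c(a3, a2, a1, a0) documented for shift_coef.

```r
# Evaluate Shift(t) = severity * (a3 t^3 + a2 t^2 + a1 t + a0) on a grid.
# poly_shift() is a hypothetical helper, not a funcharts function.
poly_shift <- function(shift_coef, severity, len_grid = 500) {
  stopifnot(is.numeric(shift_coef), length(shift_coef) == 4)
  t_grid <- seq(0, 1, length.out = len_grid)
  a3 <- shift_coef[1]; a2 <- shift_coef[2]
  a1 <- shift_coef[3]; a0 <- shift_coef[4]
  severity * (a3 * t_grid^3 + a2 * t_grid^2 + a1 * t_grid + a0)
}

linear <- poly_shift(c(0, 0, 1, 0), severity = 2)     # 2 * t
quadratic <- poly_shift(c(0, 1, 0, 0), severity = 3)  # 3 * t^2

range(linear)     # 0 to 2
range(quadratic)  # 0 to 3
```

With severity = 0 the helper returns all zeros for any coefficients, consistent with the "no shift" default.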

RSW-specific shifts: When shift_coef = 'low' or 'high', the function applies shifts based on modifications to the dynamic resistance curve (DRC) parameters, simulating realistic fault patterns in resistance spot welding (RSW) processes.

The functional covariate X is generated using functional principal component analysis with standardized magnitudes (scaled by 1/5).

References

Capezza, C., Centofanti, F., Forcina, D., Lepore, A., and Palumbo, B. (2025). Functional Mixture Regression Control Chart. Annals of Applied Statistics.

Examples

# \donttest{
# Generate in-control data with three equally-sized clusters, maximum dissimilarity
data <- simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 0.5, severity = 0)

# In-control single cluster case (delta_1 = 0)
data_single <- simulate_data_fmrcc(n_obs = 300, delta_1 = 0, delta_2 = 0.5, severity = 0)

# In-control clusters differing only in regression coefficients
data_beta_only <- simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 1, severity = 0)

# Add measurement noise and use t-distributed errors
data_t_noise <- simulate_data_fmrcc(n_obs = 300, delta_1 = 1, delta_2 = 0.5, severity = 0,
                                    measurement_noise_sigma = 0.01,
                                    fun_noise = 't', df = 5)

# Generate out-of-control data with linear shift
data_oc <- simulate_data_fmrcc(n_obs = 300,
                               shift_coef = c(0, 0, 1, 0),
                               severity = 2,
                               delta_1 = 1,
                               delta_2 = 0.5)

# Generate OC data with quadratic shift
data_quad <- simulate_data_fmrcc(n_obs = 300,
                                 shift_coef = c(0, 1, 0, 0),
                                 severity = 3,
                                 delta_1 = 1,
                                 delta_2 = 0.5)

# Generate OC data with RSW-specific "low" shift pattern
data_rsw_low <- simulate_data_fmrcc(n_obs = 300,
                                    shift_coef = 'low',
                                    severity = 1.5,
                                    delta_1 = 1,
                                    delta_2 = 0.5)

# Generate OC data with RSW-specific "high" shift pattern
data_rsw_high <- simulate_data_fmrcc(n_obs = 300,
                                     shift_coef = 'high',
                                     severity = 2,
                                     delta_1 = 0.66,
                                     delta_2 = 0.5)
# }
