A list containing a second simulated dataset (df) and its ground-truth
parameters (theta0). This dataset is generated from a **piecewise linear
model**, where the continuous predictor x is segmented into 6 bins, and
different intercept and slope coefficients are applied to each segment.
The dataset df contains $N = 3000$ observations.
piecewise_data
A list with 2 components:
df: A data frame with 3,000 rows and 2 variables (the simulated data).
theta0: A list of 5 elements containing the true parameters used for simulation.
The data frame df contains:
x: A continuous predictor, uniformly distributed between -3 and 3.
The simulated response variable, defined by the piecewise linear model.
The list theta0 holds the true values used for simulation, including:
beta: True global intercept (0.5).
Lat: The categorical factor (1 to 6) derived from segmenting x.
alphaLat: Vector of $2 \times 6 = 12$ coefficients defining the intercept and the slope for x within each of the 6 segments.
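As a rough illustration of how x might be segmented and why alphaLat has 12 entries, the following R sketch bins a uniform predictor into six categories and builds a design matrix with one intercept column and one slope column per segment. The equal-width breakpoints, the seed, and the names breaks and X_Lat are assumptions made for this example; the cut points and coefficient ordering actually used for theta0 are not stated here.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
N <- 3000
x <- runif(N, min = -3, max = 3)   # predictor as described above

# Assumption: six equal-width bins on [-3, 3]; the real cut points may differ.
breaks <- seq(-3, 3, length.out = 7)
Lat <- cut(x, breaks = breaks, labels = 1:6, include.lowest = TRUE)

# One indicator (intercept) column and one slope column per segment:
# 6 + 6 = 12 columns, matching the length of alphaLat.
X_Lat <- model.matrix(~ 0 + Lat + Lat:x)

dim(X_Lat)        # number of rows and columns of the design matrix
colnames(X_Lat)   # segment intercept columns first, then segment slopes
table(Lat)        # number of observations falling in each segment
```

Under this coding, each row of X_Lat has exactly two non-zero entries: a 1 in the indicator column of its segment and the value of x in the matching slope column.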
The underlying model for the response \(\bold{Y}\) is:
$$\bold{Y} = \bold{X}_{Fe}\bold{\beta} + \bold{X}_{Lat}\bold{\alpha}_{Lat} + \bold{\epsilon}$$
where \(\bold{X}_{Fe}\bold{\beta}\) is the global intercept term, and \(\bold{X}_{Lat}\bold{\alpha}_{Lat}\) models the piecewise relationship of x across the 6 categories defined in theta0$Lat. The error term is \(\bold{\epsilon} \sim N(0, 1)\).
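The model above can be sketched in R by simulating a comparable dataset and refitting it. This is not the code that produced piecewise_data: the segment intercepts a, the slopes b, the equal-width bins, and the names sim and fit are all hypothetical values chosen for illustration. Only the global intercept of 0.5, the uniform range of x, the sample size, and the N(0, 1) errors are taken from the description.

```r
set.seed(42)                       # arbitrary seed, for reproducibility only
N <- 3000
x <- runif(N, min = -3, max = 3)

# Assumption: six equal-width segments on [-3, 3].
Lat <- cut(x, breaks = seq(-3, 3, length.out = 7),
           labels = 1:6, include.lowest = TRUE)

beta <- 0.5                        # global intercept, as documented
a <- c(-1, 0.5, 0, 1, -0.5, 2)     # hypothetical segment intercepts
b <- c(1, -1, 2, 0.5, -2, 1)       # hypothetical segment slopes

k <- as.integer(Lat)               # segment index of each observation
y <- beta + a[k] + b[k] * x + rnorm(N, mean = 0, sd = 1)

# Lat is added here only for fitting; the documented df has 2 variables.
sim <- data.frame(x = x, y = y, Lat = Lat)

# With a full set of six segment indicators, a separate global intercept
# is not identifiable, so it is dropped from the fit. The six estimated
# intercepts then target beta + a, and the six slopes target b.
fit <- lm(y ~ 0 + Lat + Lat:x, data = sim)

round(coef(fit), 2)
sigma(fit)                         # residual standard deviation
```

Comparing coef(fit) with beta + a and b gives a quick check that the simulated structure is recoverable; the same comparison could be made against the entries of theta0 once their ordering is known.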