Generates a random panel data set for simulation studies of the fused extended two-way fixed
effects (FETWFE) estimator by taking an object of class "FETWFE_coefs" (produced by genCoefs())
and using it to simulate data. The function creates a balanced panel with \(N\) units over \(T\)
time periods, assigns treatment status across \(R\) treated cohorts (with equal marginal
probabilities for treatment and non-treatment), and constructs a design matrix along with the
corresponding outcome. The covariates are generated according to the specified distribution: by
default, covariates are drawn from a normal distribution; if distribution = "uniform", they are
drawn uniformly from \([-\sqrt{3}, \sqrt{3}]\). When \(d = 0\) (i.e. no covariates), no
covariate-related columns or interactions are generated. See the simulation studies section of
Faletto (2025) for details.
simulateData(
  coefs_obj,
  N,
  sig_eps_sq,
  sig_eps_c_sq,
  distribution = "gaussian",
  guarantee_rank_condition = FALSE
)
An object of class "FETWFE_simulated", which is a list containing:
A data frame containing generated data that can be passed to fetwfe().
The design matrix \(X\), which has \(p\) columns, including interactions.
A numeric vector of length \(N \times T\) containing the generated responses.
A character vector containing the names of the generated features (if \(d > 0\)), or simply an empty vector (if \(d = 0\)).
The name of the time variable in pdata.
The name of the unit variable in pdata.
The name of the treatment variable in pdata.
The name of the response variable in pdata.
The coefficient vector \(\beta\) used for data generation.
A vector of indices indicating the first treatment effect for each treated cohort.
The number of never-treated units.
A vector of counts (of length \(R+1\)) indicating how many units fall into the never-treated group and each of the \(R\) treated cohorts.
Independent cohort assignments (for auxiliary purposes).
The number of columns in the design matrix \(X\).
Number of units.
Number of time periods.
Number of treated cohorts.
Number of covariates.
The idiosyncratic noise variance.
The unit-level noise variance.
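The two noise variances in the list above can be read as the components of a composite error term. The following is a minimal sketch of that reading, not the package's internal code: it assumes Gaussian, additive errors (the text above states only the variances), and the names panel, n_periods, unit_effect, and idio_noise are hypothetical.

```r
set.seed(42)
N <- 4            # number of units (kept small for illustration)
n_periods <- 3    # number of time periods, T in the text
sig_eps_sq <- 5   # idiosyncratic noise variance
sig_eps_c_sq <- 5 # unit-level noise variance

# Balanced panel index: every unit appears in every time period
panel <- expand.grid(time = seq_len(n_periods), unit = seq_len(N))

# Assumption: one Gaussian unit-level effect per unit, shared across its periods
unit_effect <- rnorm(N, mean = 0, sd = sqrt(sig_eps_c_sq))
# Assumption: one Gaussian idiosyncratic draw per unit-period observation
idio_noise <- rnorm(N * n_periods, mean = 0, sd = sqrt(sig_eps_sq))

# Composite error: unit-level effect plus idiosyncratic noise
panel$error <- unit_effect[panel$unit] + idio_noise

nrow(panel)  # N * n_periods rows, matching the length of the response vector
```

Under these assumptions, errors within the same unit are correlated through the shared unit-level effect, while errors across units are independent.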
coefs_obj: An object of class "FETWFE_coefs" containing the coefficient vector and simulation parameters.
N: Integer. Number of units in the panel.
sig_eps_sq: Numeric. Variance of the idiosyncratic (observation-level) noise.
sig_eps_c_sq: Numeric. Variance of the unit-level random effects.
distribution: Character. Distribution used to generate covariates. Defaults to "gaussian". If set to "uniform", covariates are drawn uniformly from \([-\sqrt{3}, \sqrt{3}]\).
guarantee_rank_condition: (Optional). Logical. If TRUE, the returned data set is guaranteed to have at least d + 1 units per cohort, which is necessary for the final design matrix to have full column rank. Default is FALSE, in which case no such condition is enforced.
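The d + 1 units-per-cohort condition can be checked directly on a vector of cohort counts. The sketch below is illustrative only: has_enough_units() is a hypothetical helper, not part of the package, the counts are made up, and it assumes the never-treated group is counted as a cohort for this purpose.

```r
# Hypothetical helper: TRUE if every cohort has at least d + 1 units
has_enough_units <- function(cohort_counts, d) {
  all(cohort_counts >= d + 1)
}

d <- 2
# Made-up counts: never-treated group first, then three treated cohorts
counts_ok <- c(10, 4, 3, 5)
counts_bad <- c(10, 4, 2, 5)

has_enough_units(counts_ok, d)   # TRUE: every count is at least 3
has_enough_units(counts_bad, d)  # FALSE: one cohort has only 2 units
```

With the default guarantee_rank_condition = FALSE, a check along these lines on the returned counts is one way to detect a draw that cannot yield a full-column-rank design matrix.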
This function extracts simulation parameters from the FETWFE_coefs object and passes them,
along with additional simulation parameters, to the internal function simulateDataCore().
It validates that all necessary components are returned and assigns the S3 class
"FETWFE_simulated" to the output.
The argument distribution controls the generation of covariates. For "gaussian", covariates
are drawn from rnorm; for "uniform", they are drawn from runif on the interval
\([-\sqrt{3}, \sqrt{3}]\) (which ensures that the covariates have unit variance regardless of
which distribution is chosen).
When \(d = 0\) (i.e. no covariates), the function omits any covariate-related columns and their interactions.
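The unit-variance claim follows from the variance of a uniform distribution on \([a, b]\), which is \((b - a)^2 / 12\); for \([-\sqrt{3}, \sqrt{3}]\) this gives \((2\sqrt{3})^2 / 12 = 1\). The sketch below checks this numerically. It is not the package's internal code: draw_covariates() is a hypothetical helper that mirrors the described behavior using rnorm and runif, including the \(d = 0\) case.

```r
# Hypothetical helper: draws an n x d covariate matrix from the chosen distribution
draw_covariates <- function(n, d, distribution = "gaussian") {
  if (d == 0) {
    # No covariates: return a matrix with n rows and zero columns
    return(matrix(numeric(0), nrow = n, ncol = 0))
  }
  draws <- switch(
    distribution,
    gaussian = rnorm(n * d),
    uniform = runif(n * d, min = -sqrt(3), max = sqrt(3)),
    stop("distribution must be \"gaussian\" or \"uniform\"")
  )
  matrix(draws, nrow = n, ncol = d)
}

set.seed(1)
x_gauss <- draw_covariates(1e5, 2, "gaussian")
x_unif <- draw_covariates(1e5, 2, "uniform")

# Theoretical variance of Uniform(a, b) is (b - a)^2 / 12
unif_var <- (2 * sqrt(3))^2 / 12

# Sample variances of each column; both sets should be close to 1
apply(x_gauss, 2, var)
apply(x_unif, 2, var)
```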
Faletto, G. (2025). Fused Extended Two-Way Fixed Effects for Difference-in-Differences with Staggered Adoptions. arXiv preprint arXiv:2312.05985. https://arxiv.org/abs/2312.05985
if (FALSE) {
  # Generate coefficients
  coefs <- genCoefs(R = 5, T = 30, d = 12, density = 0.1, eff_size = 2, seed = 123)

  # Simulate data using the coefficients
  sim_data <- simulateData(coefs, N = 120, sig_eps_sq = 5, sig_eps_c_sq = 5)
}