A simulated dataset of 500 observations based on Simulation Study I (Model 1, Case 2) of Riddles, Kim, and Im (2016). The data follow a nonignorable nonresponse mechanism (not missing at random, NMAR): the response probability depends on the study variable `y` itself, including the values that end up unobserved.
riddles_case2
A data frame with 500 rows and 4 variables:
Numeric. The auxiliary variable, x ~ Normal(mean = 0, variance = 0.5).
Numeric. The study variable with nonignorable nonresponse. `y` contains `NA`s for nonrespondents.
Numeric. The complete, true value of `y` before missingness was introduced.
Integer. The response indicator (1 = respondent, 0 = nonrespondent).
This dataset was generated using the following model parameters (n = 500):
x ~ Normal(mean = 0, variance = 0.5)
e ~ Normal(mean = 0, variance = 0.9)
y_true = -2 + 0.5 * exp(x) + e
logit(pi) = 0.8 - 0.2 * y_true, where pi is the probability of response
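The generating model above can be sketched in a few lines of code. The following is a minimal illustration in Python using only the standard library, not the code that produced the shipped dataset: the function name `simulate_riddles_case2`, the column name `responded`, and the seed are all assumptions made for this example, and `None` stands in for `NA`. Because the parameters above are variances, they are converted to standard deviations before sampling.

```python
import math
import random


def simulate_riddles_case2(n=500, seed=0):
    """Draw n observations from the model described above.

    Hypothetical helper: the seed and the 'responded' column name are
    assumptions, so the output will not match the shipped dataset.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # random.gauss takes a standard deviation, so take sqrt(variance).
        x = rng.gauss(0.0, math.sqrt(0.5))
        e = rng.gauss(0.0, math.sqrt(0.9))
        y_true = -2.0 + 0.5 * math.exp(x) + e

        # Inverse logit of 0.8 - 0.2 * y_true gives the response probability.
        pi = 1.0 / (1.0 + math.exp(-(0.8 - 0.2 * y_true)))
        responded = 1 if rng.random() < pi else 0

        # y is observed only for respondents; None plays the role of NA.
        y = y_true if responded == 1 else None
        rows.append({"x": x, "y": y, "y_true": y_true, "responded": responded})
    return rows


rows = simulate_riddles_case2()
response_rate = sum(r["responded"] for r in rows) / len(rows)
true_mean = sum(r["y_true"] for r in rows) / len(rows)
observed = [r["y"] for r in rows if r["y"] is not None]
respondent_mean = sum(observed) / len(observed)
print(f"response rate:   {response_rate:.3f}")
print(f"mean of y_true:  {true_mean:.3f}")
print(f"respondent mean: {respondent_mean:.3f}")
```

Since the coefficient on `y_true` in the response model is negative, units with smaller `y_true` are more likely to respond, so in expectation the respondent mean sits below the full-sample mean. In a single sample of 500 the gap is small relative to sampling noise, so one run of this sketch may not show it clearly.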