A simulated dataset for longitudinal data analysis.
data("longDat")
A data frame with 540 observations on the following 4 variables.
sid
subject id
time
time points. A factor with levels time1, time2, time3, time4, time5, time6
y
numeric. outcome variable
grp
subject group. A factor with levels grp1, grp2, grp3
The dataset is generated from the following mixed effects model for repeated measures: $$y_{ij}=\beta_{0i}+\beta_1 t_{j} + \beta_2 grp_{2i} + \beta_3 grp_{3i} + \beta_4 \times\left(t_{j}\times grp_{2i}\right) + \beta_5 \times\left(t_{j}\times grp_{3i}\right) +\epsilon_{ij}, $$ where \(y_{ij}\) is the outcome value for the \(i\)-th subject measured at the \(j\)-th time point \(t_{j}\), \(grp_{2i}\) is a dummy variable indicating whether the \(i\)-th subject is from group 2, \(grp_{3i}\) is a dummy variable indicating whether the \(i\)-th subject is from group 3, \(\beta_{0i}\sim N\left(\beta_0, \sigma_b^2\right)\) is the subject-specific random intercept, \(\epsilon_{ij}\sim N\left(0, \sigma_e^2\right)\) is the random error, \(i=1,\ldots, n\), \(j=1, \ldots, m\), \(n\) is the number of subjects, and \(m\) is the number of time points.
When \(t_j=0\), the expected outcome value is $$ E\left(y_{ij}\right)=\beta_0+\beta_2 grp_{2i} + \beta_3 grp_{3i}.$$
Hence, at baseline we have $$ E\left(y_{ij}\right)=\beta_0,\; \mbox{for group 1},$$
$$ E\left(y_{ij}\right)=\beta_0 + \beta_2,\; \mbox{for group 2},$$
$$ E\left(y_{ij}\right)=\beta_0 + \beta_3,\; \mbox{for group 3}.$$
For group 1, the expected outcome value across time is $$ E\left(y_{ij}\right)=\beta_0+\beta_1 t_{j}.$$
We can also obtain the expected differences in outcome values between group 2 and group 1, between group 3 and group 1, and between group 3 and group 2: $$ E\left(y_{ij} - y_{i'j}\right) =\beta_2+\beta_4 t_{j},\;\mbox{for subject $i$ in group 2 and subject $i'$ in group 1},$$
$$ E\left(y_{kj} - y_{i'j}\right) =\beta_3+\beta_5 t_{j},\;\mbox{for subject $k$ in group 3 and subject $i'$ in group 1},$$
$$ E\left(y_{kj} - y_{ij}\right) =\left(\beta_3-\beta_2\right)+\left(\beta_5-\beta_4\right) t_{j},\;\mbox{for subject $k$ in group 3 and subject $i$ in group 2}.$$
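The three contrasts above can be checked numerically with a small base R sketch. The helper `expected_mean` and the coefficient names `b0` to `b5` are hypothetical, introduced only for illustration; the coefficient values are the ones this page reports for the simulation, and the time points are taken to be 1 to 6.

```r
# Hypothetical helper: expected outcome under the fixed-effects part of the
# model (the random intercept is replaced by its mean, beta_0).
expected_mean <- function(t, grp, beta) {
  g2 <- as.numeric(grp == "grp2")  # dummy variable for group 2
  g3 <- as.numeric(grp == "grp3")  # dummy variable for group 3
  beta[["b0"]] + beta[["b1"]] * t +
    beta[["b2"]] * g2 + beta[["b3"]] * g3 +
    beta[["b4"]] * t * g2 + beta[["b5"]] * t * g3
}

# Coefficient values stated on this page for the simulated dataset
beta <- c(b0 = 5, b1 = 0, b2 = 0, b3 = 0, b4 = 2, b5 = -2)
t <- 1:6

# Expected between-group differences at each time point
diff21 <- expected_mean(t, "grp2", beta) - expected_mean(t, "grp1", beta)
diff31 <- expected_mean(t, "grp3", beta) - expected_mean(t, "grp1", beta)
diff32 <- expected_mean(t, "grp3", beta) - expected_mean(t, "grp2", beta)

print(rbind(diff21, diff31, diff32))
```

Each row should agree with the corresponding closed-form expression, for example `diff32` with \(\left(\beta_3-\beta_2\right)+\left(\beta_5-\beta_4\right) t_{j}\).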
We set \(n=90\), \(m=6\), \(\beta_0=5\), \(\beta_1=0\), \(\beta_2=0\), \(\beta_3=0\), \(\beta_4=2\), \(\beta_5=-2\), \(\sigma_e=1\), \(\sigma_b=0.5\), and \(t_{j}=j, j=1, \ldots, m\).
That is, the trajectories for group 1 are horizontal with mean intercept \(5\), the trajectories for group 2 are linearly increasing with slope \(2\) and mean intercept \(5\), and the trajectories for group 3 are linearly decreasing with slope \(-2\) and mean intercept \(5\).
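As a rough illustration of the data-generating model, the following base R sketch simulates a dataset with the same layout as longDat. It is not the code used to create the shipped dataset: the seed is arbitrary, the name `simDat` is hypothetical, and the split of 30 subjects per group is an assumption, since this page does not state the group sizes.

```r
set.seed(1)  # arbitrary seed; the seed behind longDat is not documented

n <- 90        # number of subjects
m <- 6         # number of time points
beta <- c(5, 0, 0, 0, 2, -2)  # beta_0, ..., beta_5 as stated on this page
sigma_e <- 1   # SD of the random error
sigma_b <- 0.5 # SD of the random intercept

# One row per subject and time point
sid <- rep(seq_len(n), each = m)
tj  <- rep(seq_len(m), times = n)

# Assumption: equal group sizes (30 subjects per group)
grp_subject <- rep(c("grp1", "grp2", "grp3"), each = n / 3)
grp <- grp_subject[sid]
g2 <- as.numeric(grp == "grp2")
g3 <- as.numeric(grp == "grp3")

# Subject-specific random intercepts and measurement errors
b0i <- rnorm(n, mean = beta[1], sd = sigma_b)
eps <- rnorm(n * m, mean = 0, sd = sigma_e)

y <- b0i[sid] + beta[2] * tj + beta[3] * g2 + beta[4] * g3 +
  beta[5] * tj * g2 + beta[6] * tj * g3 + eps

simDat <- data.frame(
  sid  = sid,
  time = factor(paste0("time", tj), levels = paste0("time", seq_len(m))),
  y    = y,
  grp  = factor(grp)
)

print(dim(simDat))
print(table(simDat$time, simDat$grp))
```

Averaging `y` by `time` and `grp` in `simDat` should give values close to the flat, increasing, and decreasing trajectories described above, up to simulation noise.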
# NOT RUN {
data(longDat)
print(dim(longDat))
print(longDat[1:3,])
print(table(longDat$time, useNA = "ifany"))
print(table(longDat$grp, useNA = "ifany"))
print(table(longDat$sid, useNA = "ifany"))
print(table(longDat$time, longDat$grp))
# }