
⚠️ There's a newer version (0.8.1) of this package.

The simstudy package is a collection of functions that allow users to generate simulated data sets in order to explore modeling techniques or better understand data-generating processes. The user specifies a set of relationships between covariates and generates data based on these specifications. The final data sets can represent data from randomized controlled trials, repeated-measures (longitudinal) designs, and cluster-randomized trials. Missingness can be generated using various mechanisms (MCAR, MAR, NMAR).

Here is some simple sample code; the vignette has much more:

library(simstudy)
## Loading required package: data.table
def <- defData(varname="x", formula = 10, variance = 2)
def <- defData(def, varname="y", formula = "3 + 0.5 * x", variance = 1)
dt <- genData(250, def)

dt <- trtAssign(dt, nTrt = 4, grpName = "grp", balanced = TRUE)

dt
##       id grp         x        y
##   1:   1   3 10.393817 7.805703
##   2:   2   1 10.235161 5.705590
##   3:   3   1 11.517813 8.210183
##   4:   4   1 12.068125 8.618601
##   5:   5   1 10.078817 5.780655
##  ---                           
## 246: 246   4 11.419577 8.442363
## 247: 247   3 10.567231 9.808930
## 248: 248   1 10.451896 7.720858
## 249: 249   3  7.633381 6.861638
## 250: 250   2  9.347781 6.094965
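
One way to check that the generated data reflect the definitions is to tabulate the treatment groups and fit the regression that was specified for y. The following is a minimal sketch, not part of the package documentation: it repeats the sample above with an arbitrary seed (an assumption added for repeatability) and uses only data.table and base R for the checks. The variable names grp_counts and fit are illustrative.

```r
library(simstudy)
library(data.table)  # attached explicitly so the data.table syntax below works

set.seed(2016)  # arbitrary seed, an assumption added for repeatability

# Same definitions as in the sample: x ~ N(10, 2), y ~ N(3 + 0.5 * x, 1)
def <- defData(varname = "x", formula = 10, variance = 2)
def <- defData(def, varname = "y", formula = "3 + 0.5 * x", variance = 1)
dt <- genData(250, def)
dt <- trtAssign(dt, nTrt = 4, grpName = "grp", balanced = TRUE)

# Number of records assigned to each treatment group
grp_counts <- dt[, .N, keyby = grp]
print(grp_counts)

# The fitted intercept and slope should land near the specified 3 and 0.5
fit <- lm(y ~ x, data = dt)
print(coef(fit))
```

With only 250 records the estimates will not match the specification exactly; the intercept in particular is estimated with a fair amount of noise because x is centered far from zero.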


Install

install.packages('simstudy')

Monthly Downloads: 1,166

Version: 0.1.1

License: GPL-3

Maintainer: Keith Goldfeld

Last Published: June 21st, 2016

Functions in simstudy (0.1.1)

defReadAdd
Read external CSV data set definitions for adding columns

addPeriods
Create longitudinal/panel data

addColumns
Add columns to existing data set

defSurv
Add single row to survival definitions

defMiss
Add single row to definitions table for missing data

genCluster
Simulate a data set that is one level down in a multilevel data context. The level "2" data set must contain a field that specifies the number of individual records in a particular cluster.

addCorData
Add correlated data to existing data.table

defDataAdd
Add single row to definitions table that will be used to add data to an existing data.table

defData
Add single row to definitions table

defRead
Read external CSV data set definitions

genObs
Create an observed data set that includes missing data

genMiss
Generate missing data

trtAssign
Assign treatment

trtObserve
Observed exposure or treatment

mergeData
Merge two data tables

genSurv
Generate survival data

genCorData
Create correlated data

genData
Calling function to simulate data
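
The three missingness mechanisms named in the package description (MCAR, MAR, NMAR) differ in what the probability of a missing value depends on. The sketch below illustrates the distinction in base R only; it deliberately does not call defMiss, genMiss, or genObs, whose arguments are not shown on this page. The sample size, seed, and logistic coefficients are all hypothetical values chosen for illustration.

```r
set.seed(1)  # arbitrary seed, an assumption added for repeatability

n <- 10000
x <- rnorm(n, mean = 10, sd = sqrt(2))  # covariate, always observed
y <- 3 + 0.5 * x + rnorm(n)             # outcome that will have missing values

# MCAR: every y has the same 20% chance of being missing
miss_mcar <- rbinom(n, 1, 0.20) == 1

# MAR: missingness in y depends only on the observed covariate x
miss_mar <- rbinom(n, 1, plogis(-1.5 + 0.8 * (x - 10))) == 1

# NMAR: missingness in y depends on the value of y itself
miss_nmar <- rbinom(n, 1, plogis(-1.5 + 0.8 * (y - 8))) == 1

# Mean of y among the records where y is still observed
mean_observed <- function(miss) mean(y[!miss])

result <- c(
  full = mean(y),
  MCAR = mean_observed(miss_mcar),
  MAR  = mean_observed(miss_mar),
  NMAR = mean_observed(miss_nmar)
)
print(round(result, 2))
```

Under MCAR the observed mean stays close to the full-data mean. Under MAR and NMAR, larger values are more likely to be missing in this setup, so the observed mean is pulled downward; the practical difference is that under MAR the shift can be corrected by conditioning on x, whereas under NMAR it cannot be corrected from the observed data alone.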