
Simulates correlated binary responses assuming a regression model for the marginal probabilities.
rbin(clsize = clsize, intercepts = intercepts, betas = betas,
     xformula = formula(xdata), xdata = parent.frame(), link = "logit",
     cor.matrix = cor.matrix, rlatent = NULL)
Returns a list that has components:
Ysim: the simulated binary responses. Element ($i$,$t$) represents the realization of $Y_{it}$.
simdata: a data frame that includes the simulated response variables (y), the covariates specified by xformula, subjects' identities (id) and the corresponding measurement occasions (time).
rlatent: the latent random variables denoted by $e^{B}_{it}$ in Touloumis (2016).
The arguments are:
clsize: integer indicating the common cluster size.
intercepts: numerical (or numeric vector of length clsize) containing the intercept(s) of the marginal model.
betas: numerical vector or matrix containing the value of the marginal regression parameter vector associated with the covariates (i.e., excluding intercepts).
xformula: formula expression as in other marginal regression models but without including a response variable.
xdata: optional data frame containing the variables provided in xformula.
link: character string indicating the link function in the marginal model. Options include 'probit', 'logit', 'cloglog', 'cauchit' or 'identity'. Required when rlatent = NULL.
cor.matrix: matrix indicating the correlation matrix of the multivariate normal distribution when the NORTA method is employed (rlatent = NULL).
rlatent: matrix with clsize columns containing realizations of the latent random vectors when the NORTA method is not preferred. See details for more info.
Anestis Touloumis
The formulae are easier to read from either the Vignette or the Reference Manual (both available here).
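The assumed marginal model is
$$\Pr(Y_{it} = 1 \mid x_{it}) = F(\beta_{t0} + \beta_{t}^{\prime} x_{it}),$$
where $F$ is the cumulative distribution function determined by link. For subject $i$, $Y_{it}$ is the $t$-th binary response and $x_{it}$ is the associated covariates vector. Finally, $\beta_{t0}$ and $\beta_{t}$ are the intercept and regression parameter vector at the $t$-th measurement occasion.
The binary response $Y_{it}$ is obtained by extending the approach of Emrich and Piedmonte (1991) as suggested in Touloumis (2016).
When $\beta_{t0} = \beta_{0}$ for all $t$, intercepts should be provided as a single number. Otherwise, intercepts must be provided as a numeric vector such that the $t$-th element corresponds to the intercept at measurement occasion $t$.
betas should be provided as a numeric vector only when $\beta_{t} = \beta$ for all $t$. Otherwise, betas must be provided as a numeric matrix with clsize rows such that the $t$-th row contains the value of $\beta_{t}$. In either case, betas should reflect the order of the terms implied by xformula.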
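The shapes expected for intercepts and betas can be illustrated with a small base R sketch. All numbers below are hypothetical, the covariate names x1 and x2 are invented for the illustration, and the logit link is just one possible choice of $F$; nothing here calls the package itself.

```r
# Hypothetical setup: 3 measurement occasions and 2 covariates (x1, x2).
clsize <- 3
# One intercept per occasion (a single number suffices when all are equal).
intercepts <- c(-0.5, 0, 0.5)
# One row per occasion; columns follow the order of terms in ~ x1 + x2.
betas <- matrix(c(0.2, 0.1,
                  0.3, 0.1,
                  0.4, 0.1),
                nrow = clsize, byrow = TRUE)
# Covariate values of a single subject (rows = occasions).
x_subject <- cbind(x1 = c(1, 1, 1), x2 = c(0.5, 1.0, 1.5))
# Linear predictor beta_t0 + beta_t' x_it for t = 1, ..., clsize.
linear_predictor <- intercepts + rowSums(betas * x_subject)
# Marginal success probabilities under a logit link (F = plogis).
marginal_probabilities <- plogis(linear_predictor)
```

Replacing plogis with pnorm would give the corresponding probabilities under a probit link.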
The appropriate use of xformula is xformula = ~ covariates, where covariates indicate the linear predictor as in other marginal regression models.
The optional argument xdata should be provided in "long" format.
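As an illustration of the "long" layout, the base R sketch below builds a small data frame with one row per subject and measurement occasion. The design (3 subjects, 2 occasions) and the covariate values are hypothetical, and the id and time columns are included only to make the layout visible. It assumes that a subject's rows are stored consecutively, in the same way the example covariate is built with rep(..., each = cluster_size).

```r
# Hypothetical design: 3 subjects, each measured on 2 occasions.
n_subjects <- 3
clsize <- 2
# "Long" format: one row per subject-occasion pair, with all rows of a
# subject stored consecutively.
xdata_long <- data.frame(
  id = rep(seq_len(n_subjects), each = clsize),
  time = rep(seq_len(clsize), times = n_subjects),
  x = c(0.1, 0.4, -1.2, -0.9, 0.7, 1.1)
)
# The number of rows equals (number of subjects) x clsize.
nrow(xdata_long)
```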
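The NORTA method is the default option for simulating the latent random vectors denoted by $e^{B}_{it}$. To import simulated values for the latent random vectors without utilizing the NORTA method, the user can employ the rlatent argument. In this case, element ($i$,$t$) of rlatent represents the realization of $e^{B}_{it}$.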
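The general idea behind NORTA (NORmal To Anything) can be sketched in base R: draw correlated standard normal vectors, transform each margin to the target latent distribution, and threshold the result. This is only a rough illustration under stated assumptions, not the package's implementation. The sample size, correlation matrix and parameter values are hypothetical, and the thresholding rule (a response equals 1 when the latent variable does not exceed the linear predictor) is assumed here for a logit link.

```r
set.seed(1)
n_subjects <- 2000
clsize <- 3
# Hypothetical exchangeable correlation matrix for the underlying normals.
cor_matrix <- matrix(0.5, clsize, clsize)
diag(cor_matrix) <- 1
# Step 1: correlated standard normal vectors (rows = subjects).
z <- matrix(rnorm(n_subjects * clsize), ncol = clsize) %*% chol(cor_matrix)
# Step 2: map each margin to the target latent distribution; for a logit
# link the target is the standard logistic distribution.
latent <- qlogis(pnorm(z))
# Step 3: threshold at the linear predictor (intercept 0, slope 0.2 here).
x <- rnorm(n_subjects)
linear_predictor <- 0 + 0.2 * x
# Row i of latent is compared with the linear predictor of subject i.
y <- (latent <= linear_predictor) * 1
```

A matrix such as latent has one row per subject and clsize columns, which is the layout described for the rlatent argument.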
Cario, M. C. and Nelson, B. L. (1997) Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix. Technical Report, Department of Industrial Engineering and Management Sciences, Northwestern University, Evanston, Illinois.
Emrich, L. J. and Piedmonte, M. R. (1991) A method for generating high-dimensional multivariate binary variates. The American Statistician 45, 302-304.
Li, S. T. and Hammond, J. L. (1975) Generation of pseudorandom numbers with specified univariate distributions and correlation coefficients. IEEE Transactions on Systems, Man and Cybernetics 5, 557-561.
Touloumis, A. (2016) Simulating Correlated Binary and Multinomial Responses under Marginal Model Specification: The SimCorMultRes Package. The R Journal 8, 79-91.
rmult.bcl for simulating correlated nominal responses, rmult.clm, rmult.crm and rmult.acl for simulating correlated ordinal responses.
## See Example 3.5 in the Vignette.
library(SimCorMultRes)
set.seed(123)
sample_size <- 5000
cluster_size <- 4
beta_intercepts <- 0
beta_coefficients <- 0.2
latent_correlation_matrix <- toeplitz(c(1, 0.9, 0.9, 0.9))
# Time-stationary covariate: each subject's value is repeated on every occasion.
x <- rep(rnorm(sample_size), each = cluster_size)
simulated_binary_dataset <- rbin(clsize = cluster_size,
    intercepts = beta_intercepts, betas = beta_coefficients,
    xformula = ~x, cor.matrix = latent_correlation_matrix, link = "probit")
# Fit a GEE model to compare the estimates with the simulation parameters.
library(gee)
binary_gee_model <- gee(y ~ x, family = binomial("probit"), id = id,
    data = simulated_binary_dataset$simdata)
summary(binary_gee_model)$coefficients
## See Example 3.6 in the Vignette.
set.seed(8)
library(evd)
# The difference of two independent Gumbel variables is logistic, which
# matches the default logit link used below.
simulated_latent_variables1 <- rmvevd(sample_size, dep = sqrt(1 - 0.9),
    model = "log", d = cluster_size)
simulated_latent_variables2 <- rmvevd(sample_size, dep = sqrt(1 - 0.9),
    model = "log", d = cluster_size)
simulated_latent_variables <- simulated_latent_variables1 -
    simulated_latent_variables2
# Supply the latent vectors directly instead of using the NORTA method.
simulated_binary_dataset <- rbin(clsize = cluster_size,
    intercepts = beta_intercepts, betas = beta_coefficients,
    xformula = ~x, rlatent = simulated_latent_variables)
binary_gee_model <- gee(y ~ x, family = binomial("logit"), id = id,
    data = simulated_binary_dataset$simdata)
summary(binary_gee_model)$coefficients