RMultBinary: Simulating a multivariate Bernoulli distribution

Description

This function generates a sample from a multinomial distribution of \(K\) dependent binary (Bernoulli) variables \((X_1, X_2, ..., X_K)\) defined by an array (of 2^K cells) detailing the joint-probabilities.

Usage

RMultBinary(n = 1, mult.bin.dist, target.values = NULL)

Arguments

Desired sample size. Default = 1.

mult.bin.dist

A list describing the multivariate binary distribution. It can be generated by the ObtainMultBinaryDist function. The list contains at least the element joint.proba, an array detailing the joint-probabilities of the \(K\) binary variables. The array has \(K\) dimensions of size 2, referring to the 2 possible outcomes of the considered variable. Hence, the total number of elements is \(2^K\). Additionnaly the list can also provides the element var.label, a list containing the names of the \(K\) variables.

target.values

A list describing the possibles outcomes of each binary variable, for instance {1, 2}. Default = {0, 1}.

Value

A list whose elements are detailed herehunder.

binary.sequences

The generated \(K \times n\) random sequence.

possible.binary.sequences

The possible binary sequences, i.e. the domain.

chosen.random.index

The index of the random draws in the domain.

References

Lee, A.J. (1993). Generating Random Binary Deviates Having Fixed Marginal Distributions and Specified Degrees of Association. The American Statistician 47 (3): 209-215.

Qaqish, B. F., Zink, R. C., and Preisser, J. S. (2012). Orthogonalized residuals for estimation of marginally specified association parameters in multivariate binary data. Scandinavian Journal of Statistics 39, 515-527.

Examples

Run this code

# NOT RUN {
# from Qaqish et al. (2012)
or <- matrix(c(Inf, 0.281, 2.214, 2.214,
               0.281, Inf, 2.214, 2.214,
               2.214, 2.214, Inf, 2.185,
               2.214, 2.214, 2.185, Inf), nrow = 4, ncol = 4, byrow = TRUE)
rownames(or) <- colnames(or) <- c("Parent1", "Parent2", "Sibling1", "Sibling2")

# hypothetical marginal probabilities
p <- c(0.2, 0.4, 0.6, 0.8)

# estimating the joint-distribution
p.joint <- ObtainMultBinaryDist(odds = or, marg.probs = p)

# simulating 100,000 draws from the obtained joint-distribution
y.sim <- RMultBinary(n = 1e5, mult.bin.dist = p.joint)$binary.sequences

# checking results
cat('dim y.sim =', dim(y.sim)[1], 'x', dim(y.sim)[2], '\n')
cat('Estimated marginal probs from simulated data\n')
apply(y.sim,2,mean)
cat('True probabilities\n')
print(p)
cat('Estimated correlation from simulated data\n')
cor(y.sim)
cat('True correlation\n')
Odds2Corr(or,p)$corr

# generating binary outcomes with outcome different than 0, 1
RMultBinary(n = 10, mult.bin.dist = p.joint, 
            target.values = list(c("A", "B"), c(0, 1), c(1, 2), c(100, 101)))
# }