MultiDiscreteRNG (version 0.1.0)

simBinaryCorr.Mix: Calculate intermediate binary correlations for mixed data

Description

This function implements Step 2 of the algorithm: it calibrates the intermediate latent-normal correlation matrix used to generate correlated binary variables for a mixture of generalized Poisson (GPD), negative binomial (NB), and binomial (B) margins. For each pair of variables, it iteratively updates the latent correlation so that, after (i) generating correlated binary data via generate.binaryVar and (ii) mapping back to the mixed discrete scales via BinToMix, the empirical correlation of the resulting mixed pair matches the user-specified target correlation in CorrMat. The calibrated pairwise latent correlations are then assembled into a full intermediate matrix, which is adjusted to be positive definite if needed (via Matrix::nearPD).

Usage

simBinaryCorr.Mix(
  GPD.theta.vec = NULL,
  GPD.lambda.vec = NULL,
  NB.r.vec = NULL,
  NB.prob.vec = NULL,
  B.n.vec = NULL,
  B.prob.vec = NULL,
  CorrMat,
  no.rows,
  steps = 0.025
)

Value

A list containing:

Mixprop

List of proportions for each variable's binary components.

intermat

Intermediate correlation matrix for binary variables (adjusted to be positive definite if needed).

Mlocation

List of location parameters for each variable.

pvec

Vector of binary probabilities for each variable.

Arguments

GPD.theta.vec

Numeric vector of theta parameters for GPD variables (or `NULL` if none).

GPD.lambda.vec

Numeric vector of lambda parameters for GPD variables (must match length of `GPD.theta.vec`).

NB.r.vec

Numeric vector of dispersion parameters (`r`) for NB variables (or `NULL` if none).

NB.prob.vec

Numeric vector of success probabilities for NB variables (must match length of `NB.r.vec`).

B.n.vec

Numeric vector of number of trials for Binomial variables (or `NULL` if none).

B.prob.vec

Numeric vector of success probabilities for Binomial variables (must match length of `B.n.vec`).

CorrMat

Target correlation matrix (must be symmetric positive definite, with dimensions equal to the total number of GPD, NB, and Binomial variables).

no.rows

Integer specifying the number of rows (samples) to generate during intermediate binary sampling.

steps

Numeric step size (default = 0.025) for correlation adjustment in later iterations.

Details

The function first calculates binary probabilities and properties for each distribution family (GPD, NB, Binomial) using helper functions `calc.bin.prob.GPD`, `calc.bin.prob.NB`, and `calc.bin.prob.B`. It then iteratively adjusts pairwise correlations in binary space to match the target correlation structure, using a step size for convergence. If the intermediate matrix is not positive definite, it is adjusted using `Matrix::nearPD`.
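
The pairwise calibration idea described above can be sketched in base R for a single pair of binary variables. This is a simplified illustration, not the package's implementation: the marginal probabilities `p`, the target value, the tolerance, and the helper names `binary_corr` and `calibrate_pair` are all assumptions, the mapping back to the mixed scales via `BinToMix` is skipped, and the update moves the latent correlation by the current error rather than by the fixed `steps` increment.

```r
# Illustrative sketch only: calibrate the latent-normal correlation for ONE
# pair of binary variables. Names and settings are hypothetical.
set.seed(1)
n      <- 20000
p      <- c(0.3, 0.5)   # assumed marginal success probabilities
target <- 0.20          # assumed target correlation on the binary scale

# Fixed standard-normal draws, reused at every iteration so that the
# empirical correlation is a deterministic function of rho.
z1 <- rnorm(n)
e  <- rnorm(n)

binary_corr <- function(rho) {
  z2 <- rho * z1 + sqrt(1 - rho^2) * e   # latent pair with correlation rho
  b1 <- as.integer(z1 <= qnorm(p[1]))    # dichotomize: P(b1 = 1) = p[1]
  b2 <- as.integer(z2 <= qnorm(p[2]))    # dichotomize: P(b2 = 1) = p[2]
  cor(b1, b2)
}

calibrate_pair <- function(target, tol = 0.005, max_iter = 50) {
  rho <- target                          # start from the target itself
  for (iter in seq_len(max_iter)) {
    achieved <- binary_corr(rho)
    if (abs(achieved - target) < tol) break
    # Move rho by the current error, keeping it inside (-1, 1)
    rho <- max(min(rho + (target - achieved), 0.99), -0.99)
  }
  list(rho = rho, achieved = binary_corr(rho), iterations = iter)
}

fit <- calibrate_pair(target)
str(fit)
```

Because dichotomizing attenuates correlation, the calibrated latent value `fit$rho` ends up larger in magnitude than the target, which is why an intermediate matrix is needed at all.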

Examples

# One GPD variable and one NB variable
GPD.theta <- 4
GPD.lambda <- 0.03
NB.r <- 15
NB.prob <- 0.61
# Target 2 x 2 correlation matrix: only one off-diagonal value is needed
M <- 0.15
N <- diag(2)
N[lower.tri(N)] <- M
cmat <- N + t(N)
diag(cmat) <- 1
binObj <- simBinaryCorr.Mix(GPD.theta.vec = GPD.theta, GPD.lambda.vec = GPD.lambda,
                            NB.r.vec = NB.r, NB.prob.vec = NB.prob,
                            CorrMat = cmat, no.rows = 20000, steps = 0.025)
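
The positive-definiteness adjustment mentioned in the Details can be tried in isolation. The sketch below uses a hand-made 3 x 3 matrix (an assumption chosen only because it is not positive definite) and the hypothetical helper `is_pd`; it shows what `Matrix::nearPD` does to such a matrix, not how `simBinaryCorr.Mix` calls it internally.

```r
library(Matrix)

# Hand-made "correlation" matrix that is symmetric but not positive definite
bad <- matrix(c( 1.0, 0.9, -0.9,
                 0.9, 1.0,  0.9,
                -0.9, 0.9,  1.0), nrow = 3, byrow = TRUE)

# Hypothetical helper: TRUE when all eigenvalues are strictly positive
is_pd <- function(m) {
  all(eigen(m, symmetric = TRUE, only.values = TRUE)$values > 0)
}

is_pd(bad)   # FALSE: the determinant is negative

# corr = TRUE asks nearPD for a nearby matrix with a unit diagonal
fixed <- as.matrix(nearPD(bad, corr = TRUE)$mat)
round(fixed, 3)
is_pd(fixed)
```

The adjusted matrix is only close to the input, so off-diagonal entries of `fixed` differ from those of `bad`; the same caveat applies to `intermat` whenever the adjustment is triggered.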
