intercorr_cont_nb

the method used to generate the <code>k_cont</code> continuous variables. "Fleishman" uses a third-order polynomial transformation
and "Polynomial" uses Headrick's fifth-order transformation.

method

a matrix with <code>k_cont</code> rows, each a vector of constants c0, c1, c2, c3 (if <code>method</code> = "Fleishman") or
c0, c1, c2, c3, c4, c5 (if <code>method</code> = "Polynomial"), like that returned by <code><a rd-options="SimMultiCorrData" href="/link/find_constants?package=SimCorrMix&version=0.1.1&to=SimMultiCorrData" data-mini-rdoc="SimMultiCorrData::find_constants">find_constants</a></code>

constants

a <code>k_cont x k_nb</code> matrix of target correlations between the continuous and Negative Binomial variables; the NB variables
should be ordered with the regular NB variables first and the zero-inflated NB variables second

rho_cont_nb

a vector of size parameters for the Negative Binomial variables (see <code>stats::dnbinom</code>); the order should be
regular NB variables first, zero-inflated NB variables second

size

a vector of mean parameters for the NB variables (default = NULL; note that either <code>prob</code> or <code>mu</code> should be supplied for all Negative Binomial variables,
not a mixture of the two); order the same as in <code>size</code>; for zero-inflated NB this refers to
the mean of the NB distribution (see <code>VGAM::dzinegbin</code>)

mu

a vector of probabilities of structural zeros (not including zeros from the NB distribution) for the zero-inflated NB variables
(see <code>VGAM::dzinegbin</code>); if <code>p_zinb</code> = 0, \(Y_{nb}\) has a regular NB distribution;
if <code>p_zinb</code> is in <code>(-prob^size/(1 - prob^size), 0)</code>, \(Y_{nb}\) has a zero-deflated NB distribution and <code>p_zinb</code>
is not a probability; if <code>p_zinb = -prob^size/(1 - prob^size)</code>, \(Y_{nb}\) has a positive-NB distribution (see
<code>VGAM::dposnegbin</code>); if <code>length(p_zinb) &lt; length(size)</code>, the missing values are set to 0 (and ordered first)

p_zinb

the number of random numbers to generate in calculating the bound (default = 10000)

nrand

the seed used in random number generation (default = 1234)

seed
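
To make the role of <code>p_zinb</code> concrete, the following base-R sketch draws zero-inflated NB values by mixing structural zeros with ordinary NB draws. The helper name <code>rzinb_sketch</code> and the parameter values are hypothetical and chosen only for illustration; the code is not taken from the package source, and it covers only the case <code>0 &lt;= p_zinb &lt; 1</code>, not the zero-deflated or positive-NB cases.

```r
# Hypothetical helper (not part of SimCorrMix): draw n zero-inflated NB values.
rzinb_sketch <- function(n, size, mu, p_zinb = 0) {
  stopifnot(p_zinb >= 0, p_zinb < 1)
  # Ordinary NB draws, parameterized by size and mean (see stats::rnbinom)
  nb <- stats::rnbinom(n, size = size, mu = mu)
  # Each observation is a structural zero with probability p_zinb
  structural_zero <- stats::runif(n) < p_zinb
  ifelse(structural_zero, 0, nb)
}

set.seed(1234)
y <- rzinb_sketch(100000, size = 3, mu = 5, p_zinb = 0.2)
mean(y)       # close to (1 - p_zinb) * mu = 4
mean(y == 0)  # larger than p_zinb, because the NB part also produces zeros
```

The share of zeros exceeds <code>p_zinb</code> because <code>p_zinb</code> counts only structural zeros, as stated in the argument description.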

This function calculates a <code>k_cont x k_nb</code> intermediate matrix of correlations for the <code>k_cont</code> continuous and
 <code>k_nb</code> Negative Binomial variables. It extends the method of Amatya &amp; Demirtas (2015, 10.1080/00949655.2014.953534) to
 continuous variables generated using Headrick's fifth-order polynomial transformation and to regular or zero-inflated NB variables.
 Let Z1 be the standard normal variable that is transformed, using Headrick's fifth-order or Fleishman's third-order method,
 to produce a continuous variable Y1, and let Z2 be the standard normal variable used to generate a
 Negative Binomial variable via the inverse CDF method. The intermediate correlation between Z1 and Z2 is calculated by dividing the target correlation by a correction factor.
 The correction factor is the product of two terms: the upper Fréchet-Hoeffding bound on the correlation between a Negative Binomial variable and
 the normal variable used to generate it, and the power method correlation between Y1 and Z1 (described in Headrick &amp; Kowalchuk, 2007,
 10.1080/10629360600605065). The function is used in <code><a rd-options="SimCorrMix" href="/link/intercorr?package=SimCorrMix&version=0.1.1&to=SimCorrMix" data-mini-rdoc="SimCorrMix::intercorr">intercorr</a></code> and
 <code><a rd-options="SimCorrMix" href="/link/corrvar?package=SimCorrMix&version=0.1.1&to=SimCorrMix" data-mini-rdoc="SimCorrMix::corrvar">corrvar</a></code>. This function would not ordinarily be called by the user.
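
The calculation described above can be sketched in base R for a single continuous-NB pair. This is a minimal illustration under stated assumptions, not the package's implementation: the function name <code>intercorr_sketch</code> is hypothetical, the upper bound is approximated by correlating sorted simulated samples, the NB variable is regular (no zero inflation), and the power method correlation is taken to be c1 + 3*c3 + 15*c5, which holds when Y1 is a polynomial in a standard normal Z1 and has unit variance.

```r
# Illustrative sketch only; not the SimCorrMix source code.
intercorr_sketch <- function(rho_target, constants, size, mu,
                             nrand = 10000, seed = 1234) {
  set.seed(seed)
  z <- stats::rnorm(nrand)
  y_nb <- stats::rnbinom(nrand, size = size, mu = mu)
  # Upper Frechet-Hoeffding bound, approximated by correlating sorted samples
  upper_bound <- stats::cor(sort(y_nb), sort(z))
  # Pad Fleishman constants (c0..c3) with zeros so c4 = c5 = 0
  cc <- c(constants, 0, 0)[1:6]
  # Power method correlation between Y1 and Z1: c1 + 3*c3 + 15*c5
  pm_corr <- cc[2] + 3 * cc[4] + 15 * cc[6]
  # Intermediate correlation = target correlation / correction factor
  rho_target / (upper_bound * pm_corr)
}

# Constants c0, c1, c2, c3 for a normal Y1 (Y1 = Z1), so pm_corr = 1
normal_constants <- c(0, 1, 0, 0)
rho_z <- intercorr_sketch(0.3, normal_constants, size = 3, mu = 5)
rho_z  # slightly larger in magnitude than the target of 0.3
```

Because both terms of the correction factor are at most 1 in magnitude, the intermediate correlation is at least as large in magnitude as the target correlation.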

correlation

continuous

NegativeBinomial

method1

Generate continuous (normal, non-normal, or mixture distributions), binary, ordinal,
and count (regular or zero-inflated, Poisson or Negative Binomial) variables with a specified
correlation matrix, or one continuous variable with a mixture distribution. This package can
be used to simulate data sets that mimic real-world clinical or genetic data sets (i.e.,
plasmodes, as in Vaughan et al., 2009 <DOI:10.1016/j.csda.2008.02.032>). The methods
extend those found in the 'SimMultiCorrData' R package. Standard normal variables with an
imposed intermediate correlation matrix are transformed to generate the desired distributions.
Continuous variables are simulated using either Fleishman's (1978) third-order
<DOI:10.1007/BF02293811> or Headrick's (2002) fifth-order
<DOI:10.1016/S0167-9473(02)00072-5> polynomial transformation method (the power method
transformation, PMT). Non-mixture distributions require the user to specify mean, variance,
skewness, standardized kurtosis, and standardized fifth and sixth cumulants. Mixture
distributions require these inputs for the component distributions plus the mixing
probabilities. Simulation occurs at the component level for continuous mixture
distributions. The target correlation matrix is specified in terms of correlations with
components of continuous mixture variables. These components are transformed into the
desired mixture variables using random multinomial variables based on the mixing
probabilities. However, the package provides functions to approximate expected correlations
with continuous mixture variables given target correlations with the components. Binary and
ordinal variables are simulated using a modification of ordsample() in package 'GenOrd'.
Count variables are simulated using the inverse CDF method. There are two simulation
pathways which calculate intermediate correlations involving count variables differently.
Correlation Method 1 adapts Yahav and Shmueli's 2012 method <DOI:10.1002/asmb.901> and
performs best with large count variable means and positive correlations or small means and
negative correlations. Correlation Method 2 adapts Barbiero and Ferrari's 2015
modification of the 'GenOrd' package <DOI:10.1002/asmb.2072> and performs best under the
opposite scenarios. The optional error loop may be used to improve the accuracy of the
final correlation matrix. The package also contains functions to calculate the
standardized cumulants of continuous mixture distributions, check parameter inputs,
calculate feasible correlation boundaries, and summarize and plot simulated variables.

Allison Fialkowski

SimCorrMix

Simulation of Correlated Data with Multiple Variable Types
Including Continuous and Count Mixture Distributions

intercorr_cont_nb: Calculate Intermediate MVN Correlation for Continuous - Negative Binomial Variables: Correlation Method 1
