gendata_mmgfm: Generate simulated data

Description

Generate simulated data from MMGFM models

Usage

gendata_mmgfm(
  seed = 1,
  nvec = c(300, 200),
  pveclist = list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60)),
  q = 6,
  d = 3,
  qs = rep(2, length(nvec)),
  rho = rep(1, length(pveclist)),
  rho_z = 1,
  sigmavec = rep(0.5, length(pveclist)),
  n_bin = 1,
  sigma_eps = 1,
  heter_error = FALSE
)
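
Several defaults in the signature are computed from other arguments: qs takes its length from nvec, while rho and sigmavec take theirs from pveclist. The following base-R sketch shows how those defaults evaluate for the default nvec and pveclist. The names n_studies, n_types, and n_modalities are illustrative only and are not objects defined by the package.

```r
# Default inputs copied from the Usage block above.
nvec <- c(300, 200)
pveclist <- list(gaussian = c(50, 150), poisson = c(50), binomial = c(100, 60))

# Illustrative names, not part of the package.
n_studies <- length(nvec)          # number of studies/sources
n_types <- length(pveclist)        # number of modality types
n_modalities <- lengths(pveclist)  # number of modalities within each type

# Defaults exactly as written in the function signature.
qs <- rep(2, length(nvec))
rho <- rep(1, length(pveclist))
sigmavec <- rep(0.5, length(pveclist))

print(n_studies)
print(n_modalities)
print(qs)
print(rho)
print(sigmavec)
```

If nvec or pveclist is changed and qs, rho, or sigmavec is passed explicitly, the explicit vectors need to keep these lengths.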

Value

Returns a list including the following components:

hbeta - an M-length list of the estimated regression coefficient matrices, one for each modality;
hA - an M-length list of the loading matrices corresponding to the study-shared factors, one for each modality;
hB - an S-length list, each element of which is an M-length list of loading matrices corresponding to the study-specified factors for that study;
hF - an S-length list of the posterior estimates of the study-shared factor matrix for each study;
hH - an S-length list of the posterior estimates of the study-specified factor matrix for each study;
hSigma - an S-length list of the estimated posterior variances of the study-shared factor;
hPhi - an S-length list of the estimated posterior variances of the study-specified factor;
hv - an S-length list, each element of which is an M-length list of vectors holding the posterior estimates of the study-specified and modality variable-shared factor for each study and modality;
hzeta - the estimated posterior variance of the study-specified and modality variable-shared factor;
hsigma2 - the estimated variance of the study-specified and modality variable-shared factor;
hinvLambda - an S-length list, each element of which is an M-length list of vectors holding the inverses of the estimated error variances;
S - the approximated posterior covariance for each row of F;
ELBO - the ELBO value when the algorithm stops;
ELBO_seq - the sequence of ELBO values;
time_use - the running time in model fitting of SpaCOAP.

Arguments

seed: a positive integer, the random seed for reproducibility of the data generation process.
nvec: a vector of positive integers, specifies the sample size in each study/source.
pveclist: a named list, specifies the number of modalities of each type and the variable dimension of each modality.
q: a positive integer, specifies the number of study-shared factors.
d: a positive integer, specifies the dimension of the covariate matrix.
qs: a vector of positive integers, specifies the number of study-specified factors.
rho: a numeric vector of length(pveclist) with positive elements, specifies the signal strength of the loading matrices for each modality type.
rho_z: a positive real, specifies the signal strength of the covariates.
sigmavec: a positive real vector of length(pveclist), specifies the variance of the study-specified and modality variable-shared factors; defaults to 0.5 for each element.
n_bin: a positive integer, specifies the number of trials when generating the Binomial modality matrix; defaults to 1.
sigma_eps: a positive real, the variance of the overdispersion error; defaults to 1.
heter_error: a logical value, whether to generate heterogeneous errors; defaults to FALSE.
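
The length and sign constraints in the argument descriptions above can be checked before calling the generator. The sketch below is a hypothetical helper, check_gendata_args, which is not part of the package. It assumes that qs must match length(nvec), which is inferred from the default rep(2, length(nvec)) rather than stated explicitly.

```r
# Hypothetical helper (not part of the package): returns a character vector
# of problems found; an empty vector means all checks passed.
check_gendata_args <- function(nvec, pveclist, q, d, qs, rho, sigmavec,
                               n_bin = 1) {
  # TRUE when x is a non-empty numeric vector of positive whole numbers.
  is_pos_int <- function(x) {
    is.numeric(x) && length(x) > 0 && all(x > 0) && all(x == round(x))
  }
  problems <- character(0)
  if (!is_pos_int(nvec)) {
    problems <- c(problems, "nvec must contain positive integers")
  }
  if (!is.list(pveclist) || is.null(names(pveclist))) {
    problems <- c(problems, "pveclist must be a named list")
  }
  if (!is_pos_int(q) || length(q) != 1) {
    problems <- c(problems, "q must be a single positive integer")
  }
  if (!is_pos_int(d) || length(d) != 1) {
    problems <- c(problems, "d must be a single positive integer")
  }
  # Assumption: one entry of qs per study, as in the default.
  if (!is_pos_int(qs) || length(qs) != length(nvec)) {
    problems <- c(problems, "qs must hold one positive integer per study")
  }
  if (!is.numeric(rho) || length(rho) != length(pveclist) || any(rho <= 0)) {
    problems <- c(problems, "rho must be positive with length(pveclist) elements")
  }
  if (!is.numeric(sigmavec) || length(sigmavec) != length(pveclist) ||
      any(sigmavec <= 0)) {
    problems <- c(problems, "sigmavec must be positive with length(pveclist) elements")
  }
  if (!is_pos_int(n_bin) || length(n_bin) != 1) {
    problems <- c(problems, "n_bin must be a single positive integer")
  }
  problems
}

# Usage: a consistent set of arguments, then one with rho of the wrong length.
pv <- list(gaussian = 150, poisson = c(50, 50), binomial = c(60, 60))
ok <- check_gendata_args(nvec = c(100, 120, 100), pveclist = pv, q = 3, d = 3,
                         qs = rep(2, 3), rho = rep(3, 3), sigmavec = rep(0.5, 3))
bad <- check_gendata_args(nvec = c(100, 120, 100), pveclist = pv, q = 3, d = 3,
                          qs = rep(2, 3), rho = c(1, 1), sigmavec = rep(0.5, 3))
print(length(ok))
print(bad)
```

Collecting all problems in a vector, rather than stopping at the first one, makes it easier to fix several mismatched lengths in one pass.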

Examples

# Three studies, each with two study-specified factors
q <- 3
qsvec <- rep(2, 3)
nvec <- c(100, 120, 100)
# One Gaussian, two Poisson, and two Binomial modalities
pveclist <- list(gaussian = rep(150, 1), poisson = rep(50, 2), binomial = rep(60, 2))
datlist <- gendata_mmgfm(seed = 1, nvec = nvec, pveclist = pveclist,
                         q = q, d = 3, qs = qsvec, rho = rep(3, length(pveclist)), rho_z = 0.5,
                         sigmavec = rep(0.5, length(pveclist)), sigma_eps = 1)
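
To see which components the returned list actually holds, its top level can be inspected with str(). This is a minimal sketch that assumes gendata_mmgfm is exported by an installed package named MMGFM; the call is skipped when that package is not available.

```r
datlist <- NULL
# Assumption: the function lives in a package called MMGFM.
if (requireNamespace("MMGFM", quietly = TRUE)) {
  pveclist <- list(gaussian = rep(150, 1), poisson = rep(50, 2),
                   binomial = rep(60, 2))
  datlist <- MMGFM::gendata_mmgfm(
    seed = 1, nvec = c(100, 120, 100), pveclist = pveclist,
    q = 3, d = 3, qs = rep(2, 3), rho = rep(3, length(pveclist)),
    rho_z = 0.5, sigmavec = rep(0.5, length(pveclist)), sigma_eps = 1
  )
  # Show component names and types without expanding nested lists.
  str(datlist, max.level = 1)
} else {
  message("Package 'MMGFM' is not installed; skipping the example.")
}
```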
