Simulates two high-dimensional omics datasets with customizable latent factor structures. Users can control the number and type of factors (shared, unique, or mixed), the signal-to-noise ratio, and the distribution of signal-carrying samples and features, making the function suitable for benchmarking multi-omics integration methods under controlled scenarios.
simulate_twoOmicsData(
vector_features = c(2000, 2000),
n_samples = 50,
n_factors = 3,
signal.samples = NULL,
signal.features.one = NULL,
signal.features.two = NULL,
num.factor = "multiple",
snr = 1,
advanced_dist = NULL,
...
)

vector_features: A numeric vector of length two, specifying the number of features in the first and second omics datasets, respectively.

n_samples: Integer. The number of samples shared between both omics datasets.

n_factors: Integer. The number of latent factors to simulate.

signal.samples: Optional numeric vector of length two: the first element is the mean and the second is the variance of the number of signal-carrying samples per factor. If NULL, signal assignment is inferred from snr.

signal.features.one: Optional numeric vector of length two: the first element is the mean and the second is the variance of the number of signal-carrying features per factor in the first omic.

signal.features.two: Optional numeric vector of length two: the first element is the mean and the second is the variance of the number of signal-carrying features per factor in the second omic.

num.factor: Character string, either 'single' or 'multiple'. Determines whether to simulate a single latent factor or multiple factors.

snr: Numeric. Signal-to-noise ratio used to set the background noise level; the function uses this value to determine the proportion of signal versus noise in the simulated datasets.

advanced_dist: Character string. Specifies how latent factors are distributed when num.factor = 'multiple'. Options include '', NULL, 'mixed', 'omic.one', 'omic.two', or 'exclusive'.

...: Additional arguments (not currently used).
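As an illustrative sketch, assuming the package exporting simulate_twoOmicsData is installed and loaded, a call simulating three latent factors spread across both omics layers might look like the following; the specific argument values here are hypothetical, and the structure of the returned object depends on the package:

```r
# Illustrative call with hypothetical settings.
# Simulate two omics blocks of 2000 features each, measured on
# 50 shared samples, with 3 latent factors distributed across
# both omics ('mixed') and a signal-to-noise ratio of 2.
sim <- simulate_twoOmicsData(
  vector_features = c(2000, 2000),
  n_samples       = 50,
  n_factors       = 3,
  num.factor      = "multiple",
  advanced_dist   = "mixed",
  snr             = 2
)
```

Setting signal.samples (and the per-omic signal.features.* arguments) to NULL, as in this sketch, leaves the assignment of signal-carrying samples and features to be inferred from snr, per the argument descriptions above.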