Generate data for simulations. All models used in Tian, Y., Weng, H., & Feng, Y. (2022) are implemented.
data_generation(
K = 10,
outlier_K = 1,
simulation_no = c("MTL-1", "MTL-2"),
h_w = 0.1,
h_mu = 1,
n = 50
)
A list of two sub-lists, "data" and "parameter". List "data" contains a list of design matrices x, a list of hidden labels y, and a vector of outlier task indices outlier_index. List "parameter" contains a vector w of mixture proportions, a matrix mu1 whose columns are the GMM means of the first cluster in each task, a matrix mu2 whose columns are the GMM means of the second cluster in each task, a matrix beta whose columns are the discriminant coefficients in each task, and a list Sigma of covariance matrices, one for each task.
K: the number of tasks (data sets). Default: 10.
outlier_K: the number of outlier tasks. Default: 1.
simulation_no: the simulation number in Tian, Y., Weng, H., & Feng, Y. (2022). Can be "MTL-1" or "MTL-2". Default: "MTL-1".
h_w: the value of h_w. Default: 0.1.
h_mu: the value of h_mu. Default: 1.
n: the sample size of each task. Can be either a positive integer or a vector of length K. If it is an integer, all tasks have the same sample size n. If it is a vector, its k-th entry is the sample size of the k-th task. Default: 50.
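The vector form of n can be sketched as follows. This is a minimal illustration, not taken from the package documentation: it assumes that data_generation() is exported by the mtlgmm package and that each design matrix in x stores one observation per row. The seed and the sample sizes are arbitrary choices.

```r
library(mtlgmm)  # assumption: data_generation() comes from the mtlgmm package

set.seed(1)                      # arbitrary seed, only for reproducibility
K <- 5
n_vec <- c(30, 40, 50, 60, 70)   # hypothetical sizes: one entry per task, length K

sim <- data_generation(K = K, outlier_K = 1, simulation_no = "MTL-1",
                       h_w = 0.1, h_mu = 1, n = n_vec)

# Assuming rows are observations, this reports the sample size of each task
sapply(sim$data$x, nrow)
```

Passing a single integer instead of n_vec would give every task the same sample size.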
Tian, Y., Weng, H., & Feng, Y. (2022). Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models. arXiv preprint arXiv:2209.15224.
mtlgmm, tlgmm, predict_gmm, initialize, alignment, alignment_swap, estimation_error, misclustering_error.
data_list <- data_generation(K = 5, outlier_K = 1, simulation_no = "MTL-1", h_w = 0.1,
h_mu = 1, n = 50)
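The returned object can then be inspected along the lines below. This sketch relies only on the component names listed in the return-value description and assumes that the function is loaded from the mtlgmm package. No output is shown because the simulated values are random.

```r
library(mtlgmm)  # assumption: data_generation() comes from the mtlgmm package

data_list <- data_generation(K = 5, outlier_K = 1, simulation_no = "MTL-1",
                             h_w = 0.1, h_mu = 1, n = 50)

names(data_list)                      # documented sub-lists: "data" and "parameter"
str(data_list$data, max.level = 1)    # x (design matrices), y (hidden labels), outlier_index
data_list$data$outlier_index          # indices of the outlier tasks

dim(data_list$parameter$mu1)          # documented as one column per task
length(data_list$parameter$Sigma)     # documented as one covariance matrix per task
```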