- x
design matrix of the target data set. Should be a matrix or data.frame object.
- fitted_bar
the output from the mtlgmm function.
- step_size
the step size choice in the proximal gradient method used to solve each optimization problem in the revised EM algorithm (Algorithm 1 in Tian, Y., Weng, H., & Feng, Y. (2022)). Can be either "lipschitz" or "fixed". Default = "lipschitz".
lipschitz: eta_w, eta_mu, and eta_beta will be chosen via the Lipschitz property of the gradient of the objective function (without the penalty part). See Section 4.2 of Parikh, N., & Boyd, S. (2014).
fixed: eta_w, eta_mu, and eta_beta need to be specified manually.
- eta_w
step size in the proximal gradient method to learn w (Step 3 of Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 0.1. Only used when step_size = "fixed".
- eta_mu
step size in the proximal gradient method to learn mu (Steps 4 and 5 of Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 0.1. Only used when step_size = "fixed".
- eta_beta
step size in the proximal gradient method to learn beta (Step 7 of Algorithm 4 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 0.1. Only used when step_size = "fixed".
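To make the role of a fixed step size concrete, here is a hedged sketch of a generic proximal gradient update on a toy L1-penalized quadratic objective. This is the general method referenced above, not the package's internal implementation; the names soft_threshold, grad_f, and b are illustrative only.

```r
# Generic proximal gradient iteration with a fixed step size eta
# (illustrative sketch only; not the package's internal code).
# Minimize f(theta) + lambda * ||theta||_1 via
#   theta <- soft_threshold(theta - eta * grad_f(theta), eta * lambda)
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Toy quadratic objective: f(theta) = 0.5 * ||theta - b||^2
b <- c(3, -0.05, 1)
grad_f <- function(theta) theta - b

eta <- 0.1      # fixed step size, analogous to eta_w / eta_mu / eta_beta
lambda <- 0.1
theta <- rep(0, 3)
for (i in 1:200) {
  theta <- soft_threshold(theta - eta * grad_f(theta), eta * lambda)
}
theta  # the small coefficient is shrunk to exactly zero
```

With step_size = "lipschitz", eta would instead be derived from the Lipschitz constant of grad_f rather than fixed in advance.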
- lambda_choice
the choice of constants in the penalty parameter used in the optimization problems. See Algorithm 4 of Tian, Y., Weng, H., & Feng, Y. (2022). Can be either "fixed" or "cv". Default = "cv".
cv: cv_nfolds, cv_lower, cv_upper, and cv_length need to be specified. The C1 and C2 parameters will then be chosen among all combinations of values in exp(seq(log(cv_lower/10), log(cv_upper/10), length.out = cv_length)) via cross-validation. Note that this is a two-dimensional CV process, because we set C1_w = C2_w and C1_mu = C1_beta = C2_mu = C2_beta to reduce the computational cost.
fixed: C1_w, C1_mu, C1_beta, C2_w, C2_mu, and C2_beta need to be specified. See equations (19)-(24) in Tian, Y., Weng, H., & Feng, Y. (2022).
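The candidate grid used by lambda_choice = "cv" can be reproduced directly in R from the expression above. The values below are the defaults listed for cv_lower, cv_upper, and cv_length; the names candidates and grid are illustrative.

```r
# Candidate penalty constants, as described for lambda_choice = "cv"
cv_lower <- 0.01   # default lower bound
cv_upper <- 5      # default upper bound
cv_length <- 5     # default number of candidate values

candidates <- exp(seq(log(cv_lower/10), log(cv_upper/10), length.out = cv_length))

# Two-dimensional grid: C1_w = C2_w varies over one axis,
# C1_mu = C1_beta = C2_mu = C2_beta over the other
grid <- expand.grid(C_w = candidates, C_mu_beta = candidates)
nrow(grid)  # cv_length^2 = 25 combinations to evaluate by cross-validation
```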
- cv_nfolds
the number of cross-validation folds. Default: 5
- cv_upper
the upper bound of lambda values used in cross-validation. Default: 5
- cv_lower
the lower bound of lambda values used in cross-validation. Default: 0.01
- cv_length
the number of lambda values considered in cross-validation. Default: 5
- C1_w
the initial value of C1_w. See equation (19) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.05
- C1_mu
the initial value of C1_mu. See equation (20) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2
- C1_beta
the initial value of C1_beta. See equation (21) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2
- C2_w
the initial value of C2_w. See equation (22) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.05
- C2_mu
the initial value of C2_mu. See equation (23) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2
- C2_beta
the initial value of C2_beta. See equation (24) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 0.2
- kappa0
the decaying rate used in equations (19)-(24) in Tian, Y., Weng, H., & Feng, Y. (2022). Default: 1/3
- tol
the convergence tolerance used in all optimization problems. If the difference between the last update and the current update is less than this value, the optimization iterations will stop. Default: 1e-05
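As a hedged illustration of how such a tolerance is typically used (not the package's internal code), a stopping rule of this kind iterates an update until successive estimates differ by less than tol:

```r
# Illustrative stopping rule on a toy fixed-point update
# (converges to sqrt(2)); not the package's internal code
tol <- 1e-05
est_old <- 1
for (iter in 1:1000) {
  est_new <- (est_old + 2 / est_old) / 2   # toy update
  if (abs(est_new - est_old) < tol) break
  est_old <- est_new
}
est_new  # approximately sqrt(2)
```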
- initial_method
initialization method. This indicates the method used to initialize the estimates of the GMM parameters for each data set. Can be either "kmeans" or "EM".
kmeans: the initial estimates of the GMM parameters will be generated by the single-task k-means algorithm, via the kmeans function in the stats package.
EM: the initial estimates of the GMM parameters will be generated by the single-task EM algorithm, via the Mclust function in the mclust package.
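A minimal sketch of a k-means-style initialization for a two-component GMM, assuming cluster assignments are turned into mixing proportions, means, and a pooled covariance. The package's internal initialization may differ in detail; the names w_init, mu_init, and Sigma_init are illustrative.

```r
# Hedged sketch: initializing 2-component GMM parameters from k-means
# (the package's actual internal initialization may differ)
set.seed(1)
x <- rbind(matrix(rnorm(100, mean = 0), ncol = 2),
           matrix(rnorm(100, mean = 3), ncol = 2))

km <- stats::kmeans(x, centers = 2)

w_init     <- mean(km$cluster == 1)              # mixing proportion of component 1
mu_init    <- km$centers                         # 2 x p matrix of component means
Sigma_init <- cov(x - km$centers[km$cluster, ])  # pooled covariance estimate
```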
- iter_max
the maximum number of iterations of the revised EM algorithm (i.e., the parameter T in Algorithm 1 in Tian, Y., Weng, H., & Feng, Y. (2022)). Default: 1000
- iter_max_prox
the maximum number of iterations of the proximal gradient method. Default: 100
- ncores
the number of cores to use. Parallel computing is strongly suggested, especially when lambda_choice = "cv". Default: 1