Use DDT-LCM to estimate latent classes and a tree on the class profiles for multivariate binary outcomes.
ddtlcm_fit(
K,
data,
item_membership_list,
total_iters = 5000,
initials = list(),
priors = list(),
controls = list(),
initialize_args = list(method_lcm = "random", method_dist = "euclidean",
method_hclust = "ward.D", method_add_root = "min_cor", alpha = 0, theta = 0)
)
Value: an object of class "ddt_lcm"; a list containing the following elements:
tree_samples: a list of information about the tree collected from the sampling algorithm, including:
accept: a binary vector where 1 indicates acceptance of the proposal tree and 0 indicates rejection.
tree_list: a list of posterior samples of the tree.
dist_mat_list: a list of tree-structured covariance matrices representing the marginal covariances
among the leaf parameters, integrating out the internal node parameters and all intermediate stochastic paths
in the DDT branching process.
response_probs_samples: a total_iters x K x J array of posterior samples of item response probabilities
class_probs_samples: a K x total_iters matrix of posterior samples of class probabilities
Z_samples: an N x total_iters integer matrix of posterior samples of individual class assignments
Sigma_by_group_samples: a G x total_iters matrix of posterior samples of diffusion variances
c_samples: a total_iters vector of posterior samples of the divergence function hyperparameter
loglikelihood: a total_iters vector of log-likelihoods of the full model
loglikelihood_lcm: a total_iters vector of log-likelihoods of the LCM model only
setting: a list of model setup information, including: K, item_membership_list, and G
controls: a list of model controls, including:
fix_tree: FALSE to perform MH sampling of the tree, TRUE to fix the tree at the initial input.
c_order: a numeric value of 1 or 2 (see the controls argument).
data: the input data matrix
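The shapes listed above determine how the samples are indexed when summarizing a fit. The following is a minimal base R sketch that works on mock arrays with the documented dimensions, not on real `ddtlcm_fit` output. The object `mock`, the sizes, and the `burn_in` length are all hypothetical, and averaging over iterations assumes that class labels are comparable across draws.

```r
# Mock posterior draws with the documented shapes (NOT real ddtlcm output).
set.seed(1)
total_iters <- 200; K <- 3; J <- 5; N <- 10  # hypothetical sizes
burn_in <- 100                                # hypothetical burn-in length

mock <- list(
  # total_iters x K x J array
  response_probs_samples = array(runif(total_iters * K * J),
                                 dim = c(total_iters, K, J)),
  # K x total_iters matrix; each column is normalized to sum to 1
  class_probs_samples = apply(
    matrix(rgamma(K * total_iters, shape = 1), nrow = K, ncol = total_iters),
    2, function(x) x / sum(x)
  ),
  # N x total_iters integer matrix with entries in 1, ..., K
  Z_samples = matrix(sample.int(K, N * total_iters, replace = TRUE),
                     nrow = N, ncol = total_iters),
  # binary accept/reject indicators for the tree proposals
  tree_samples = list(accept = rbinom(total_iters, size = 1, prob = 0.3))
)

keep <- (burn_in + 1):total_iters  # iterations retained after burn-in

# K x J posterior mean of item response probabilities (average over dim 1)
response_prob_mean <- apply(
  mock$response_probs_samples[keep, , , drop = FALSE], c(2, 3), mean
)

# K-vector posterior mean of class probabilities (average over columns)
class_prob_mean <- rowMeans(mock$class_probs_samples[, keep, drop = FALSE])

# Most frequently sampled class for each individual
modal_class <- apply(
  mock$Z_samples[, keep, drop = FALSE], 1,
  function(z) which.max(tabulate(z, nbins = K))
)

# Proportion of accepted tree proposals after burn-in
accept_rate <- mean(mock$tree_samples$accept[keep])
```

Note that the iteration index is the first dimension of `response_probs_samples` but the second dimension of `class_probs_samples` and `Z_samples`, which is why the subsetting differs between them.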
Arguments:
K: number of classes (integer)
data: an N x J matrix of multivariate binary responses, where N is the number of individuals and J is the number of granular items
item_membership_list: a list of G elements, where the g-th element contains the column indices of data corresponding to items in major group g, and G is the number of major item groups
total_iters: number of posterior samples to collect (integer)
initials: a named list of initial values of the following parameters:
tree_phylo4d: a phylo4d object. The initial tree must have K leaves (labeled as "v1" through "vK"), 1 singleton root node (labeled as "u1"), and K-1 internal nodes (labeled as "u1" through \(u_{K-1}\)). The tree also contains parameters for the leaf nodes and the root node (which equals 0). The parameters for the internal nodes can be NAs because they will not be used in the algorithm.
response_prob: a K by J matrix with entries between 0 and 1. The initial values for the item response probabilities. They should equal the expit-transformed leaf parameters of tree_phylo4d.
class_probability: a K-vector with entries between 0 and 1. The initial values for the class probabilities. Entries should be nonzero and sum to 1; otherwise they will be normalized.
class_assignments: an N-vector with integer entries from 1, ..., K. The initial values for individual class assignments.
Sigma_by_group: a G-vector with entries greater than 0. The initial values for the group-specific diffusion variances.
c: a value greater than 0. The initial value for the divergence function hyperparameter.
Parameters not supplied with initial values will be initialized using the initialize function
with arguments in initialize_args.
priors: a named list of values of hyperparameters of priors. See the function initialize for explanation.
shape_sigma: a G-vector of positive values. The g-th element is the shape parameter for the inverse-Gamma prior on diffusion variance parameter sigma_g^2. Default is rep(2, G).
rate_sigma: a G-vector of positive values. The g-th element is the rate parameter for the inverse-Gamma prior on sigma_g^2. Default is rep(2, G).
prior_dirichlet: a K-vector with positive entries. The parameter of the Dirichlet prior on class probability.
shape_c: a positive value. The shape parameter for the Gamma prior on divergence function hyperparameter c. Default is 1.
rate_c: a positive value. The rate parameter for the Gamma prior on c. Default is 1.
a_pg: a positive value. The scale parameter for the generalized logistic distribution used in the augmented Gibbs sampler for leaf parameters. Default is 1, corresponding to the standard logistic distribution.
controls: a named list of control variables.
fix_tree: a logical. If FALSE (default), the tree structure will be sampled in the algorithm. If TRUE, the tree structure will be fixed at the initial input.
c_order: a numeric value. If 1, the divergence function is \(a(t) = c/(1-t)\). If 2, the divergence function is \(a(t) = c/(1-t)^2\).
initialize_args: a named list of initialization arguments. See the function initialize for explanation.
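The argument descriptions above can be made concrete with a small base R sketch that assembles inputs of the documented shapes. Everything here is illustrative: the sizes, the random response matrix (which has no latent class structure), the `prior_dirichlet` value, and the helper `divergence()` are assumptions, and the check that each column belongs to exactly one group is a guess at a sensible requirement rather than a documented one. The final call is left commented out because it needs the ddtlcm package.

```r
set.seed(42)
N <- 20; J <- 6; K <- 3  # hypothetical sizes

# N x J matrix of binary responses (random, for shape illustration only)
data_mat <- matrix(rbinom(N * J, size = 1, prob = 0.5), nrow = N, ncol = J)

# G = 2 major item groups: columns 1-3 and columns 4-6
item_membership_list <- list(1:3, 4:6)
G <- length(item_membership_list)

# Assumed requirement: every column index appears in exactly one group
all_items <- unlist(item_membership_list)
stopifnot(setequal(all_items, seq_len(J)), anyDuplicated(all_items) == 0)

# Hyperparameters, using the documented defaults where one is stated
priors <- list(
  shape_sigma = rep(2, G),
  rate_sigma = rep(2, G),
  prior_dirichlet = rep(5, K),  # hypothetical value; no default is documented
  shape_c = 1,
  rate_c = 1,
  a_pg = 1
)

# Sample the tree (fix_tree = FALSE) with divergence a(t) = c / (1 - t)
controls <- list(fix_tree = FALSE, c_order = 1)

# Partial initial values; the rest are filled in via initialize_args
initials <- list(
  class_probability = rep(1 / K, K),
  Sigma_by_group = rep(1, G)
)

# Hypothetical helper: the divergence function selected by c_order
divergence <- function(t, c, c_order = 1) c / (1 - t)^c_order

# fit <- ddtlcm::ddtlcm_fit(
#   K = K, data = data_mat, item_membership_list = item_membership_list,
#   total_iters = 100, initials = initials, priors = priors,
#   controls = controls
# )
```

For example, at `t = 0.5` with `c = 2`, `divergence()` gives 4 for `c_order = 1` and 8 for `c_order = 2`, so the second form makes divergence more likely as `t` approaches 1.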
library(ddtlcm)
# load the MAP tree structure obtained from the real HCHS/SOL data
data(data_synthetic)
# extract elements into the global environment
list2env(data_synthetic, envir = globalenv())
# run DDT-LCM
result <- ddtlcm_fit(K = 3, data = response_matrix,
item_membership_list = item_membership_list, total_iters = 50)