Doubly-enhanced EM algorithm for tensor clustering
Usage
DEEM(X, nclass, niter = 100, lambda = NULL, dfmax = n, pmax = nvars, pf = rep(1, nvars),
  eps = 1e-04, maxit = 1e+05, sml = 1e-06, verbose = FALSE, ceps = 0.1,
  initial = TRUE, vec_x = NULL)
Arguments
X: Input tensor (or matrix) list of length \(n\), where \(n\) is the number of observations. Each element of the list is a tensor or matrix. The order of the tensors can be any integer no less than 2.
nclass: Number of clusters.
niter: Maximum number of EM iterations. Default value is 100.
lambda: A user-specified lambda value. lambda is the weight of the L1 penalty; a smaller lambda allows more variables to be nonzero.
dfmax: The maximum number of selected variables in the model. Default is the number of observations n.
pmax: The maximum number of potentially selected variables during iteration. In an intermediate step, the algorithm can select at most pmax variables and then shrink some of them so that the number of finally selected variables is less than dfmax.
pf: Weights for the lasso penalty. Default is a vector of ones of length \(p\), representing an equal L1 penalty weight for each of the \(p\) variables. Can be modified to use an adaptive lasso penalty.
eps: Convergence threshold for coordinate descent, based on the difference between iterations. Default value is 1e-04.
maxit: Maximum number of coordinate descent iterations over all lambda values. Default value is 1e+05.
sml: Threshold on the ratio of the change in the loss function after each iteration to the old loss function value. Default value is 1e-06.
verbose: Whether to print out lambda during iteration. Default value is FALSE.
ceps: Convergence threshold for the cluster mean difference between iterations. Default value is 0.1.
initial: Whether to initialize the algorithm with K-means clustering. Default value is TRUE.
vec_x: Vectorized tensor data. Default value is NULL. A sketch of one way to construct it is shown below.
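A minimal sketch of one way to construct vec_x, assuming, as in the example at the bottom of this page, that each column holds one vectorized observation; the call below is illustrative and not part of the package interface.
# Illustrative sketch: stack the vectorized observations as columns,
# so that column i equals c(X[[i]]) (assumed layout, matching the example below).
vec_x <- sapply(X, as.vector)   # an nvars by n matrix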
Value
A vector of estimated prior probabilities for clusters.
A list of estimated cluster means.
A list of estimated covariance matrices.
An n by nclass matrix of estimated membership weights.
A vector of estimated labels.
Number of iterations until convergence.
Average number of zero elements in beta over iterations.
A matrix of vectorized B_k.
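The estimated labels typically agree with the largest membership weight in each row of the weight matrix. A minimal sketch, where W is a placeholder for the returned n by nclass membership-weight matrix (not an actual component name):
# Illustrative sketch: hard cluster assignments from membership weights.
# W is a placeholder for the n by nclass weight matrix returned by DEEM.
hard_labels <- apply(W, 1, which.max)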
Details
The DEEM function implements the Doubly-Enhanced EM algorithm (DEEM) for tensor clustering. The observations \(\mathbf{X}_i\) are assumed to follow the tensor normal mixture model (TNMM) with common covariances across different clusters:
$$
\mathbf{X}_i\sim\sum_{k=1}^K\pi_k \mathrm{TN}(\bm{\mu}_k;\bm{\Sigma}_1,\ldots,\bm{\Sigma}_M),\quad i=1,\dots,n,
$$
where \(0<\pi_k<1\) is the prior probability for \(\mathbf{X}\) to be in the \(k\)-th cluster such that \(\sum_{k=1}^{K}\pi_k=1\), \(\bm{\mu}_k\) is the cluster mean of the \(k\)-th cluster, and \(\bm{\Sigma}_1,\ldots,\bm{\Sigma}_M\) are the common covariances across different clusters. Under the TNMM framework, the optimal clustering rule can be shown to be
$$
\widehat{Y}^{opt}=\arg\max_k\{\log\pi_k+\langle\mathbf{X}-(\bm{\mu}_1+\bm{\mu}_k)/2,\mathbf{B}_k\rangle\},
$$
where \(\mathbf{B}_k=[\![\bm{\mu}_k-\bm{\mu}_1;\bm{\Sigma}_1^{-1},\ldots,\bm{\Sigma}_M^{-1}]\!]\). In the enhanced E-step, DEEM imposes sparsity directly on the optimal clustering rule, as a flexible alternative to popular low-rank assumptions on the tensor coefficients \(\mathbf{B}_k\), by solving
$$
\min_{\mathbf{B}_2,\dots,\mathbf{B}_K}\bigg[\sum_{k=2}^K(\langle\mathbf{B}_k,[\![\mathbf{B}_k;\widehat{\bm{\Sigma}}_1^{(t)},\ldots,\widehat{\bm{\Sigma}}_M^{(t)}]\!]\rangle-2\langle\mathbf{B}_k,\widehat{\bm{\mu}}_k^{(t)}-\widehat{\bm{\mu}}_1^{(t)}\rangle) +\lambda^{(t+1)}\sum_{\mathcal{J}}\sqrt{\sum_{k=2}^Kb_{k,\mathcal{J}}^2}\bigg],
$$
where \(\lambda^{(t+1)}\) is a tuning parameter. In the enhanced M-step, DEEM
employs a new estimator for the tensor correlation structure, which facilitates both the computation and the theoretical studies.
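For intuition only, the following self-contained sketch evaluates the population quantities appearing in the optimal rule for a third-order tensor: the coefficient tensor \(\mathbf{B}_2=[\![\bm{\mu}_2-\bm{\mu}_1;\bm{\Sigma}_1^{-1},\bm{\Sigma}_2^{-1},\bm{\Sigma}_3^{-1}]\!]\) via mode-wise products with the inverse covariances, and the corresponding discriminant score. The helper ttm and all objects below are illustrative assumptions; this is not the sparse estimator computed inside DEEM.
# Illustrative sketch (not the DEEM estimator): population B_2 and the
# discriminant score for a 3-way tensor with identity mode covariances.
dims <- c(5, 5, 5)
mu1  <- array(0, dim = dims)
mu2  <- array(0, dim = dims); mu2[1:3, 1, 1] <- 2
Sig  <- list(diag(5), diag(5), diag(5))        # Sigma_1, Sigma_2, Sigma_3

# Mode-m product of an array A with a matrix S, via mode-m unfolding
ttm <- function(A, S, m) {
  d <- dim(A)
  perm <- c(m, seq_along(d)[-m])               # bring mode m to the front
  Am   <- matrix(aperm(A, perm), nrow = d[m])  # mode-m unfolding
  out  <- array(S %*% Am, dim = c(nrow(S), d[-m]))
  aperm(out, order(perm))                      # undo the permutation
}

# B_2 = [[mu_2 - mu_1; Sigma_1^{-1}, Sigma_2^{-1}, Sigma_3^{-1}]]
B2pop <- mu2 - mu1
for (m in 1:3) B2pop <- ttm(B2pop, solve(Sig[[m]]), m)

# Discriminant score of cluster 2 for one observation, with equal priors
Xobs  <- mu2 + array(rnorm(prod(dims)), dim = dims)
score <- log(1/2) + sum((Xobs - (mu1 + mu2)/2) * B2pop)
With identity covariances, \(\mathbf{B}_2\) reduces to the mean difference itself; DEEM instead estimates a sparse version of these coefficients within the EM iterations, as described above.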
References
Mai, Q., Zhang, X., Pan, Y. and Deng, K. (2021). A Doubly-Enhanced EM Algorithm for Model-Based Tensor Clustering. Journal of the American Statistical Association.
Examples
# NOT RUN {
# Simulate a balanced two-cluster data set of 5 x 5 x 5 tensors
dimen = c(5, 5, 5)
nvars = prod(dimen)
K = 2
n = 100

# Mode-wise covariance matrices of the tensor normal model (identity here)
sigma = array(list(), 3)
sigma[[1]] = sigma[[2]] = sigma[[3]] = diag(5)

# Mean difference between the two clusters: only three nonzero entries
B2 = array(0, dim = dimen)
B2[1:3, 1, 1] = 2

# True labels and cluster means
y = c(rep(1, 50), rep(2, 50))
M = array(list(), K)
M[[1]] = array(0, dim = dimen)
M[[2]] = B2

# Generate standard normal noise and shift each observation by its cluster mean
vec_x = matrix(rnorm(n * prod(dimen)), ncol = n)
X = array(list(), n)
for (i in 1:n) {
  X[[i]] = array(vec_x[, i], dim = dimen)
  X[[i]] = M[[y[i]]] + X[[i]]
}

# Fit DEEM with two clusters and a fixed tuning parameter
myfit = DEEM(X, nclass = 2, lambda = 0.05)
# }