clus.torus
Returns clustering results for data on the torus, based on
inductive conformal prediction sets.
clus.torus(
  data,
  split.id = NULL,
  model = c("kmeans", "mixture"),
  mixturefitmethod = c("axis-aligned", "circular", "general"),
  kmeansfitmethod = c("general", "homogeneous-circular", "heterogeneous-circular",
    "ellipsoids"),
  J = NULL,
  level = NULL,
  option = NULL,
  verbose = TRUE,
  ...
)

# S3 method for clus.torus
plot(
  x,
  panel = 1,
  assignment = "outlier",
  data = NULL,
  ellipse = TRUE,
  type = NULL,
  overlay = FALSE,
  out = FALSE,
  ...
)
data
An n x d matrix of toroidal data on \([0, 2\pi)^d\)
or \([-\pi, \pi)^d\). In plot, the default is NULL; clus.torus has no default for data.
split.id
An n-dimensional vector consisting of the values 1 (estimation) and 2 (evaluation). Default is NULL.
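The two inputs above can be prepared in base R before calling the function. The following is a minimal sketch on simulated angles: the helper wrap_to_2pi and the even random split are illustrative assumptions, not part of the package.

```r
# Hypothetical raw angles in radians, some of them outside [0, 2*pi)
set.seed(1)
raw <- matrix(runif(40, min = -2 * pi, max = 4 * pi), ncol = 2)

# Wrap every angle into [0, 2*pi); in R, %% with a positive divisor
# returns a non-negative remainder
wrap_to_2pi <- function(theta) theta %% (2 * pi)
data <- wrap_to_2pi(raw)

# Split labels: 1 = estimation, 2 = evaluation.
# An even random split is an assumption made for this sketch.
n <- nrow(data)
split.id <- sample(rep(c(1, 2), length.out = n))
```

The resulting data and split.id have the shapes described above: an n x 2 matrix of angles and a vector of length n containing only 1s and 2s.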
model
A string. One of "mixture" and "kmeans", which
determines the model or estimation method. If "mixture", the model is based
on the von Mises mixture, fitted with an EM algorithm; it supports conformity
scores based on the von Mises mixture and its variants. If "kmeans", the model
is also based on the von Mises mixture, but the parameters are estimated with
the elliptical k-means algorithm; it supports only the log-max-mixture based
conformity score. If the dimension of the data space is greater than 2, only
"kmeans" is supported. Default is model = "kmeans".
mixturefitmethod
A string. One of "circular", "axis-aligned", and
"general", which determines the constraint on the EM fitting. Default is
"axis-aligned". This argument is used only when model = "mixture".
kmeansfitmethod
A string. One of "general", "ellipsoids",
"heterogeneous-circular", or "homogeneous-circular". If "general", the
elliptical k-means algorithm with no constraint is used. If "ellipsoids",
only one iteration of the algorithm is used. If "heterogeneous-circular",
the same as above, but with the constraint that the ellipsoids must be spheres.
If "homogeneous-circular", the same as above, but the radii of the spheres are
identical. Default is "general". This argument is used only when model = "kmeans".
J
The number of components for mixture model fitting. If J is a vector,
hyperparam.torus is used to choose the optimal J. If J == NULL,
J = 4:30 is used.
level
A scalar in \([0,1]\). The level of the conformal prediction set
used for clustering. If level == NULL, hyperparam.alpha is
used to choose the optimal level.
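To make the role of the level concrete, here is a generic sketch of an inductive (split) conformal prediction set in base R. It is not the package's implementation: it uses one-dimensional Gaussian data and a Gaussian density as a stand-in conformity score, and it assumes that the level acts as a miscoverage rate alpha, so that the set aims for coverage of at least 1 - alpha.

```r
set.seed(42)
x <- rnorm(200)                          # hypothetical 1-D data, not toroidal
split.id <- rep(c(1, 2), each = 100)     # 1 = estimation, 2 = evaluation
x1 <- x[split.id == 1]
x2 <- x[split.id == 2]

# Fit the model on the estimation part only
mu <- mean(x1)
s <- sd(x1)

# Conformity score: fitted density (a stand-in for the von Mises based scores)
score <- function(z) dnorm(z, mean = mu, sd = s)

# Threshold: the floor((n2 + 1) * alpha)-th smallest evaluation score
alpha <- 0.1
n2 <- length(x2)
i.alpha <- floor((n2 + 1) * alpha)
threshold <- sort(score(x2))[i.alpha]

# A point belongs to the prediction set if its score reaches the threshold
in_set <- function(z) score(z) >= threshold
coverage <- mean(in_set(x2))
```

A smaller alpha lowers the threshold and enlarges the prediction set, while a larger alpha shrinks it.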
option
A string. One of "elbow", "risk", "AIC", or "BIC", which determines the
criterion for model selection. "risk" is based on the negative log-likelihood, "AIC"
on the Akaike Information Criterion, and "BIC" on the Bayesian Information Criterion.
"elbow" is based on minimizing the criterion used in Jung et al. (2021).
This argument is used only if J is a vector or NULL.
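The likelihood-based criteria can be compared with a small base R sketch using the textbook definitions of AIC and BIC. The helper info_criteria, the log-likelihoods, and the parameter counts are hypothetical; the package may count parameters or evaluate the risk differently, and "elbow" is not covered here.

```r
# Textbook definitions; "risk" is taken to be the plain negative log-likelihood
info_criteria <- function(loglik, k, n) {
  c(risk = -loglik,
    AIC  = 2 * k - 2 * loglik,
    BIC  = k * log(n) - 2 * loglik)
}

J      <- 2:4                  # candidate numbers of components
loglik <- c(-520, -480, -475)  # hypothetical fitted log-likelihoods
k      <- c(11, 17, 23)        # hypothetical numbers of free parameters
n      <- 200                  # hypothetical sample size

# One column per candidate J, one row per criterion
crit <- mapply(info_criteria, loglik, k, MoreArgs = list(n = n))
colnames(crit) <- paste0("J=", J)

# For each criterion, the J that minimizes it
best <- J[apply(crit, 1, which.min)]
names(best) <- rownames(crit)
```

With these made-up numbers the unpenalized risk favours the largest model, whereas AIC and BIC both select the middle one, which shows why the choice of option can change the selected J.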
verbose
A logical value indicating whether to display
additional details about what the algorithm is doing or
how many loops have been completed. Default is TRUE.
...
Further arguments that will be passed to icp.torus and hyperparam.torus.
x
A clus.torus object.
panel
One of 1, 2, or 3, which determines the type of plot. If panel = 1,
x$cluster.obj will be plotted; if panel = 2, x$icp.torus will be plotted;
if panel = 3, x$hyperparam.select will be plotted. Default is panel = 1.
assignment
A string. One of "outlier", "log.density", "posterior", or "mahalanobis". Default is "outlier".
ellipse
A logical value that determines whether to plot ellipse-intersections. Default is TRUE.
Only available for panel = 2.
type
A string. One of "mix", "max", or "e". This argument is only available if the icp.torus
object is fitted with model = "mixture". Default is NULL. If type != NULL, the argument
ellipse automatically becomes FALSE. If "mix", the plot is based on the von Mises mixture.
If "max", the plot is based on the von Mises max-mixture. If "e", the plot is based on the ellipse approximation.
overlay
A logical value that determines whether to plot ellipse-intersections on clustering plots. Default is FALSE.
Only available for panel = 1.
out
An option for returning the ggplot object. Default is FALSE.
clus.torus returns a clus.torus object, which consists of the following three S3 objects:
cluster.obj
A cluster.obj object: clustering assignment results for
several methods. For details, see cluster.assign.torus.
icp.torus
An icp.torus object, containing model parameters and
conformity scores. For details, see icp.torus.
hyperparam.select
A hyperparam.torus object (if J = NULL or a sequence of numbers, and
level = NULL or a sequence of numbers), a hyperparam.J object (if level
is a scalar), or a hyperparam.alpha object (if J is a scalar). It
contains information on the optimally chosen model (number of components J) and level (alpha)
based on the prespecified criterion. For details, see hyperparam.torus,
hyperparam.J, and hyperparam.alpha.
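The sketch below shows one way the returned components and the plot method's panels might be inspected. It assumes that clus.torus and the toydata2 data set come from an installed package named ClusTorus (the package name is not stated on this page), and it uses a narrower J range than the example at the end of this page only to keep the run short.

```r
# Runs only if the (assumed) ClusTorus package is installed
if (requireNamespace("ClusTorus", quietly = TRUE)) {
  library(ClusTorus)

  data <- toydata2[, 1:2]
  fit <- clus.torus(data = data, model = "kmeans", kmeansfitmethod = "general",
                    J = 5:10, option = "risk", verbose = FALSE)

  # The three components described above
  cluster_part <- fit$cluster.obj
  icp_part     <- fit$icp.torus
  select_part  <- fit$hyperparam.select

  # One plot per panel, using arguments documented for the plot method
  plot(fit, panel = 1, assignment = "outlier")
  plot(fit, panel = 2, ellipse = TRUE)
  plot(fit, panel = 3)
}
```

Because J is a vector and level is left as NULL here, the description above suggests that fit$hyperparam.select would be a hyperparam.torus object.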
clus.torus is a user-friendly all-in-one function that carries out the following
procedures automatically:
1. compute conformity scores for the given model and fitting method,
2. choose the optimal model and level based on a prespecified criterion, and
3. form clusters based on the chosen model and level.
Procedures 1 to 3 can also be carried out separately with icp.torus, hyperparam.torus,
hyperparam.J, hyperparam.alpha, and cluster.assign.torus.
For more detail on each procedure, see
icp.torus, hyperparam.J, hyperparam.alpha, hyperparam.torus, and cluster.assign.torus.
Jung, S., Park, K., & Kim, B. (2021). Clustering on the torus by conformal prediction. The Annals of Applied Statistics, 15(4), 1583-1603.
Mardia, K. V., Kent, J. T., Zhang, Z., Taylor, C. C., & Hamelryck, T. (2012). Mixtures of concentrated multivariate sine distributions with applications to bioinformatics. Journal of Applied Statistics, 39(11), 2475-2492.
Shin, J., Rinaldo, A., & Wasserman, L. (2019). Predictive clustering. arXiv preprint arXiv:1903.08125.
icp.torus, hyperparam.torus, hyperparam.J, hyperparam.alpha, cluster.assign.torus
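# NOT RUN {
# Use the first two columns of the toy data set as the toroidal data
data <- toydata2[, 1:2]

# Fit elliptical k-means models for J = 5, ..., 30 and select J by the "risk" criterion
clus.torus(data = data, model = "kmeans", kmeansfitmethod = "general", J = 5:30, option = "risk")
# }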