
Main function for clustering functional data according to one or several of seven algorithms.
funcit(data, k, methods=c("fitfclust","distclust", "iterSubspace",
"funclust", "funHDDC", "fscm", "waveclust"), seed=NULL, regTime=NULL,
clusters=NULL, funcyCtrl=NULL, fpcCtrl=NULL, parallel=FALSE,
save.data=TRUE, ...)
data: Data in format "Format1" or "Format2" (see formatFuncy).
k: Number of clusters.
methods: One or several of the following clustering algorithms. For a detailed description of the methods, please see the references.
  fitfclust: Model-based cluster algorithm, based on a functional mixed mixture model. Allows irregular measurements, eigenbasis possible.
  distclust: Cluster algorithm based on a distance measure. Allows irregular measurements, eigenbasis possible.
  iterSubspace: Model-based cluster algorithm, based on a subspace projection. Allows irregular measurements, eigenbasis possible, dimension between clusters can vary.
  funclust: Model-based cluster algorithm, based on a functional mixed mixture model.
  funHDDC: Model-based cluster algorithm, based on a functional mixed mixture model. Dimension between clusters can vary.
  fscm: Model-based cluster algorithm, based on a functional mixed mixture model. Curves can depend on location. A matrix location is then an optional input parameter (see Details).
  waveclust: Model-based cluster algorithm, based on a functional mixed mixture model. The wavelet basis is the only possible basis.
seed: Seed for initial clustering. See funcyCtrl.
regTime: If data is in "Format2", optional vector representing the time points (see formatFuncy). If regTime=NULL and format="Format2", equidistant time points from 1 to number of curves are used.
clusters: Optional vector of true cluster labels.
funcyCtrl: A control object of class funcyCtrl. If a model-based clustering algorithm is used, further parameters can be specified by using the extended class fpcCtrlMbc.
parallel: If TRUE, package parallel is used for parallel computing.
save.data: Save a copy of the data in the return object? Must be set to TRUE in order to use the plot function plot.
...: Additional optional model-specific parameters. Works only if exactly one method is called in methods. The parameters are the following:
Rank of the covariance matrix dimBase.
Adds a ridge term to the least squares fit; helps if only a few observations per curve were registered.
One of "hclust" or "pam", specifying how the distance matrix is processed.
FALSE, if curve affiliation is tested again by projecting the curve onto the current subspace created without the current curve (leave-one-out-curve-estimation).
The number of small-EM runs used to determine the initialization of the main EM-like algorithm.
The maximum number of iterations for each small-EM.
The chosen model among "AkjBkQkDk", "AkjBQkDk", "AkBkQkDk", "AkBQkDk", "ABkQkDk", "ABQkDk". See (Bouveyron & Jacques, 2011) for details.
A two-dimensional matrix of the curve locations (coordinates).
Number of neighbors each curve depends on.
"R" or "C". If C is installed, C is a lot faster than R.
TRUE, if the number of iterations and sigma, theta and f are to be printed.
One of "group", "scale.location", "group.scale.location" or "constant".
One of "rEM" or "SEM" for random or stochastic EM.
TRUE, if the log-likelihood is to be plotted.
Returns an object of class funcyOutList.
funcit is the core function to execute one or more methods to cluster functional data. Functional data can be measured on a regular or on an irregular grid. For regular datasets, all curves are measured at the same time points; for irregular datasets, the number and/or location of time points can differ between curves (see formatFuncy for the different formats). Only the algorithms "fitfclust", "distclust" and "iterSubspace" are applicable to irregular datasets.
All methods are based on the projection of the curves onto a basis defined in funcyCtrl and on building mixed effects models of the basis coefficients.
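The distinction between regular and irregular grids can be sketched in base R without the package. The long layout used here (one row per observation, with columns curveID, value and time) and the helper names make_long, is_regular and applicable_methods are assumptions made for illustration only; see formatFuncy for the formats the package actually accepts.

```r
# Hypothetical long layout: one row per observation (curve id, value, time).
make_long <- function(times_by_curve) {
  do.call(rbind, lapply(seq_along(times_by_curve), function(i) {
    t <- times_by_curve[[i]]
    data.frame(curveID = i, value = sin(t), time = t)
  }))
}

# TRUE when every curve is observed at exactly the same time points.
is_regular <- function(long) {
  grids <- lapply(split(long$time, long$curveID), sort)
  all(vapply(grids,
             function(g) isTRUE(all.equal(g, grids[[1]])),
             logical(1)))
}

# Method names as listed in the usage signature.
all_methods  <- c("fitfclust", "distclust", "iterSubspace",
                  "funclust", "funHDDC", "fscm", "waveclust")
# Only these three are applicable to irregular datasets (see text above).
irregular_ok <- c("fitfclust", "distclust", "iterSubspace")

# Hypothetical helper: restrict the method list for irregular data.
applicable_methods <- function(long) {
  if (is_regular(long)) all_methods else irregular_ok
}

regular   <- make_long(list(1:8, 1:8, 1:8))
irregular <- make_long(list(1:8, c(1, 3, 4, 7), c(2, 5, 6, 8)))

print(applicable_methods(regular))
print(applicable_methods(irregular))
```

A check of this kind, run before calling funcit, avoids requesting a method that cannot handle the sampling grid of the data.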
Christina Yassouridis and Dominik Ernst and Friedrich Leisch. Generalization, Combination and Extension of Functional Clustering Algorithms: The R Package funcy. Journal of Statistical Software. 85 (9). 1--25. 2018
Gareth James and Catherine A. Sugar. Clustering for Sparsely Sampled Functional Data. Journal of the American Statistical Association. 98 (462). 397--408. 2003
Jie Peng and Hans-Georg Mueller. Distance-based clustering of sparsely observed stochastic processes, with applications to online auctions. The Annals of Applied Statistics. 2 (3). 1056--1077. 2008
Jeng-Min Chiou and Pai-Ling Li. Functional clustering and identifying substructures of longitudinal data. Journal of the Royal Statistical Society: Series B. 69 (4). 679--699. 2007
Madison Giacofci and Sophie Lambert-Lacroix and Guillemette Marot and Franck Picard. Wavelet-based clustering for mixed-effects functional models in high dimension. Biometrics. 69 (1). 31--40. 2013
Nicoleta Serban and Huijing Jiang. Clustering Random Curves Under Spatial Interdependence With Application to Service Accessibility. Technometrics. 54 (2). 108--119. 2012
Julien Jacques and Cristian Preda. Funclust: a curves clustering method using functional random variables density approximation. Neurocomputing. 112. 164--171. 2013
Charles Bouveyron and Julien Jacques. Model-based clustering of time series in group-specific functional subspaces. Advances in Data Analysis and Classification. 5 (4). 281--300. 2011
# NOT RUN {
## Cluster the data with methods for regular sets
## Sample a regular dataset
set.seed(2804)
ds <- sampleFuncy(obsNr=50, k=4, timeNr=8, reg=TRUE)
## Cluster the functions with the first three methods
res <- funcit(data=Data(ds), clusters=Cluster(ds),
              methods=c(1,2,3), seed=2404,
              k=4)
summary(res)
Cluster(res)
## Additional method-specific parameters for method fitfclust
res <- funcit(data=Data(ds), clusters=Cluster(ds), methods="fitfclust", seed=2405,
              k=4, p=5, pert=0)
## Cluster the data with methods for irregular sets
## Sample an irregular dataset
set.seed(2804)
ds <- sampleFuncy(obsNr=50, k=4, timeNrMin=4, timeNrMax=8,
                  reg=FALSE)
data <- Data(ds)
clusters <- Cluster(ds)
res <- funcit(data=data, clusters=clusters,
              methods=c("fitfclust","distclust", "iterSubspace"), seed=2406,
              k=4, parallel=TRUE)
summary(res)
Cluster(res)
plot(res)
## Two real-life examples
# }
# NOT RUN {
data("genes")
data <- genes$data
clusters <- genes$clusters
## Cluster the functions with all methods except the fourth one
res <- funcit(data=data, clusters=clusters,
              methods=c(1:7)[-4], seed=2404,
              k=4)
summary(res)
Cluster(res)
# }
# NOT RUN {
data("electricity")
res <- funcit(data=electricity, methods=c("fitfclust","distclust",
              "waveclust"), seed=2406, k=5, parallel=TRUE)
plot(res, legendPlace="topleft")
# }
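Some of the examples above pass methods as numbers rather than names. The following base R sketch shows which names such selections would correspond to, assuming that numeric values index the default methods vector in the order given in the usage signature. That mapping is an assumption here and is not confirmed by this page.

```r
# Default methods vector, in the order of the usage signature.
all_methods <- c("fitfclust", "distclust", "iterSubspace",
                 "funclust", "funHDDC", "fscm", "waveclust")

# methods=c(1,2,3): the three methods that also handle irregular data.
first_three <- all_methods[c(1, 2, 3)]

# methods=c(1:7)[-4]: every method except the fourth one.
without_fourth <- all_methods[c(1:7)[-4]]

print(first_three)
print(without_fourth)
```

Passing the names directly, as in the irregular-data example, makes the selection explicit and independent of the ordering.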