cl_median(x, method = NULL, weights = 1, control = list())
cl_ensemble).
NULL (default value). If a character string, its lower-cased version is matched against the lx
if necessary.
cl_dissimilarity) over all soft partitions with $k$
classes, where $w_b$ is the case weight given to element $x_b$
of the ensemble. Available methods are as follows.
By default, method "DWH" is used.
If all elements of the ensemble are hierarchies, the built-in method
(named "cophenetic") for computing medians is based on
minimizing $L(u) = \sum w_b d(x_b, u)$ over all ultrametrics,
where $d$ is Euclidean dissimilarity. This is equivalent to
finding the best least squares ultrametric approximation of the
weighted average $d = \sum w_b u_b$ of the ultrametrics $u_b$
of the hierarchies $x_b$, which is attempted by calling
ls_fit_ultrametric on $d$ with appropriate control parameters.
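The claimed equivalence can be checked in one line, under two assumptions that are not spelled out in the text above: the criterion is read with squared Euclidean dissimilarities, and the weights are normalized so that $\sum_b w_b = 1$. Writing $\bar{u} = \sum_b w_b u_b$ for the weighted average (called $d$ above):

```latex
\begin{aligned}
\sum_b w_b \,\lVert u_b - u \rVert^2
  &= \sum_b w_b \,\lVert u_b - \bar{u} \rVert^2
   + 2 \Big\langle \sum_b w_b (u_b - \bar{u}),\; \bar{u} - u \Big\rangle
   + \lVert \bar{u} - u \rVert^2 \\
  &= \sum_b w_b \,\lVert u_b - \bar{u} \rVert^2 + \lVert u - \bar{u} \rVert^2 .
\end{aligned}
```

The cross term vanishes because $\sum_b w_b (u_b - \bar{u}) = 0$. The first term does not depend on $u$, so minimizing over ultrametrics $u$ amounts to a least squares ultrametric fit to $\bar{u}$.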
If a user-defined method for computing medians is to be employed, it must be a function taking the cluster ensemble, the case weights, and a list of control parameters as its arguments.
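As a rough illustration of that calling convention, the sketch below defines a hypothetical method, `medoid_method`, that takes the ensemble, the case weights, and a control list, and returns the ensemble element with the smallest weighted total dissimilarity to all elements. The argument names `x`, `weights`, and `control`, the `control$dissimilarity` entry, and the toy `comembership_diss` helper are assumptions made for this example. A plain list of class-label vectors stands in for a real cluster ensemble so that the code runs in base R.

```r
## Hypothetical helper: fraction of object pairs on which two hard
## partitions (vectors of class labels) disagree about co-membership.
comembership_diss <- function(a, b) {
    same_a <- outer(a, a, "==")
    same_b <- outer(b, b, "==")
    pairs <- upper.tri(same_a)
    mean(same_a[pairs] != same_b[pairs])
}

## Hypothetical user-defined method with the three arguments described
## above: the ensemble, the case weights, and a list of control parameters.
medoid_method <- function(x, weights, control) {
    diss <- control$dissimilarity
    if (is.null(diss)) diss <- comembership_diss
    weights <- rep_len(weights, length(x))
    ## Weighted total dissimilarity of each candidate to all elements.
    loss <- vapply(x, function(candidate) {
        sum(weights * vapply(x, diss, numeric(1), candidate))
    }, numeric(1))
    x[[which.min(loss)]]
}

## Toy ensemble: three hard partitions of five objects.
toy <- list(c(1, 1, 2, 2, 2), c(1, 1, 1, 2, 2), c(1, 1, 2, 2, 2))
result <- medoid_method(toy, weights = 1, control = list())
print(result)
```

Restricting the search to the ensemble's own elements keeps the sketch short. A method that searches over all partitions or ultrametrics would follow the same three-argument shape.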
All built-in methods use heuristics for solving hard optimization
problems, and cannot be guaranteed to find a global minimum. Standard
practice is to run a method several times and use the best solution found.
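The idea of keeping the best of several runs can be sketched generically. The helper `best_of_replicates` below and its arguments `fit`, `criterion`, and `n_rep` are hypothetical names, not part of the package. Base R's `kmeans` is used only as a stand-in for a heuristic with random starts, so that the example runs without further dependencies.

```r
## Hypothetical helper: run a randomized fitting function several times
## and keep the run with the smallest criterion value.
best_of_replicates <- function(fit, criterion, n_rep = 10) {
    best <- NULL
    best_value <- Inf
    for (i in seq_len(n_rep)) {
        candidate <- fit()
        value <- criterion(candidate)
        if (value < best_value) {
            best <- candidate
            best_value <- value
        }
    }
    list(solution = best, value = best_value)
}

## Stand-in heuristic with random starts: k-means from base R.
set.seed(1)
pts <- as.matrix(iris[, 1:4])
best <- best_of_replicates(
    fit = function() kmeans(pts, centers = 3, nstart = 1),
    criterion = function(km) km$tot.withinss,
    n_rep = 20
)
print(best$value)
```

For the function documented here, `fit` would presumably wrap the call to `cl_median`, and `criterion` would be the total dissimilarity to the ensemble, computed with `sum(cl_dissimilarity(...))` as in the examples.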
A. D. Gordon and M. Vichi (2001). Fuzzy partition models for fitting a set of partitions. Psychometrika, 66, 229--248.
A. D. Gordon (1999). Classification (2nd edition). Boca Raton, FL: Chapman & Hall/CRC.
cl_medoid
## Median partition for the Rosenberg-Kim kinship terms partition
## data based on co-membership dissimilarities.
library("clue")  # package providing cl_median() and the data sets used below
data("Kinship82")
m1 <- cl_median(Kinship82, method = "GV3",
                control = list(k = 3, verbose = TRUE))
## (Note that one should really use several replicates of this.)
## Total co-membership dissimilarity:
sum(cl_dissimilarity(Kinship82, m1, "comem"))
## Compare to the consensus solution given in Gordon & Vichi (2001).
data("Kinship82_Consensus")
m2 <- Kinship82_Consensus[["JMF"]]
sum(cl_dissimilarity(Kinship82, m2, "comem"))
## Seems we get a better solution ...
## How dissimilar are these solutions?
cl_dissimilarity(m1, m2, "comem")
## How "fuzzy" are they?
cl_fuzziness(cl_ensemble(m1, m2))
## Do the "nearest" hard partitions fully agree?
cl_dissimilarity(as.cl_hard_partition(m1),
                 as.cl_hard_partition(m2))
## Hmm ...
## Median partition for the Gordon and Vichi (2001) macroeconomic
## partition data based on Euclidean dissimilarities.
data("Macro")
set.seed(1)
m1 <- cl_median(Macro, method = "GV1",
                control = list(k = 2, verbose = TRUE))
## (Note that one should really use several replicates of this.)
## Total Euclidean dissimilarity:
sum(cl_dissimilarity(Macro, m1))
## Compare to the consensus solution given in Gordon & Vichi (2001).
data("Macro_Consensus")
m2 <- Macro_Consensus[["MF1"]]
sum(cl_dissimilarity(Macro, m2))
## Seems we get a better solution ...
## And in fact, it is qualitatively different:
table(as.cl_hard_partition(m1),
      as.cl_hard_partition(m2))
## Hmm ...