clue (version 0.3-67)

ls_fit_addtree: Least Squares Fit of Additive Tree Distances to Dissimilarities

Description

Find the additive tree distance or centroid distance that minimizes the least squares distance (Euclidean dissimilarity) to a given dissimilarity object.

Usage

ls_fit_addtree(x, method = c("SUMT", "IP", "IR"), weights = 1,
               control = list())
ls_fit_centroid(x)

Value

An object of class "cl_addtree" containing the optimal additive tree distances.

Arguments

x

a dissimilarity object inheriting from class "dist".

method

a character string indicating the fitting method to be employed. Must be one of "SUMT" (default), "IP", or "IR", or a unique abbreviation thereof.

weights

a numeric vector or matrix with non-negative weights for obtaining a weighted least squares fit. If a matrix, its numbers of rows and columns must be the same as the number of objects in x, and the lower diagonal part is used. Otherwise, it is recycled to the number of elements in x.

control

a list of control parameters. See Details.
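The handling of the weights argument described above can be illustrated in base R. This is only a sketch of the documented behavior, not the package's internal code; the helper name expand_weights is hypothetical.

```r
# Hypothetical helper: expand `weights` to one weight per element of a
# "dist" object, following the rules stated for the `weights` argument.
expand_weights <- function(weights, x) {
  stopifnot(inherits(x, "dist"), all(weights >= 0))
  n_obj <- attr(x, "Size")            # number of objects in x
  if (is.matrix(weights)) {
    # A matrix must be n_obj x n_obj; only its lower triangle is used.
    stopifnot(nrow(weights) == n_obj, ncol(weights) == n_obj)
    weights[lower.tri(weights)]       # column-major, same order as "dist"
  } else {
    # Anything else is recycled to the number of elements in x.
    rep_len(weights, length(x))
  }
}

x <- dist(matrix(1:8, nrow = 4))      # 4 objects, 6 pairwise dissimilarities
expand_weights(1, x)                  # six ones
expand_weights(matrix(1:16, 4), x)    # lower triangle: 2 3 4 7 8 12
```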

Details

See as.cl_addtree for details on additive tree distances and centroid distances.
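As a reminder, additive tree distances are commonly characterized by the four-point condition: for any four objects i, j, k, l, the two largest of the sums d_ij + d_kl, d_ik + d_jl and d_il + d_jk are equal. The following base R sketch checks this condition numerically; the function name is_additive and the tolerance tol are illustrative assumptions and not part of clue.

```r
# Hypothetical helper: check the four-point condition on a dissimilarity.
is_additive <- function(d, tol = 1e-8) {
  m <- as.matrix(d)
  n <- nrow(m)
  if (n < 4) return(TRUE)             # nothing to check for fewer than 4 objects
  quads <- combn(n, 4)                # all sets of four objects, one per column
  for (q in seq_len(ncol(quads))) {
    i <- quads[1, q]; j <- quads[2, q]; k <- quads[3, q]; l <- quads[4, q]
    s <- sort(c(m[i, j] + m[k, l],
                m[i, k] + m[j, l],
                m[i, l] + m[j, k]))
    # The two largest sums must coincide (up to tol).
    if (s[3] - s[2] > tol) return(FALSE)
  }
  TRUE
}

# Corners of the unit square with Euclidean distances are not additive.
square <- dist(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1)))
is_additive(square)                   # FALSE

# Distances of the form g_i + g_j (centroid distances) are additive.
g <- c(1, 2, 3, 4)
is_additive(as.dist(outer(g, g, "+")))  # TRUE
```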

With \(L(d) = \sum w_{ij} (x_{ij} - d_{ij})^2\), where \(x_{ij}\) are the given dissimilarities and \(w_{ij}\) the weights, the problem to be solved by ls_fit_addtree is minimizing \(L\) over all additive tree distances \(d\). This problem is known to be NP-hard.
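The criterion \(L(d)\) is straightforward to evaluate for any candidate distance. A minimal base R sketch, assuming that x and d are "dist" objects (or plain vectors) over the same objects in the same order; the name ls_loss is hypothetical:

```r
# Hypothetical helper: weighted least squares criterion L(d).
ls_loss <- function(x, d, weights = 1) {
  x <- as.vector(x)                   # drop "dist" attributes
  d <- as.vector(d)
  stopifnot(length(x) == length(d))
  w <- rep_len(weights, length(x))    # recycle weights to one per pair
  sum(w * (x - d)^2)
}

ls_loss(c(1, 2, 3), c(1, 1, 1))                         # 0 + 1 + 4 = 5
ls_loss(c(1, 2, 3), c(1, 1, 1), weights = c(1, 2, 0.5)) # 0 + 2 + 2 = 4
```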

We provide three heuristics for solving this problem.

Method "SUMT" implements the SUMT (Sequential Unconstrained Minimization Technique; Fiacco and McCormick, 1968) approach of De Soete (1983). Incomplete dissimilarities are currently not supported.

Methods "IP" and "IR" implement the Iterative Projection and Iterative Reduction approaches of Hubert and Arabie (1995) and Roux (1988), respectively. Non-identical weights and incomplete dissimilarities are currently not supported.

See ls_fit_ultrametric for details on these methods and available control parameters.

Note that all methods are heuristics and cannot be guaranteed to find the global minimum. Standard practice is to use the best solution found in “sufficiently many” replications of the base algorithm.
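One way to follow this advice is to repeat the fit several times and keep the solution with the smallest criterion value. The sketch below defines a small generic helper (best_of_replications is a hypothetical name, not part of clue) and then applies it to ls_fit_addtree. The clue part assumes that repeated "SUMT" runs use different random starting values and that as.dist() can be applied to the fitted object; both are assumptions that should be checked against the installed package version.

```r
# Hypothetical helper: run a fitting function n_rep times, keep the best fit.
best_of_replications <- function(fit_once, loss, n_rep = 10) {
  best <- NULL
  best_loss <- Inf
  for (r in seq_len(n_rep)) {
    fit <- fit_once()                 # one run of the base algorithm
    value <- loss(fit)                # criterion value of this run
    if (value < best_loss) {
      best <- fit
      best_loss <- value
    }
  }
  list(fit = best, loss = best_loss)
}

# Usage with clue, skipped when the package is not installed.
if (requireNamespace("clue", quietly = TRUE)) {
  set.seed(1)
  x <- dist(matrix(rnorm(20), nrow = 10))   # 10 objects in 2 dimensions
  res <- best_of_replications(
    fit_once = function() clue::ls_fit_addtree(x, method = "SUMT"),
    # Assumption: as.dist() works on the returned "cl_addtree" object.
    loss = function(fit) sum((as.vector(x) - as.vector(as.dist(fit)))^2),
    n_rep = 5
  )
  res$loss
}
```

The control parameters documented in ls_fit_ultrametric may offer a built-in way to request several runs; the explicit loop above only makes the selection rule visible.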

ls_fit_centroid finds the centroid distance \(d\) minimizing \(L(d)\) (currently, only for the case of identical weights). This optimization problem has a closed-form solution.
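To illustrate why a closed form exists, write a centroid distance as \(d_{ij} = g_i + g_j\). For identical weights, setting the derivative of \(L\) with respect to each \(g_i\) to zero gives \(s_i = (n - 2) g_i + G\), where \(s_i = \sum_{j \ne i} x_{ij}\), \(G = \sum_i g_i\) and \(n\) is the number of objects. Summing over \(i\) yields \(G = \sum_i s_i / (2 (n - 1))\). The base R sketch below implements these equations under the assumptions of complete dissimilarities and at least three objects. It is derived from the normal equations only, does not enforce non-negative \(g_i\), and is not claimed to be the code used by ls_fit_centroid; the name fit_centroid_sketch is hypothetical.

```r
# Hypothetical helper: unweighted least squares fit of d_ij = g_i + g_j.
fit_centroid_sketch <- function(x) {
  m <- as.matrix(x)
  n <- nrow(m)
  stopifnot(n >= 3)
  s <- rowSums(m)                     # s_i: sum of x_ij over j != i (diagonal is 0)
  G <- sum(s) / (2 * (n - 1))         # sum of all g_i
  g <- (s - G) / (n - 2)              # solves s_i = (n - 2) g_i + G
  list(g = g, d = as.dist(outer(g, g, "+")))
}

# An exact centroid distance is recovered without error.
g0 <- c(1, 2, 3, 4)
x <- as.dist(outer(g0, g0, "+"))
fit_centroid_sketch(x)$g              # 1 2 3 4
```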

References

Fiacco and McCormick (1968); Hubert and Arabie (1995); Roux (1988); De Soete (1983).