Learn R Programming

clue (version 0.3-27)

l1_fit_ultrametric: Least Absolute Deviation Fit of Ultrametrics to Dissimilarities

Description

Find the ultrametric with minimal absolute distance (Manhattan dissimilarity) to a given dissimilarity object.

Usage

l1_fit_ultrametric(x, method = c("SUMT", "IRIP"), weights = 1,
                   control = list())

Arguments

x
a dissimilarity object inheriting from or coercible to class "dist".
method
a character string indicating the fitting method to be employed. Must be one of "SUMT" (default) or "IRIP", or a unique abbreviation thereof.
weights
a numeric vector or matrix with non-negative weights for obtaining a weighted least squares fit. If a matrix, its numbers of rows and columns must be the same as the number of objects in x, and the lower diagonal part is used.
control
a list of control parameters. See Details.

Value

  • An object of class "cl_ultrametric" containing the fitted ultrametric distances.

Details

The problem to be solved is minimizing $$L(u) = \sum_{i,j} w_{ij} |x_{ij} - u_{ij}|$$ over all $u$ satisfying the ultrametric constraints (i.e., for all $i, j, k$, $u_{ij} \le \max(u_{ik}, u_{jk})$). This problem is known to be NP hard (Krivanek and Moravek, 1986).

We provide two heuristics for solving this problem.

Method "SUMT" implements a SUMT (Sequential Unconstrained Minimization Technique, see sumt) approach using the sign function for the gradients of the absolute value function.

Availabe control parameters are method, control, eps, q, and verbose, which have the same roles as for sumt, and the following.

[object Object],[object Object]

Method "IRIP" implements a variant of the Iteratively Reweighted Iterative Projection approach of Smith (2001), which attempts to solve the $L_1$ problem via a sequence of weighted $L_2$ problems, determining $u(t+1)$ by minimizing the criterion function $$\sum_{i,j} w_{ij} (x_{ij} - u_{ij})^2 / \max(|x_{ij} - u_{ij}(t)|, m)$$ with $m$ a small non-zero value to avoid zero divisors. We use the SUMT method of ls_fit_ultrametric for solving the weighted least squares problems.

Available control parameters are as follows. [object Object],[object Object],[object Object],[object Object],[object Object],[object Object] One may need to adjust the default control parameters to achieve convergence.

It should be noted that all methods are heuristics which can not be guaranteed to find the global minimum.

References

M. Krivanek and J. Moravek (1986). NP-hard problems in hierarchical tree clustering. Acta Informatica, 23, 311--323.

T. J. Smith (2001). Constructing ultrametric and additive trees based on the $L_1$ norm. Journal of Classification, 18, 185--207.

See Also

cl_consensus for computing least absolute deviation (Manhattan) consensus hierarchies; ls_fit_ultrametric.