Given a phytosociological table (m_bin, rows corresponding to
taxa and columns corresponding to relevés) this function searches for
a k-partition (k defined by the user) optimizing TDV, i.e., searches,
using a Hill-climbing algorithm, for patterns of differential taxa by
rearranging the relevés into k groups.
The optimization can start from a random partition (p_ini = "random"), or
from a given partition (p_ini, defined by the user or produced by any
clustering method, or even a manual classification of the relevés).
In the description given below, a 1-neighbour of a given partition is
another partition that can be obtained by simply changing one relevé to a
different group. Equivalently a n-neighbour of a given partition is
another partition obtained ascribing n relevés to different groups.
This function implements a Hill-climbing algorithm, where a TDV improvement
is searched in each iteration, screening all 1-neighbours, until the given
number of maximum iterations (maxit) is reached. If maxit is not
reached but no TDV improvement is possible among all the 1-neighbours of
the currently best partition, the search is halted and the current
partition is tagged as a local maximum and outputted.
As the screening of all 1-neighbours might be computationally heavy,
specially while analysing big tables, optionally, a Stochastic
Hill-climbing search can be performed as a first step
(stoch_first = TRUE). This consists in searching for TDV improvements, by
randomly selecting, in each iteration, one n-neighbour (n defined by
the user in the parameter stoch_neigh_size), and accepting that
n-neighbour partition immediately if it improves TDV. This is repeated
until a given number of iterations (stoch_maxit) is reached. Specially
while starting from random partitions, Stochastic Hill-climbing is intended
to increase TDV without the computational burden of the full neighbourhood
screening, which can be done afterwards, in a second step.
The Hill-climbing or the combination of Stochastic Hill-climbing +
Hill-climbing, can be run multiple times by the function (defined in
n_runs), which consists in a Random-restart Hill-climbing, where n_sol
best solutions are kept and returned.
As the Hill-climbing algorithm converges easily to local maxima, several
runs of the function (i.e., multiple random starts) are advised.
Trimming your table by a 'constancy' range or using the result of other
cluster methodologies as input, might help finding interesting partitions.
However, after trimming the table by a narrow 'constancy' range, getting a
random initial partition with TDV greater than zero might be hard; on such
cases using a initial partition from partition_tdv_grasp() or
partition_tdv_grdtp(), or even the result of other clustering
strategies, as an input partition might be useful.