Identifies the invariant coordinates associated with the highest discriminatory
power. Currently, the only implemented measure is "eta2", Wilks' partial
eta-squared, computed using the heplots::etasq()
function.
discriminatory_crit(object, ...)
# S3 method for ICS
discriminatory_crit(
object,
clusters,
method = "eta2",
nb_select = NULL,
select_only = FALSE,
...
)
# S3 method for default
discriminatory_crit(
object,
clusters,
method = "eta2",
nb_select = NULL,
select_only = FALSE,
gen_kurtosis = NULL,
...
)
If select_only is TRUE, a vector of the names of the invariant
components or variables to select is returned.
If FALSE, an object of class "ICS_crit"
is returned with the following elements:
crit: the name of the criterion "discriminatory".
method: the name of the discriminatory power.
nb_select: the number of components to select.
select: the names of the invariant components or variables to select.
power_combinations: the discriminatory values for each of the considered
combinations of nb_select components.
gen_kurtosis: the vector of generalized kurtosis values, when object is
of class "ICS".
object: a data frame or an object of class "ICS".
...: additional arguments are currently ignored.
clusters: a vector of the same length as the number of observations, indicating the true clusters. It is used to compute the discriminatory power.
method: the name of the discriminatory power measure.
Only "eta2" is implemented.
nb_select: the exact number of components to select.
By default it is set to NULL, i.e., the number
of components to select is the number of clusters minus one.
select_only: boolean. If TRUE, only the vector of names of the selected
invariant components is returned. If FALSE, additional details are returned.
gen_kurtosis: vector of generalized kurtosis values.
Aurore Archimbaud and Anne Ruiz-Gazen
The discriminatory power is evaluated for each combination of nb_select
components taken from the first and/or last components. The combination
achieving the highest discriminatory power is selected.
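One plausible reading of "first and/or last" is that each candidate set takes the first i components together with the last nb_select - i components, for i from 0 to nb_select. The base R sketch below enumerates such index sets under that assumption. The helper first_last_combinations() is hypothetical and is not part of ICSClust; the package may build its candidates differently.

```r
# Hypothetical helper (not part of ICSClust): list the index sets made of
# the first i and the last (nb_select - i) components, for i = 0..nb_select.
first_last_combinations <- function(n_comp, nb_select) {
  stopifnot(nb_select >= 1, nb_select <= n_comp)
  lapply(0:nb_select, function(i) {
    first <- seq_len(i)                                # first i components
    last <- rev(n_comp + 1L - seq_len(nb_select - i))  # last nb_select - i
    c(first, last)
  })
}

# Four components and three clusters: nb_select defaults to 3 - 1 = 2
combos <- first_last_combinations(n_comp = 4, nb_select = 2)
print(combos)
```

Under this reading there are nb_select + 1 candidate sets, so the search stays cheap even when many components are available.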
More specifically, we compute \(\eta^{2} = 1 - \Lambda^{1/s}\), where \(\Lambda\)
denotes Wilks' lambda:
$$
\Lambda = \frac{\det(E)}{\det(T)},
$$
where \(E\) is the within-group sum of squares and cross-products matrix,
\(H\) is the between-group sum of squares and cross-products matrix and
\(T\) is the total sum of squares and cross-products matrix, with
\(T = H + E\), and \(s = \min(p, df_h)\), with \(p\) being the number of
latent roots of \(HE^{-1}\) and \(df_h\) the degrees of freedom of the
hypothesis. See heplots::etasq() for more details.
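To make the formula concrete, here is a minimal base R sketch that computes \(\Lambda\) and \(\eta^{2}\) directly from the sums of squares and cross-products matrices. The helper wilks_eta2() is hypothetical and is not how ICSClust or heplots implement the computation. It assumes a single grouping factor, \(p\) equal to the number of columns, and \(df_h\) equal to the number of groups minus one.

```r
# Hypothetical helper (not part of ICSClust or heplots): Wilks' lambda and
# eta-squared for a numeric matrix X and a grouping vector g.
wilks_eta2 <- function(X, g) {
  X <- as.matrix(X)
  g <- factor(g)
  # Total SSCP matrix T: cross-products of the globally centered data
  T_mat <- crossprod(scale(X, center = TRUE, scale = FALSE))
  # Within-group SSCP matrix E: cross-products centered within each group
  E <- matrix(0, ncol(X), ncol(X))
  for (lev in levels(g)) {
    Xg <- X[g == lev, , drop = FALSE]
    E <- E + crossprod(scale(Xg, center = TRUE, scale = FALSE))
  }
  Lambda <- det(E) / det(T_mat)
  # Assumption: p = number of columns, df_h = number of groups - 1
  s <- min(ncol(X), nlevels(g) - 1)
  c(Lambda = Lambda, eta2 = 1 - Lambda^(1 / s))
}

# Raw iris measurements stand in for invariant coordinates here
res <- wilks_eta2(iris[, -5], iris[, 5])
print(res)
```

Values of eta2 close to 1 indicate that most of the total variation lies between the groups, which is why the combination with the largest value is preferred.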
Alfons, A., Archimbaud, A., Nordhausen, K., & Ruiz-Gazen, A. (2024). Tandem clustering with invariant coordinate selection. Econometrics and Statistics. doi:10.1016/j.ecosta.2024.03.002.
Muller, K. E. and Peterson, B. L. (1984). Practical methods for computing power in testing the Multivariate General Linear Hypothesis. Computational Statistics and Data Analysis, 2, 143-158.
normal_crit(), med_crit(), var_crit(), heplots::etasq().
library(ICS)       # provides ICS()
library(ICSClust)  # provides discriminatory_crit()
X <- iris[, -5]
out <- ICS(X)
discriminatory_crit(out, clusters = iris[, 5], select_only = FALSE)