scclust (version 0.2.4)

check_clustering: Check clustering constraints

Description

check_clustering checks whether a clustering satisfies constraints on the size and composition of the clusters.

Usage

check_clustering(
  clustering,
  size_constraint = NULL,
  type_labels = NULL,
  type_constraints = NULL,
  primary_data_points = NULL
)

Value

Returns TRUE if clustering satisfies the constraints, and FALSE if it does not. Throws an error if clustering is an invalid instance of the scclust class.

Arguments

clustering

a scclust object containing a non-empty clustering.

size_constraint

an integer with the required minimum cluster size. If NULL, only the type constraints will be checked.

type_labels

a vector containing the type of each data point. May be NULL when type_constraints is NULL.

type_constraints

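a named integer vector containing type-specific size constraints, where each name is a type and each value is the minimum number of points of that type per cluster. If NULL, only the overall size constraint will be checked.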

primary_data_points

a vector specifying primary data points, either by point indices or with a logical vector of length equal to the number of points. check_clustering checks that all primary data points are assigned to a cluster. NULL indicates that no such check should be done.
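The three kinds of checks described above (minimum cluster size, per-type minimums, and assignment of primary data points) can be illustrated with a short base-R sketch. This is not how scclust implements the function; sketch_check is a hypothetical helper that operates on a plain vector of cluster labels, with NA assumed to mark an unassigned point.

```r
# Illustrative sketch only; NOT the scclust implementation.
# `sketch_check` is a hypothetical helper working on plain label vectors,
# where NA is assumed to mark an unassigned data point.
sketch_check <- function(labels,
                         size_constraint = NULL,
                         type_labels = NULL,
                         type_constraints = NULL,
                         primary_data_points = NULL) {
  ok <- TRUE
  if (!is.null(size_constraint)) {
    # Every cluster must contain at least `size_constraint` points
    ok <- ok && all(table(labels) >= size_constraint)
  }
  if (!is.null(type_constraints)) {
    # Rows are clusters, columns are types
    counts <- table(labels, type_labels)
    for (type in names(type_constraints)) {
      ok <- ok && all(counts[, type] >= type_constraints[[type]])
    }
  }
  if (!is.null(primary_data_points)) {
    # Works with both integer indices and a logical vector
    assigned <- !is.na(labels)
    ok <- ok && all(assigned[primary_data_points])
  }
  ok
}

labels <- c("A", "A", "B", "C", "B", "C", "C", "A", "B", "B")
types <- factor(c("x", "y", "y", "z", "z", "x", "y", "z", "x", "x"))

res_size <- sketch_check(labels, 4)
res_type <- sketch_check(labels, 3, types, c("x" = 1, "z" = 1))

# Mark point 5 as unassigned and require points 1 and 5 to be assigned
labels_na <- labels
labels_na[5] <- NA
res_primary <- sketch_check(labels_na, primary_data_points = c(1, 5))
```

Here res_size is FALSE because clusters "A" and "C" contain only three points, res_type is TRUE, and res_primary is FALSE because point 5 is unassigned.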

See Also

See sc_clustering for details on how to specify the type_labels and type_constraints parameters.

Examples

# Example scclust clustering
my_scclust <- scclust(c("A", "A", "B", "C", "B",
                        "C", "C", "A", "B", "B"))


# Check that each cluster contains at least two data points
check_clustering(my_scclust, 2)
# > TRUE


# Check that each cluster contains at least four data points
check_clustering(my_scclust, 4)
# > FALSE


# Data point types
my_types <- factor(c("x", "y", "y", "z", "z",
                     "x", "y", "z", "x", "x"))


# Check that each cluster contains at least one point of each type
check_clustering(my_scclust,
                 NULL,
                 my_types,
                 c("x" = 1, "y" = 1, "z" = 1))
# > TRUE


# Check that each cluster contains at least one data point of each of the
# types "x" and "z", and at least three points in total
check_clustering(my_scclust,
                 3,
                 my_types,
                 c("x" = 1, "z" = 1))
# > TRUE


# Check that each cluster contains at least five data points of type "y"
check_clustering(my_scclust,
                 NULL,
                 my_types,
                 c("y" = 5))
# > FALSE
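
The examples above do not exercise primary_data_points. A minimal sketch of how it might be used follows. It assumes the scclust constructor accepts an unassigned_labels argument naming the labels that denote unassigned points; the object and variable names are illustrative, and no output is shown because it has not been verified against the package.

```r
library(scclust)

# Clustering in which points 5 and 8 are left unassigned
# (assumes `unassigned_labels` marks the given label as "no cluster")
my_partial <- scclust(c("A", "A", "B", "C", "NONE",
                        "C", "C", "NONE", "B", "B"),
                      unassigned_labels = "NONE")

# Require at least two points per cluster and that points 1, 2 and 3
# are all assigned to some cluster
ok_subset <- check_clustering(my_partial,
                              size_constraint = 2,
                              primary_data_points = c(1, 2, 3))

# Same check, but the primary points now include the unassigned point 5
ok_with_unassigned <- check_clustering(my_partial,
                                       size_constraint = 2,
                                       primary_data_points = c(1, 5))
```

Per the argument description, the second call is the case the primary-point check is meant to catch, since point 5 belongs to no cluster.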
