clst (version 1.20.0)

findThreshold: findThreshold

Description

Identify a distance threshold predicting whether a pairwise distance represents a comparison between objects in the same class (within-group comparison) or different classes (between-group comparison) given a matrix providing distances between objects and the group membership of each object.

Usage

findThreshold(dmat, groups, distances, method = "mutinfo", prob = 0.5, na.rm = FALSE, keep.dists = TRUE, roundCuts = 2, minCuts = 20, maxCuts = 300, targetCuts = 100, verbose = FALSE, depth = 1, ...)
partition(dmat, groups, include, verbose = FALSE)

Arguments

dmat
Square matrix of pairwise distances.
groups
Object coercible to a factor identifying the group membership of the objects along the rows (equivalently, the columns) of dmat.
include
Vector (numeric or logical) indicating which elements to retain in the output; comparisons involving an excluded element will have a value of NA.
distances
Optional output of partition, provided in place of dmat and groups.
method
The method for calculating the threshold; only 'mutinfo' is currently implemented.
prob
Sets the upper and lower bounds of D as a quantile of the within-class distances and the between-class distances, respectively.
na.rm
If TRUE, excludes NA elements in groups and corresponding rows and columns in dmat. Ignored if distances is provided.
keep.dists
If TRUE, the output will contain the distances element (output of partition).
roundCuts
Number of digits to round cutoff values (see Details)
minCuts
Minimal length of vector of cutoffs (see Details).
maxCuts
Maximal length of vector of cutoffs (see Details)
targetCuts
Length of the vector of cutoffs used if the conditions set by minCuts and maxCuts are not met (see Details).
verbose
Terminal output is produced if TRUE.
depth
Private argument used to track level of recursion.
...
Extra arguments are ignored.

Value

In the case of findThreshold, the output is a list with the elements described below. In the case of partition, the output is the data.frame returned as the element named $distances in the output of findThreshold.
D
The distance threshold (the distance cutoff corresponding to the point of maximal mutual information, PMMI).
pmmi
The mutual information value at the PMMI.
interval
A vector of length 2 indicating the upper and lower bounds over which values for the threshold are evaluated.
breaks
A data.frame with columns x and y providing candidate breakpoints and corresponding mutual information values, respectively.
distances
If keep.dists is TRUE, a data.frame containing pairwise distances identified as within- or between-class comparisons.
method
Character corresponding to input argument method.
params
Additional input parameters.
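
The relationship between breaks, pmmi and D can be illustrated with a small base-R sketch. It is not the clst implementation: the helper mutinfo_sketch, the evenly spaced grid of 100 cutoffs, and the use of bits (log base 2) are assumptions made for illustration, and the package's own choice of candidate cutoffs (roundCuts, minCuts, maxCuts, targetCuts) is not reproduced here.

```r
# Illustrative sketch only; not the code used inside clst::findThreshold.

# Mutual information (in bits) between two discrete vectors of equal length.
mutinfo_sketch <- function(x, y) {
  joint <- table(x, y) / length(x)                 # joint probabilities
  indep <- outer(rowSums(joint), colSums(joint))   # product of the marginals
  nz <- joint > 0                                  # skip empty cells (0 * log 0)
  sum(joint[nz] * log2(joint[nz] / indep[nz]))
}

# Pairwise distances for iris, each labeled as a within- or between-group comparison.
dmat <- as.matrix(dist(iris[, 1:4], method = "euclidean"))
lt <- lower.tri(dmat)                              # each unordered pair once
sp <- as.character(iris$Species)
same <- outer(sp, sp, "==")
vals <- dmat[lt]
comparison <- ifelse(same[lt], "within", "between")

# Hypothetical candidate cutoffs: an even grid over the observed distances.
cuts <- seq(min(vals), max(vals), length.out = 100)

# For each cutoff, score how informative "distance <= cutoff" is about the label.
breaks <- data.frame(
  x = cuts,
  y = sapply(cuts, function(d) mutinfo_sketch(vals <= d, comparison))
)

best <- which.max(breaks$y)
D <- breaks$x[best]      # cutoff at the point of maximal mutual information
pmmi <- breaks$y[best]   # mutual information at that cutoff
```

Under this sketch, a pairwise distance no greater than D would be read as a within-group comparison and a larger one as a between-group comparison.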

Details

findThreshold is used internally in classify, but may also be used to calculate a starting value of D.

partition is used to transform a square (or lower triangular) distance matrix into a data.frame containing a column of distances ($vals) along with a factor ($comparison) defining each distance as a within- or between-group comparison. Columns $row and $col provide indices of corresponding rows and columns of dmat.
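
The shape of the partition output described above can be approximated in base R. This is a minimal sketch, not the package code: the function name partition_sketch is hypothetical, it handles only a full square matrix, and it omits the include and verbose arguments.

```r
# Illustrative sketch only; not the code used inside clst::partition.
partition_sketch <- function(dmat, groups) {
  groups <- factor(groups)
  stopifnot(nrow(dmat) == ncol(dmat), nrow(dmat) == length(groups))

  # Row/column indices of the lower triangle, so each pair appears once.
  idx <- which(lower.tri(dmat), arr.ind = TRUE)
  rows <- unname(idx[, 1])
  cols <- unname(idx[, 2])

  # A comparison is "within" when both objects share a group label.
  same <- as.character(groups[rows]) == as.character(groups[cols])

  data.frame(
    vals = dmat[idx],
    comparison = factor(ifelse(same, "within", "between"),
                        levels = c("within", "between")),
    row = rows,
    col = cols
  )
}

dmat <- as.matrix(dist(iris[, 1:4], method = "euclidean"))
dists <- partition_sketch(dmat, iris$Species)
str(dists)
table(dists$comparison)
```

Taking only the lower triangle avoids counting each pair twice and drops the zero self-distances on the diagonal.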

See Also

plotDistances, plotMutinfo

Examples

library(clst)
data(iris)
dmat <- as.matrix(dist(iris[, 1:4], method = "euclidean"))
groups <- iris$Species
# Use the documented 'method' argument; an unrecognized 'type' would be absorbed by '...'
thresh <- findThreshold(dmat, groups, method = "mutinfo")
str(thresh)