analogue (version 0.4-0)

roc: ROC curve analysis

Description

Fits Receiver Operator Characteristic (ROC) curves to training set data. Used to determine the critical value of a dissimilarity coefficient that best discriminates between assemblage types in palaeoecological data sets, whilst minimising the false positive error rate (FPF).

Usage

roc(object, groups, ...)

## S3 method for class 'mat'
roc(object, groups, ...)

## S3 method for class 'analog'
roc(object, groups, ...)

Arguments

object
an R object.
groups
numeric; a vector of group memberships, one entry per sample in the training set data.
...
arguments passed to/from other methods.

Value

A list with the following components:

  • TPF: the true positive fraction.
  • FPE: the false positive error.
  • roc.points: the unique dissimilarities at which the ROC curve was evaluated.
  • roc.values: the difference between TPF and FPE at each evaluated point of the ROC curve.
  • optimal: the optimal dissimilarity value, assessed where roc.values is maximal.
  • wilcox: an object of class "htest", the result of a call to wilcox.test. Contains the results of a Wilcoxon Rank Sum and Signed Rank test applied to the within- and between-group dissimilarities.
  • AUC: the area under the ROC curve.
  • n.within: numeric; the number of within-group dissimilarities.
  • n.without: numeric; the number of outside-of-group dissimilarities.
  • groups: numeric; the group membership.
  • dissims: numeric; vector of observed dissimilarities.
  • method: character; the dissimilarity coefficient used. Taken from "object".
  • call: the matched call.
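
To make the relationship between TPF, FPE, roc.values and optimal concrete, here is a minimal base R sketch. It is not the package's implementation. It assumes that TPF and FPE are the fractions of within-group and between-group dissimilarities at or below each cutoff, and that AUC can be read as the proportion of (within, between) pairs in which the within-group value is the smaller one. The input values are made up for illustration.

```r
## Hypothetical within- and between-group dissimilarities (made-up values)
within  <- c(0.10, 0.15, 0.20, 0.30, 0.45)
between <- c(0.25, 0.40, 0.50, 0.60, 0.70, 0.80)

## Candidate cutoffs: the unique observed dissimilarities, sorted
roc.points <- sort(unique(c(within, between)))

## Assumption: a pair counts as "positive" when its dissimilarity is <= cutoff
TPF <- sapply(roc.points, function(cut) mean(within  <= cut))
FPE <- sapply(roc.points, function(cut) mean(between <= cut))

## Difference between the two fractions, and the cutoff that maximises it
roc.values <- TPF - FPE
optimal <- roc.points[which.max(roc.values)]
optimal   # 0.45 for these toy values

## Assumption: AUC as the share of pairs where within < between
## (ties counted as one half)
AUC <- mean(outer(within, between, "<") + 0.5 * outer(within, between, "=="))
AUC       # 0.9 for these toy values
```

With real data, the corresponding quantities are read directly from the fitted object, for example swap.roc$optimal and swap.roc$AUC.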

concept

ROC

Details

A ROC curve is generated from the within-group and between-group dissimilarities.

Within-group dissimilarities are the cells in the lower triangle of the dissimilarity matrix that hold the pairwise dissimilarities between samples in the same group, over all groups.

Between-group dissimilarities are the cells in the lower triangle of the dissimilarity matrix that hold the dissimilarities between samples in a group and all samples not in that group, over all groups.
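
The split described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's own code: the data are random, Euclidean distance stands in for whichever dissimilarity coefficient is actually used, and each sample pair is counted once via the lower triangle.

```r
## Toy data: 6 samples in 2 groups (random values, for illustration only)
set.seed(1)
x <- matrix(runif(6 * 4), nrow = 6)
groups <- c(1, 1, 1, 2, 2, 2)

## Full symmetric dissimilarity matrix (Euclidean here, purely as a stand-in)
d <- as.matrix(dist(x))

## Logical mask for the lower triangle, so each pair is counted once
lower <- lower.tri(d)

## TRUE where the two samples in a pair belong to the same group
same <- outer(groups, groups, "==")

within  <- d[lower & same]    # within-group dissimilarities
between <- d[lower & !same]   # between-group dissimilarities

length(within)    # 6 pairs: choose(3, 2) per group, two groups
length(between)   # 9 pairs: 3 * 3
```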

References

Brown, C.D. and Davis, H.T. (2006) Receiver operating characteristics curves and related decision measures: A tutorial. Chemometrics and Intelligent Laboratory Systems 80, 24–38.

Gavin, D.G., Oswald, W.W., Wahl, E.R. and Williams, J.W. (2003) A statistical approach to evaluating distance metrics and analog assignments for pollen records. Quaternary Research 60, 356–367.

Henderson, A.R. (1993) Assessing test accuracy and its clinical consequences: a primer for receiver operating characteristic curve analysis. Annals of Clinical Biochemistry 30, 834–846.

See Also

mat for fitting of MAT models. bootstrap.mat and mcarlo for alternative methods for selecting critical values of analogue-ness for dissimilarity coefficients.

Examples

## continue the example from join()
example(join)

## fit the MAT model using the squared chord distance measure
swap.mat <- mat(swapdiat, swappH, method = "SQchord")

## fit the ROC curve to the SWAP diatom data using the MAT results
## Generate a grouping for the SWAP lakes
clust <- hclust(as.dist(swap.mat$Dij), method = "ward")
grps <- cutree(clust, 12)

## fit the ROC curve
swap.roc <- roc(swap.mat, groups = grps)
swap.roc

## fit an analogue matching (AM) model using the squared chord distance
## measure - need to keep the training set dissimilarities
swap.ana <- analog(swapdiat, rlgh, method = "SQchord",
                   keep.train = TRUE)

## fit the ROC curve to the SWAP diatom data using the AM results
## Generate a grouping for the SWAP lakes
clust <- hclust(as.dist(swap.ana$train), method = "ward")
grps <- cutree(clust, 12)

## fit the ROC curve
swap.roc2 <- roc(swap.ana, groups = grps)
swap.roc2