Constructs a class cover catch digraph from points or interpoint distance matrices.
cccd(x = NULL, y = NULL, dxx = NULL, dyx = NULL, method = NULL,
k = NA, algorithm = 'cover_tree')
cccd.rw(x=NULL,y=NULL,dxx=NULL,dyx=NULL,method=NULL,m=1,d=2)
cccd.classifier(x,y,dom.method='greedy',proportion=1,...)
cccd.classify(data, C,method=NULL)
cccd.classifier.rw(x,y,m=1,d=2)
cccd.multiclass.classifier(data, classes, dom.method='greedy',proportion=1,...)
cccd.multiclass.classify(data,C,method=NULL)
# S3 method for cccd
plot(x, ..., plot.circles = FALSE, dominate.only = FALSE,
D = NULL, vertex.size = 2, vertex.label = NA,
vertex.color = "SkyBlue2", dom.color = "Blue",
ypch = 20, ycex = 1.5, ycol = 2,
use.circle.radii = FALSE, balls = FALSE,
ball.color = gray(0.8), square = FALSE, xlim, ylim)
# S3 method for cccdClassifier
plot(x, ..., xcol=1,ycol=2,xpch=20,ypch=xpch,
balls=FALSE,add=FALSE)
the target class and non-target class points. Either x,y
or dxx,dyx must be provided. In the case of plot
, x is
an object of class cccd.
interpoint distances (x against x and y against x). If these are not provided they are computed using x and y.
the method used for the distance.
See dist
.
the method used for the domination set
computation, and the proportion of points required to dominate.
See dominate
.
If given, get.knn
is used from FNN to approximate
the class cover catch graph. Each x covers no more than the k
nearest neighbors to it.
This will be much faster and use less memory for large data sets, but
is an approximation unless k
is sufficiently large.
See get.knn
.
slope of the null hypothesis curve
data to be classified
dimension of the data
class labels of the data
cccd object
logical. Plot the circles around the points if TRUE.
logical. Only plot the digraph induced by the dominating set.
a dominating set. Only used if dominate.only is TRUE. If
dominate.only is TRUE and D is NULL, then dominate
is called.
parameters controling
the plotting of the vertices. dom.color
is the color
of the vertices in the dominating set.
if balls
=TRUE, the cover is plotted
as filled balls, with
ball.color
controling their color. In the cass of
cccdClassifier
, balls
can be "x" or "y" indicating
that only one of the balls should be plotted.
parameters for plotting the non-target points.
parameters for plotting the first class points.
logical. Should the classifier plot be added to an existing plot?
logical. Ensure that the circles fit within the plot.
logical. Make the plot square.
if present, these control the plotting region.
arguments passed to cccd
or plot
.
an object of class igraph. In addition, it contains the attributes:
a vector of radii.
the y vectors.
the x vectors.
vectors of radii.
the ball centers.
The class cover catch digraph is a graph with vertices defined by the
points of x
and edges defined according to the balls
\(B(x,d(x,Y))\). There is an edge between vertices
\(x_1,x_2\) if \(x_2\in B(x_1,d(x_1,Y))\). If dyx
is not
given and the method is 'euclidean', then get.knnx
is used to
find the nearest y
to each x. If k
is given, only
the k
nearest neighbors to each point are candidates for
covering. Thus the cccd will be approximate, but the computation will
(generally) be faster. Since get.knn
uses Euclidean distance,
these choices will only be valid for this distance metric.
Since the graph will tend to be larger than
otherwise, the dominating set computation will be slower, so one
should trade-off speed of calculation, approximation, and the
proportion
option to the dominating set (which can make that
calculation faster at the cost of returning a subset of the dominating
set).
D.J. Marchette, "Class Cover Catch Digraphs", Wiley Interdisciplinary Reviews: Computational Statistics, 2, 171-177, 2010.
D.J. Marchette, Random Graphs for Statistical Pattern Recognition, John Wiley & Sons, 2004.
C.E. Priebe, D.J. Marchette, J. DeVinney and D. Socolinsky, "Classification Using Class Cover Catch Digraphs", Journal of Classification, 20, 3-23, 2003.
# NOT RUN {
set.seed(456330)
z <- matrix(runif(1000),ncol=2)
ind <- which(z[,1]<.5 & z[,2]<.5)
x <- z[ind,]
y <- z[-ind,]
g <- cccd(x,y)
C <- cccd.classifier(x,y)
z2 <- matrix(runif(1000),ncol=2)
ind <- which(z2[,1]<.5 & z2[,2]<.5)
cls <- rep(0,nrow(z2))
cls[ind] <- 1
out <- cccd.classify(z2,C)
sum(out != cls)/nrow(z2)
# }
# NOT RUN {
plot(g,plot.circles=TRUE,dominate.only=TRUE)
points(z2,col=2*(1-cls)+1,pch=20)
# }
Run the code above in your browser using DataLab