Rc(x,...)
## S3 method for class 'lpc':
Rc(x,...)
## S3 method for class 'lpc.spline':
Rc(x,...)
## S3 method for class 'ms':
Rc(x,...)
base.Rc(data, closest.coords, type="curve")
Rc computes the coverage coefficient $R_C$, a quantity which
estimates the goodness-of-fit of a fitted principal object. This
quantity can be interpreted similarly to the coefficient of determination in
regression analysis: values close to 1 indicate a good fit, while values
close to 0 indicate a 'bad' fit (corresponding to linear PCA).
For objects of type lpc, lpc.spline, and ms, S3 methods are available
which use the generic function Rc. Each method, in turn, calls the base
function base.Rc, which can also be used manually if the fitted object is
of another class.
In principle, the function base.Rc can be used for assessing
goodness-of-fit of any principal object, provided that
the coordinates (closest.coords) of the projected data are
available. For instance, for HS principal curves fitted via
princurve, this information is contained in component $s,
and for a k-means object, say fitk, this information can be
obtained via fitk$centers[fitk$cluster,]. Set type="points" in
the latter case.
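As a minimal sketch of the k-means case, using only base R: the simulated data x and the choice of two centres are illustrative assumptions and are not taken from the package examples. Indexing the centres by the cluster labels yields one row of closest.coords per observation, so the result has the same dimensions as the data.

```r
set.seed(1)
# Simulated two-cluster data (illustrative assumption)
x <- rbind(matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(100, mean = 3, sd = 0.3), ncol = 2))
fitk <- kmeans(x, centers = 2)
# One row per observation: the centre of the cluster it was assigned to
closest.coords <- fitk$centers[fitk$cluster, ]
dim(closest.coords)  # same dimensions as x
# With the package loaded, this could then be passed on as:
# base.Rc(x, closest.coords, type = "points")
```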
The function Rc attempts to compute all missing information, so
the less informative the given object x is, the longer the
computation will take. Note also that Rc looks up the option scaled
in the fitted object, and accounts for the scaling automatically.
Important: If the data were scaled, then do NOT unscale the results by
hand in order to feed the unscaled version into base.Rc; this will give
a wrong result.
In terms of methodology, these functions compute $R_C$ directly through the mean reduction of absolute residual length, rather than through the area above the coverage curve.
These functions currently do not account for observation weights, i.e. $R_C$ is computed through the unweighted mean reduction in absolute residual length (even if weights have been used for the curve fitting).
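The idea of an unweighted mean reduction in residual length can be sketched in base R as follows. The helper rc_sketch is hypothetical and is not part of the package. It assumes that the baseline is the residual length to the first principal component line (in line with the remark that a value of 0 corresponds to linear PCA), and it ignores both scaling and the type="points" case, so base.Rc may differ in its details.

```r
# Hypothetical helper (not part of the package): 1 minus the ratio of mean
# residual lengths, fitted object versus first principal component line.
rc_sketch <- function(data, closest.coords) {
  data <- as.matrix(data)
  closest.coords <- as.matrix(closest.coords)
  # Residual length of each observation to its point on the fitted object
  res.fit <- sqrt(rowSums((data - closest.coords)^2))
  # Baseline (assumption): orthogonal projection onto the first PC line
  centred <- sweep(data, 2, colMeans(data))
  v <- prcomp(data)$rotation[, 1]
  proj <- (centred %*% v) %*% t(v)
  res.pca <- sqrt(rowSums((centred - proj)^2))
  # Unweighted means, as described above
  1 - mean(res.fit) / mean(res.pca)
}

# Illustrative data: noisy observations around a parabola
set.seed(1)
tt <- seq(-1, 1, length.out = 200)
curve.pts <- cbind(tt, tt^2)
x <- curve.pts + matrix(rnorm(400, sd = 0.05), ncol = 2)
# The generating curve points serve as a stand-in for closest.coords
rc_sketch(x, curve.pts)
```

In this sketch a perfect fit (closest.coords equal to the data) gives exactly 1, while a fit that is no better than the first principal component line gives a value at or below 0.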
Einbeck (2011). Bandwidth selection for nonparametric unsupervised learning techniques -- a unified approach via self-coverage. Journal of Pattern Recognition Research 6, 175-192.
lpc.spline, ms, coverage.
data(calspeedflow)
lpc1 <- lpc.spline(lpc(calspeedflow[,3:4]), project=TRUE)
Rc(lpc1)
# is the same as:
base.Rc(lpc1$lpcobject$data, lpc1$closest.coords)
ms1 <- ms(calspeedflow[,3:4],plotms=0)
Rc(ms1)
# is the same as:
base.Rc(ms1$data, ms1$cluster.center[ms1$closest.label,], type="points")