analogue (version 0.14-0)

prcurve: Fits a principal curve to m-dimensional data

Description

A principal curve is a non-parametric generalisation of the principal component and is a curve that passes through the middle of a cloud of data points for a certain definition of 'middle'.
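
To make the idea concrete, the following is a minimal base-R sketch of a principal-curve style iteration on simulated data. It is an illustration only and not the algorithm implemented in prcurve: the simulated parabola, the fixed df = 5 passed to smooth.spline, the fixed 10 iterations and the nearest-grid-point projection (rather than an exact projection onto curve segments) are all simplifying assumptions made for this example.

```r
## Illustrative sketch only: a simplified principal-curve iteration in base R
set.seed(42)
n <- 200
u <- runif(n, -1, 1)
## A noisy parabola: the cloud of points in m = 2 dimensions
X <- cbind(u, u^2) + matrix(rnorm(2 * n, sd = 0.05), ncol = 2)

## Start from the first principal component scores
pca <- prcomp(X)
lambda <- pca$x[, 1]
## Residual sum of squares of the straight-line (PCA) fit, for comparison
pca_dist <- sum(pca$x[, 2]^2)

for (iter in 1:10) {
  ## 1. Smooth each variable against the current positions along the curve
  grid <- seq(min(lambda), max(lambda), length.out = 500)
  fitted_curve <- sapply(seq_len(ncol(X)), function(j) {
    predict(smooth.spline(lambda, X[, j], df = 5), grid)$y
  })
  ## 2. Arc length along the densely evaluated curve
  arc <- c(0, cumsum(sqrt(rowSums(diff(fitted_curve)^2))))
  ## 3. Project each observation onto its nearest evaluated point on the curve
  d2 <- sapply(seq_len(nrow(fitted_curve)), function(k) {
    rowSums(sweep(X, 2, fitted_curve[k, ])^2)
  })
  nearest <- apply(d2, 1, which.min)
  lambda <- arc[nearest]
  curve_dist <- sum(d2[cbind(seq_len(n), nearest)])
}

## Sum of squared distances: straight line versus curve
c(pca = pca_dist, curve = curve_dist)
```

Because the simulated data follow a curved path, the summed squared distances to the fitted curve should be smaller than those to the first principal component axis.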

Usage

prcurve(X, method = c("ca", "pca", "random", "user"), start = NULL,
        smoother = smoothSpline, complexity, vary = FALSE,
        maxComp, finalCV = FALSE, axis = 1, rank = FALSE,
        stretch = 2, maxit = 10, trace = FALSE, thresh = 0.001,
        plotit = FALSE, ...)

initCurve(X, method = c("ca", "pca", "random", "user"), rank = FALSE, axis = 1, start)

Arguments

X
a matrix-like object containing the variables to which the principal curve is to be fitted.
method
character; method to use when initialising the principal curve. "ca" fits a correspondence analysis to X and uses the axis-th axis scores as the initial curve. "pca" does the same but fits a
start
numeric vector specifying the initial curve when method = "user". Must be of length nrow(X).
smoother
function; the choice of smoother used to fit the principal curve. Currently, the only options are smoothSpline, which is a wrapper to smooth.spline, and smoothGAM
complexity
numeric; the complexity of the fitted smooth functions. The function passed as argument smoother should arrange for this argument to be passed on to the relevant aspect of the underlying smoother. In the case of smoothS
vary
logical; should the complexity of the smoother fitted to each variable in X be allowed to vary (i.e. to allow a more or less smooth function for a particular variable)? If FALSE the median complexity over all m
maxComp
numeric; the upper limit on the allowed complexity.
finalCV
logical; should a final fit of the smooth function be performed using cross validation?
axis
numeric; the ordination axis to use as the initial curve.
rank
logical; should rank position on the gradient be used? Not yet implemented.
stretch
numeric; a factor by which the curve can be extrapolated when points are projected. Default is 2 (times the last segment length).
maxit
numeric; the maximum number of iterations.
trace
logical; should progress on the iterations be printed to the console?
thresh
numeric; convergence threshold on shortest distances to the curve. The algorithm is considered to have converged when the latest iteration produces a total residual distance to the curve that is within thresh of the value obtained
plotit
logical; should the fitting process be plotted? If TRUE, then the fitted principal curve and observations in X are plotted in principal component space.
...
additional arguments, passed on solely to the function smoother.
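
As a hedged illustration of the method = "user" and start arguments described above, the sketch below supplies a start vector of length nrow(X). Using the row order of the Abernethy data as the initial curve is purely an assumption made for this example (it presumes the samples are stored in a meaningful sequence); any numeric vector of the right length could be passed instead. The code is guarded so that it only runs when the analogue package is installed.

```r
## Sketch: initialise the principal curve from a user-supplied vector
if (requireNamespace("analogue", quietly = TRUE)) {
  library(analogue)
  data(abernethy)

  ## Remove the Depth and Age variables, as in the Examples
  abernethy2 <- abernethy[, -(37:38)]

  ## Assumed starting configuration: the row order of the samples
  ## (illustrative choice only; must be of length nrow(X))
  start_vec <- as.numeric(seq_len(nrow(abernethy2)))
  stopifnot(length(start_vec) == nrow(abernethy2))

  ## penalty is passed on to the smoother via ...
  user.pc <- prcurve(abernethy2, method = "user", start = start_vec,
                     vary = FALSE, penalty = 1.4)
}
```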

Value

An object of class "prcurve" with the following components:

  • s: a matrix corresponding to X, giving their projections onto the curve.
  • tag: an index, such that s[tag, ] is smooth.
  • lambda: for each point, its arc-length from the beginning of the curve.
  • dist: the sum-of-squared distances from the points to their projections.
  • converged: logical; did the algorithm converge?
  • iter: numeric; the number of iterations performed.
  • totalDist: numeric; total sum-of-squared distances.
  • complexity: numeric vector; the complexity of the smoother fitted to each variable in X.
  • call: the matched call.
  • ordination: an object of class "rda", the result of a call to rda. This is a principal components analysis of the input data X.
  • data: a copy of the data used to fit the principal curve.
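
The components listed above can be inspected directly on a fitted object. The following is a small sketch, assuming the analogue package is installed and reusing the Abernethy Forest fit from the Examples; the object name curve_pts is introduced here for illustration only.

```r
## Sketch: inspect the components of a fitted "prcurve" object
if (requireNamespace("analogue", quietly = TRUE)) {
  library(analogue)
  data(abernethy)
  abernethy2 <- abernethy[, -(37:38)]

  aber.pc <- prcurve(abernethy2, method = "ca", vary = FALSE, penalty = 1.4)

  aber.pc$converged      # did the algorithm converge?
  aber.pc$iter           # number of iterations performed
  aber.pc$totalDist      # total sum-of-squared distances
  head(aber.pc$lambda)   # arc-length of the first few samples along the curve
  aber.pc$complexity     # smoother complexity for each variable

  ## Projections ordered along the curve, using the tag index
  curve_pts <- aber.pc$s[aber.pc$tag, ]
}
```

Ordering the rows of s by tag gives the projected points in sequence along the curve, which is convenient when drawing the curve as a line.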

See Also

smoothGAM and smoothSpline for the wrappers fitting smooth functions to each variable.

Examples

## Load the analogue package and the Abernethy Forest data set
library(analogue)
data(abernethy)

## Remove the Depth and Age variables
abernethy2 <- abernethy[, -(37:38)]

## Fit the principal curve using the median complexity over
## all species
aber.pc <- prcurve(abernethy2, method = "ca", trace = TRUE,
                   vary = FALSE, penalty = 1.4)

## Extract fitted values
fit <- fitted(aber.pc) ## locations on curve
abun <- fitted(aber.pc, type = "smooths") ## fitted response

## Fit the principal curve using varying complexity of smoothers
## for each species
aber.pc2 <- prcurve(abernethy2, method = "ca", trace = TRUE,
                    vary = TRUE, penalty = 1.4)

## Predict new locations
take <- abernethy2[1:10, ]
pred <- predict(aber.pc2, take)

## Fit principal curve using a GAM - currently slow ~10secs
aber.pc3 <- prcurve(abernethy2, method = "ca", trace = TRUE,
                    vary = TRUE, smoother = smoothGAM, bs = "cr")