analogue (version 0.10-0)

prcurve: Fits a principal curve to m-dimensional data

Description

A principal curve is a non-parametric generalisation of the principal component and is a curve that passes through the middle of a cloud of data points for a certain definition of 'middle'.

Usage

prcurve(X, method = c("ca", "pca", "random", "user"), start = NULL,
        smoother = smoothSpline, complexity, vary = FALSE,
        maxComp, finalCV = FALSE, axis = 1, rank = FALSE,
        stretch = 2, maxit = 10, trace = FALSE, thresh = 0.001,
        plotit = FALSE, ...)

initCurve(X, method = c("ca", "pca", "random", "user"), rank = FALSE, axis = 1, start)

smoothSpline(lambda, x, choose = TRUE, complexity, ..., penalty = 1, cv = FALSE, keep.data = FALSE, control.spar = list(low = 0))

Arguments

X
a matrix-like object containing the variables to which the principal curve is to be fitted.
method
character; method to use when initialising the principal curve. "ca" fits a correspondence analysis to X and uses the axis-th axis scores as the initial curve. "pca" does the same but fits a
start
numeric vector specifying the initial curve when method = "user". Must be of length nrow(X).
smoother
function; the choice of smoother used to fit the principal curve. Currently, the only option is smoothSpline which is a wrapper to smooth.spline.
complexity
numeric; the complexity of the fitted smooth functions.

The function passed as argument smoother should arrange for this argument to be passed on to the relevant aspect of the underlying smoother. In the case of smoothSplin

vary
logical; should the complexity of the smoother fitted to each variable in X be allowed to vary (i.e. to allow a more or less smooth function for a particular variable)? If FALSE the median complexity over all m
maxComp
numeric; the upper limit on the allowed complexity.
finalCV
logical; should a final fit of the smooth function be performed using cross validation?
axis
numeric; the ordination axis to use as the initial curve.
rank
logical; should rank position on the gradient be used? Not yet implemented.
stretch
numeric; a factor by which the curve can be extrapolated when points are projected. Default is 2 (times the last segment length).
maxit
numeric; the maximum number of iterations.
trace
logical; should progress on the iterations be printed to the console?
thresh
numeric; convergence threshold on shortest distances to the curve. The algorithm is considered to have converged when the latest iteration produces a total residual distance to the curve that is within thresh of the value obtained
plotit
logical; should the fitting process be plotted? If TRUE, then the fitted principal curve and observations in X are plotted in principal component space.
...
arguments passed on to lower functions. In the case of prcurve, these additional arguments are passed solely on to the function smoother.

In smoothSpline, ... is passed on to the underlying function.

lambda
the current projection function; the position that each sample projects to on the current principal curve. This is the predictor variable or covariate in the smooth function.
x
numeric vector; a column from X used as the response variable in the smooth function. The principal curve algorithm fits a separate scatterplot smoother (or similar smoother) to each variable in X in turn as the respo
choose
logical; should the underlying smoother function be allowed to choose the degree of smooth complexity for each variable in X?
penalty, cv, keep.data, control.spar
arguments to smooth.spline.
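
The per-variable smoothing step described under lambda, x and vary can be illustrated with base R alone. The following is a rough sketch of the idea, not the package's implementation: the toy data, the object names (df_each, df_med, fitted_curve) and the use of GCV to pick each variable's degrees of freedom are assumptions made for illustration. It calls smooth.spline directly rather than the smoothSpline wrapper.

```r
## Toy data (hypothetical): three variables changing smoothly along a gradient
set.seed(1)
lambda <- sort(runif(50))   # assumed positions of samples along the current curve
X <- cbind(a = sin(2 * pi * lambda),
           b = lambda^2,
           c = exp(-5 * lambda)) + rnorm(150, sd = 0.1)

## Let smooth.spline() choose the complexity (df) of each variable by GCV;
## lambda is the covariate and each column of X is the response in turn
fits <- lapply(seq_len(ncol(X)), function(j) {
  smooth.spline(lambda, X[, j], penalty = 1.4)
})
df_each <- vapply(fits, function(f) f$df, numeric(1))

## Idea behind vary = FALSE: refit every variable with the median complexity
df_med <- median(df_each)
fitted_curve <- sapply(seq_len(ncol(X)), function(j) {
  predict(smooth.spline(lambda, X[, j], df = df_med), lambda)$y
})
```

Here fitted_curve holds one smoothed column per variable, evaluated at the positions in lambda. Keeping df_each instead of df_med in the second step would correspond to letting the complexity vary between variables.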

Value

An object of class "prcurve" with the following components:

  • s: a matrix corresponding to X, giving their projections onto the curve.
  • tag: an index, such that s[tag, ] is smooth.
  • lambda: for each point, its arc-length from the beginning of the curve.
  • dist: the sum-of-squared distances from the points to their projections.
  • converged: logical; did the algorithm converge?
  • iter: numeric; the number of iterations performed.
  • totalDist: numeric; total sum-of-squared distances.
  • complexity: numeric vector; the complexity of the smoother fitted to each variable in X.
  • call: the matched call.
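
The components listed above can be inspected directly on a fitted object. The sketch below assumes the analogue package is installed and reuses the abernethy data and the call from the Examples section; it only accesses component names documented here.

```r
library(analogue)
data(abernethy)

## Drop the Depth and Age columns, as in the Examples
abernethy2 <- abernethy[, -(37:38)]
aber.pc <- prcurve(abernethy2, method = "ca", vary = FALSE, penalty = 1.4)

aber.pc$converged        # did the algorithm converge?
aber.pc$iter             # number of iterations performed
aber.pc$totalDist        # total sum-of-squared distances
aber.pc$complexity       # complexity of the smoother for each variable
head(aber.pc$lambda)     # arc-length position of the first few samples

## Fitted curve locations, ordered along the curve via tag
head(aber.pc$s[aber.pc$tag, 1:3])
```

Indexing s by tag orders the rows along the curve, which is convenient when drawing the fitted curve as a line.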

Examples

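## Load the packages used below: Stratiplot(), chooseTaxa() and prcurve()
## are in analogue; rda() and cca() are in vegan
library(analogue)
library(vegan)

data(abernethy)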

## Plot the most common taxa
Stratiplot(Age ~ . - Depth, data =
           chooseTaxa(abernethy, max.abun = 15, n.occ = 10),
           type = c("g","poly"), sort = "wa")

## Remove the Depth and Age variables
abernethy2 <- abernethy[, -(37:38)]

## Fit PCA and CA
aber.pca <- rda(abernethy2)
aber.ca <- cca(abernethy2)

## Fit the principal curve using the median complexity over
## all species
aber.pc <- prcurve(abernethy2, method = "ca", trace = TRUE,
                   vary = FALSE, penalty = 1.4)

## Fit the principal curve using varying complexity of smoothers
## for each species
aber.pc2 <- prcurve(abernethy2, method = "ca", trace = TRUE,
                    vary = TRUE, penalty = 1.4)
