geomorph (version 4.0.7)

gpagen: Generalized Procrustes analysis of points, curves, and surfaces

Description

A general function to perform Procrustes analysis of two- or three-dimensional landmark data that can include both fixed landmarks and sliding semilandmarks

Usage

gpagen(
  A,
  curves = NULL,
  surfaces = NULL,
  rot.pts = NULL,
  PrinAxes = TRUE,
  max.iter = NULL,
  tol = 1e-04,
  ProcD = FALSE,
  approxBE = FALSE,
  sen = 0.5,
  Proj = TRUE,
  verbose = FALSE,
  print.progress = TRUE,
  Parallel = FALSE
)

Value

An object of class gpagen returns a list with the following components:

coords

A (p x k x n) array of Procrustes shape variables, where p is the number of landmark points, k is the number of landmark dimensions (2 or 3), and n is the number of specimens. The third dimension of this array contains names for each specimen if specified in the original input array.

Csize

A vector of centroid sizes for each specimen, containing the names for each specimen if specified in the original input array.

iter

The number of GPA iterations until convergence was found (or GPA halted).

points.VCV

Variance-covariance matrix among Procrustes shape variables.

points.var

Variances of landmark points.

consensus

The consensus (mean) configuration.

procD

Procrustes distance matrix for all specimens (see details). Note that for large data sets, R might return a memory allocation error, in which case the error will be suppressed and this component will be NULL. For such cases, users can augment memory allocation and create distances with the dist function, independent from gpagen, using the coords or data output.

p

Number of landmarks.

k

Number of landmark dimensions.

nsliders

Number of semilandmarks along curves.

nsurf

Number of semilandmarks as surface points.

data

Data frame with an n x (pk) matrix of Procrustes shape variables and centroid size.

Q

Final convergence criterion value.

slide.method

Method used to slide semilandmarks.

call

The match call.

Arguments

A

Either an object of class geomorphShapes or a 3D array (p x k x n) containing landmark coordinates for a set of specimens. If A is a geomorphShapes object, the curves argument is not needed.

curves

An optional matrix defining which landmarks should be treated as semilandmarks on boundary curves, and which landmarks specify the tangent directions for their sliding. This matrix is generated with readland.shapes following digitizing of curves in StereoMorph, or may be generated using the function define.sliders.

surfaces

An optional vector defining which landmarks should be treated as semilandmarks on surfaces

rot.pts

An optional vector defining which landmarks are used for rotation. If NULL or "all", all fixed landmarks and semilandmarks will be used. If "fixed", only fixed landmarks will be used. If "sliders", only sliding semilandmarks will be used. Any other vector of values will direct which landmarks or semilandmarks will be used for rotation. Illogical vectors (like choosing points that do not exist or only choosing one point) will default to using all landmarks and semilandmarks.

PrinAxes

A logical value indicating whether or not to align the shape data by principal axes

max.iter

The maximum number of GPA iterations to perform before superimposition is halted. The final number of iterations will be larger than max.iter, if curves or surface semilandmarks are involved, as max.iter will pertain only to iterations with sliding of landmarks.

tol

A numeric value that can be adjusted for evaluation of the convergence criterion. If the criterion is lower than the tolerance, iterations will be stopped.

ProcD

A logical value indicating whether or not Procrustes distance should be used as the criterion for optimizing the positions of semilandmarks (if not, bending energy is used)

approxBE

A logical value for whether bending energy should be estimated via approximate thin-plate spline (TPS) mapping for sliding semilandmarks. Approximate TPS mapping is much faster and allows for more iterations, which might return more reliable results than few iterations with full TPS. If using full TPS, one should probably change max.iter to be few for large data sets.

sen

A numeric value between 0.1 and 1 to adjust the sensitivity of true bending energy to use in an approximated thin plate mapping of bending energy. This value is the proportion of landmarks (excluding semilandmarks) to seek for estimating bending energy. Sliding semilandmarks are always included in the final set of landmarks used to make estimates, so the actual sensitivity is higher than the chosen value.

Proj

A logical value indicating whether or not the Procrustes aligned specimens should be projected into tangent space

verbose

A logical value for whether to include statistics that take a long time to calculate with large data sets, including a Procrustes distance matrix among specimens, a variance-covariance matrix among Procrustes coordinates (shape variables), and variances of landmark points. Formatting a data frame for coordinates and centroid size is also embedded within this option. This argument should be FALSE unless explicitly needed. For large data sets, it will slow down the analysis, extensively.

print.progress

A logical value to indicate whether a progress bar should be printed to the screen.

Parallel

For sliding semi-landmarks only, either a logical value to indicate whether parallel processing should be used or a numeric value to indicate the number of cores to use in parallel processing via the parallel library. If TRUE, this argument invokes forking of all processor cores, except one. If FALSE, only one core is used. A numeric value directs the number of cores to use, but one core will always be spared. Parallel processing is probably only valuable with large data sets.

Author

Dean Adams and Michael Collyer

Details

The function performs a Generalized Procrustes Analysis (GPA) on two-dimensional or three-dimensional landmark coordinates. The analysis can be performed on fixed landmark points, semilandmarks on curves, semilandmarks on surfaces, or any combination. If data are provided in the form of a 3D array, all landmarks and semilandmarks are contained in this object. If this is the only component provided, the function will treat all points as if they were fixed landmarks. To designate some points as semilandmarks, one uses the "curves =" or "surfaces =" options (or both). To include semilandmarks on curves, a matrix defining which landmarks are to be treated as semilandmarks is provided using the "curves =" option. This matrix contains three columns that specify the semilandmarks and two neighboring landmarks which are used to specify the tangent direction for sliding. The matrix may be generated using the function define.sliders). Likewise, to include semilandmarks on surfaces, one must specify a vector listing which landmarks are to be treated as surface semilandmarks using the "surfaces=" option. The "ProcD = FALSE" option (the default) will slide the semilandmarks based on minimizing bending energy, while "ProcD = TRUE" will slide the semilandmarks along their tangent directions using the Procrustes distance criterion. The Procrustes aligned specimens may be projected into tangent space using the "Proj = TRUE" option. The function also outputs a matrix of pairwise Procrustes Distances, which correspond to Euclidean distances between specimens in tangent space if "Proj=TRUE", or to the geodesic distances in shape space if "Proj=FALSE". NOTE: Large datasets may exceed the memory limitations of R.

Generalized Procrustes Analysis (GPA: Gower 1975, Rohlf and Slice 1990) is the primary means by which shape variables are obtained from landmark data (for a general overview of geometric morphometrics see Bookstein 1991, Rohlf and Marcus 1993, Adams et al. 2004, Zelditch et al. 2012, Mitteroecker and Gunz 2009, Adams et al. 2013). GPA translates all specimens to the origin, scales them to unit-centroid size, and optimally rotates them (using a least-squares criterion) until the coordinates of corresponding points align as closely as possible. The resulting aligned Procrustes coordinates represent the shape of each specimen, and are found in a curved space related to Kendall's shape space (Kendall 1984). Typically, these are projected into a linear tangent space yielding Kendall's tangent space coordinates (i.e., Procrustes shape variables), which are used for subsequent multivariate analyses (Dryden and Mardia 1993, Rohlf 1999). Additionally, any semilandmarks on curves and surfaces are slid along their tangent directions or tangent planes during the superimposition (see Bookstein 1997; Gunz et al. 2005). Presently, two implementations are possible: 1) the locations of semilandmarks can be optimized by minimizing the bending energy between the reference and target specimen (Bookstein 1997), or by minimizing the Procrustes distance between the two (Rohlf 2010). Note that specimens are NOT automatically reflected to improve the GPA-alignment.

The generic functions, print, summary, and plot all work with gpagen. The generic function, plot, calls plotAllSpecimens.

Notes for geomorph 4.0

Starting with geomorph version 4.0, it is possible to use approximated thin-plate spline (TPS) mapping to estimate bending energy (when bending energy is used for sliding semi-landmarks). Approximated TPS has some computational benefit for large data sets (more GPA iterations are possible in a shorter time), but one should make sure first that results are reasonable. Approximated TPS uses a subset of points (semilandmarks) to estimate bending energy locally, where points are free to slide. Some subsets might be misrepresentative of the global (full configuration) bending energy, and strange results are possible. A comparison between TPS and approximated TPS outcomes could be tried with a few specimens to verify consistency of results before using approximated TPS on a large data set.

Choosing to use approximated TPS with only a few sliders is dangerous, as the subset of points might be too small to be reliable. Approximated TPS will work best for data sets with a sufficient number of surface landmarks. If a data set is nearly 100 % semilandmarks, TPS and approximated TPS should converge, but no time savings is likely.

Notes for geomorph 3.0

Compared to older versions of geomorph, users might notice subtle differences in Procrustes shape variables when using semilandmarks (curves or surfaces). This difference is a result of using recursive updates of the consensus configuration with the sliding algorithms (minimized bending energy or Procrustes distances). (Previous versions used a single consensus through the sliding algorithms.) Shape differences using the recursive updates of the consensus configuration should be highly correlated with shape differences using a single consensus during the sliding algorithm, but rotational "flutter" can be expected. This should have no qualitative effect on inferential analyses using Procrustes residuals.

References

Adams, D. C., F. J. Rohlf, and D. E. Slice. 2004. Geometric morphometrics: ten years of progress following the 'revolution'. It. J. Zool. 71:5-16.

Adams, D. C., F. J. Rohlf, and D. E. Slice. 2013. A field comes of age: Geometric morphometrics in the 21st century. Hystrix.24:7-14.

Bookstein, F. L. 1991. Morphometric tools for landmark data: Geometry and Biology. Cambridge Univ. Press, New York.

Bookstein, F. L. 1997. Landmark methods for forms without landmarks: morphometrics of group differences in outline shape. 1:225-243.

Dryden, I. L., and K. V. Mardia. 1993. Multivariate shape analysis. Sankhya 55:460-480.

Gower, J. C. 1975. Generalized Procrustes analysis. Psychometrika 40:33-51.

Gunz, P., P. Mitteroecker, and F. L. Bookstein. 2005. semilandmarks in three dimensions. Pp. 73-98 in D. E. Slice, ed. Modern morphometrics in physical anthropology. Klewer Academic/Plenum, New York.

Kendall, D. G. 1984. Shape-manifolds, Procrustean metrics and complex projective spaces. Bulletin of the London Mathematical Society 16:81-121.

Mitteroecker, P., and P. Gunz. 2009. Advances in geometric morphometrics. Evol. Biol. 36:235-247.

Rohlf, F. J., and D. E. Slice. 1990. Extensions of the Procrustes method for the optimal superimposition of landmarks. Syst. Zool. 39:40-59.

Rohlf, F. J., and L. F. Marcus. 1993. A revolution in morphometrics. Trends Ecol. Evol. 8:129-132.

Rohlf, F. J. 1999. Shape statistics: Procrustes superimpositions and tangent spaces. Journal of Classification 16:197-223.

Rohlf, F. J. 2010. tpsRelw: Relative warps analysis. Version 1.49. Department of Ecology and Evolution, State University of New York at Stony Brook, Stony Brook, NY.

Zelditch, M. L., D. L. Swiderski, H. D. Sheets, and W. L. Fink. 2012. Geometric morphometrics for biologists: a primer. 2nd edition. Elsevier/Academic Press, Amsterdam.

Examples

Run this code
if (FALSE) {
# Example 1: fixed points only
data(plethodon) 
Y.gpa <- gpagen(plethodon$land, PrinAxes = FALSE)
summary(Y.gpa)
plot(Y.gpa)

# Example 2: points and semilandmarks on curves
data(hummingbirds)

###Slider matrix
hummingbirds$curvepts

# Using bending energy for sliding
Y.gpa <- gpagen(hummingbirds$land, curves = hummingbirds$curvepts,
ProcD = FALSE)   
summary(Y.gpa)
plot(Y.gpa)


# Using Procrustes Distance for sliding
Y.gpa <- gpagen(hummingbirds$land, curves = hummingbirds$curvepts,
ProcD = TRUE)   
summary(Y.gpa)
plot(Y.gpa)


# Example 3: points, curves and surfaces
data(scallops)

# Using Procrustes Distance for sliding
Y.gpa <- gpagen(A = scallops$coorddata, curves = scallops$curvslide, 
surfaces = scallops$surfslide)
# NOTE can summarize as: summary(Y.gpa)
# NOTE can plot as: plot(Y.gpa) 
}

Run the code above in your browser using DataLab