smsets (version 2.0.0)

Penrose.dist: Penrose's distance calculator

Description

Computes Penrose's distance between m multivariate populations or samples, when information is available on the means and variances.

Usage

Penrose.dist(x, group)

Value

Returns an object of class "Penrose.dist", a list containing the following components:

name

A character string describing the function.

means.vec

A numeric matrix with p rows and m columns giving the mean of each variable per group.

covs.list

A list containing the m sample covariance matrices.

Samp.sizes

A table showing the number of observations used in the calculation of the covariance matrix for each group.

PooledCov

The pooled covariance matrix. This matrix can be accessed and used as an input argument for the calculation of Mahalanobis distance in packages biotools (da Silva, 2017, 2021) and ecodist (Goslee and Urban 2007).

Penrose.mat

The Penrose distances given as a "matrix" object.

Penros.dist

The Penrose distances given as a "dist" object.

group

A character string specifying the name of the classification factor defining groups.

levels.group

A vector of length m, showing the levels in factor group.

data.name

A character string giving the name of the data.

variables

A character vector containing the variable names.

data

The data frame analyzed.

Arguments

x

A data frame with \(p + 1\) columns (one factor and p response variables).

group

The classification factor defining m samples or groups. It must be one of the variables in x.

Author

Jorge Navarro Alberto, ganava4@gmail.com

Details

Let the mean of \(X_k\) in population i be \(\mu_{ki}\), for \(k = 1, \ldots, p\) and \(i = 1, \ldots, m\), and assume that the variance of variable \(X_k\) is \(V_k\). The Penrose (1953) distance \(P_{ij}\) between population i and population j is given by

$$P_{ij} = \sum_{k = 1}^{p} \frac{(\mu_{ki} - \mu_{kj})^2}{pV_k}$$

Penrose's distances between multivariate samples are computed using this expression, with \(\mu_{ki}\), \(\mu_{kj}\) and \(V_k\) replaced by their corresponding sample estimates.
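The expression above can be sketched in base R as follows. This is only an illustration, not the internal code of Penrose.dist: the data frame toy and its columns grp, x1 and x2 are hypothetical, and the sketch assumes that \(V_k\) is estimated by the pooled within-group variance of variable k.

```r
# Hypothetical toy data: one factor (grp) and p = 2 response variables
toy <- data.frame(
  grp = factor(rep(c("a", "b", "c"), each = 4)),
  x1  = c(1, 2, 3, 4,  3, 4, 5, 6,  6, 7, 8, 9),
  x2  = c(2, 1, 2, 3,  2, 3, 2, 5,  4, 6, 5, 7)
)

vars   <- c("x1", "x2")
p      <- length(vars)
groups <- split(toy[vars], toy$grp)

# Sample means: m rows (groups) by p columns (variables)
means <- t(sapply(groups, colMeans))

# Assumed estimate of V_k: pooled within-group variance of each variable
n  <- sapply(groups, nrow)
ss <- sapply(groups, function(g) (nrow(g) - 1) * sapply(g, var))  # p x m
V  <- rowSums(ss) / (sum(n) - length(groups))

# Dividing each mean by sqrt(p * V_k) turns the squared Euclidean
# distance between rows into sum_k (mean_ki - mean_kj)^2 / (p * V_k)
scaled      <- sweep(means, 2, sqrt(p * V), "/")
penrose_mat <- as.matrix(dist(scaled))^2
penrose_mat
```

The result is a symmetric m by m matrix with zeros on the diagonal; as.dist(penrose_mat) would give the same values as a "dist" object.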

A disadvantage of Penrose's measure is that it does not consider the correlations between the p variables.

The function requires package biotools (da Silva, 2017, 2021).

References

da Silva, A.R. (2021). biotools: Tools for Biometry and Applied Statistics in Agricultural Science. R package version 4.2. https://cran.r-project.org/package=biotools.

da Silva, A.R., Malafaia, G., and Menezes, I.P.P. (2017). biotools: an R function to predict spatial gene diversity via an individual-based approach. Genetics and Molecular Research 16. https://doi.org/10.4238/gmr16029655.

Goslee, S.C. and Urban, D.L. (2007). The ecodist package for dissimilarity-based analysis of ecological data. Journal of Statistical Software 22(7):1-19. DOI:10.18637/jss.v022.i07

Manly, B.F.J., Navarro Alberto, J.A. and Gerow, K. (2024) Multivariate Statistical Methods. A Primer. 5th Edn. Chapman and Hall/CRC.

Penrose, L.W. (1953). Distance, size and shape. Annals of Eugenics 18: 337-43.

Examples

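library(smsets)
# Load the skulls data set
data(skulls)
# Penrose distances between the groups defined by the factor Period
res.Penrose <- Penrose.dist(x = skulls, group = Period)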
# Brief output
res.Penrose