opa: Finding the most dissimilar variables in a data matrix: the Orthogonal Projection Approach

Description

This function finds the set of most dissimilar rows in a data matrix. If no initial selection is provided, the first row is selected by comparison with the vector of column means. Dissimilarity is measured by the determinant of the crossproduct matrix.
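The selection procedure described above can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the name opa_sketch is hypothetical, it returns row indices rather than the scaled spectra, and it assumes that rows (and the mean vector) are scaled to unit length before comparison and that the crossproduct matrix is formed from the reference rows stacked with one candidate row.

```r
# Hypothetical sketch of greedy selection by the determinant criterion.
# Not the code used by opa(); it only illustrates the idea.
opa_sketch <- function(x, ncomp) {
  # Scale every row to unit length so that only its shape matters.
  x <- x / sqrt(rowSums(x^2))

  # Dissimilarity of a candidate row with respect to the reference rows:
  # determinant of the crossproduct matrix of the stacked rows.
  dissim <- function(ref, cand) det(tcrossprod(rbind(ref, cand)))

  # No initial selection: compare every row with the scaled column means.
  mean_ref <- colMeans(x)
  mean_ref <- mean_ref / sqrt(sum(mean_ref^2))
  selected <- which.max(apply(x, 1, function(r) dissim(mean_ref, r)))

  # Repeatedly add the row that is most dissimilar to the current selection.
  while (length(selected) < ncomp) {
    ref <- x[selected, , drop = FALSE]
    d <- apply(x, 1, function(r) dissim(ref, r))
    selected <- c(selected, which.max(d))
  }
  selected
}

# Toy data: rows 1 and 2 have the same shape, rows 3 and 4 differ from both.
x <- rbind(c(1, 0, 0),
           c(2, 0, 0),
           c(0, 1, 0),
           c(0, 0, 1))
sel <- opa_sketch(x, 3)
```

A row that duplicates an already selected one gives a determinant of zero, so it is not picked again while other candidates remain. The sketch assumes ncomp does not exceed the number of linearly independent rows.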

Usage

opa(x, ncomp, initXref = NULL)

Arguments

x

Data matrix (numerical). May not contain missing values.

ncomp

Number of rows to be selected.

initXref

Optional initial selection: a matrix to be expanded, which should be a subset of the rows to select.

Value

The function returns a submatrix of x, where the columns contain the spectra from the input data that are most dissimilar, scaled to unit length.

References

F. Cuesta Sanchez et al.: Algorithm for the assessment of peak purity in liquid chromatography with photodiode-array detection. Analytica Chimica Acta 285:181-192 (1994)

R. Wehrens: Chemometrics with R. Springer Verlag, Heidelberg (2011)

Examples

data(tea)

tea <- lapply(tea.raw, preprocess, maxI = 100)

ncomp <- 7
spectra <- opa(tea, ncomp)

myPalette <- colorRampPalette(c("black", "red", "blue", "green"))
mycols <- myPalette(ncomp)
matplot(as.numeric(rownames(spectra)), spectra, type = "l", lty = 1,
        xlab = expression(lambda), ylab = "", col = mycols)
legend("topright", legend = paste("Comp.", 1:ncomp), col = mycols,
       lty = 1, ncol = 2, bty = "n")
