lga (version 1.0-0)

lga: Perform LGA

Description

Linear Grouping Analysis

Usage

lga(x, k, biter = NULL, niter = 10, showall = FALSE, scale = TRUE,
    nnode=NULL, silent=FALSE)

Arguments

x
a numeric matrix.
k
an integer for the number of clusters.
biter
an integer for the number of different starting hyperplanes to try.
niter
an integer for the number of iterations to attempt for convergence.
showall
logical. If TRUE then display all the outcomes, not just the best one.
scale
logical. Allows you to scale the data, dividing each column by its standard deviation, before fitting.
nnode
an integer giving the number of CPUs to use for parallel processing. Defaults to NULL, i.e. no parallel processing.
silent
logical. If TRUE, produces no text output during processing.
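
As a rough illustration of what the scale argument describes, the sketch below divides each column of a matrix by its standard deviation using base R only. The description above does not say whether the columns are also centred, so the sketch assumes they are not. The helper name scale_by_sd is hypothetical and is not part of the lga package.

```r
# Hypothetical helper, not part of lga: divide each column by its
# standard deviation, without centring (an assumption).
scale_by_sd <- function(x) {
  x <- as.matrix(x)
  sds <- apply(x, 2, sd)          # per-column standard deviations
  sweep(x, 2, sds, FUN = "/")     # divide column j by sds[j]
}

set.seed(1)
x <- cbind(a = rnorm(50, sd = 10), b = rnorm(50, sd = 0.1))
xs <- scale_by_sd(x)
apply(xs, 2, sd)   # per-column standard deviations after scaling
```

Putting the columns on a common scale keeps a variable measured in large units from dominating the orthogonal distances.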

Value

An object of class lga with components:

  • cluster: a vector containing the cluster memberships.
  • ROSS: the Residual Orthogonal Sum of Squares for the solution.
  • converged: a logical. TRUE if at least one solution has converged.
  • biter: the biter setting used.
  • niter: the niter setting used.
  • nconverg: the number of converged solutions (out of biter starts).
  • scaled: logical. Is the data scaled?
  • k: the number of clusters to be found.
  • x: the (scaled if selected) dataset.

Details

This function tries to find k clusters using the LGA algorithm described in Van Aelst et al. (2006). It starts from biter different initial hyperplanes and, for each start, allows up to niter iterations to reach convergence. It then selects the clustering with the smallest Residual Orthogonal Sum of Squares (ROSS). If biter is left as NULL, its value is chosen using the equation given in Van Aelst et al. (2006).
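
The selection criterion can be sketched in base R. The sketch assumes that the ROSS of a clustering is the sum, over clusters, of the squared orthogonal distances from each point to the hyperplane fitted to its cluster by orthogonal regression, with the hyperplane's normal taken as the eigenvector of the smallest eigenvalue of the cluster covariance matrix. The helper names orth_resid_sq and ross are hypothetical; this is not the package's implementation.

```r
# Hypothetical helpers, not part of lga.

# Squared orthogonal distances from the rows of x to the hyperplane
# fitted to x by orthogonal regression.
orth_resid_sq <- function(x) {
  x <- as.matrix(x)
  centred <- sweep(x, 2, colMeans(x))     # the hyperplane passes through the mean
  e <- eigen(cov(x), symmetric = TRUE)    # eigenvalues in decreasing order
  normal <- e$vectors[, ncol(x)]          # eigenvector of the smallest eigenvalue
  as.vector(centred %*% normal)^2
}

# Residual orthogonal sum of squares for a given cluster assignment.
ross <- function(x, cluster) {
  sum(sapply(unique(cluster), function(g) {
    sum(orth_resid_sq(x[cluster == g, , drop = FALSE]))
  }))
}

# Two groups scattered around parallel lines, built with base R only.
set.seed(1234)
t1 <- rnorm(100)
t2 <- rnorm(100)
X <- rbind(cbind(x = t1, y = t1 - 2 + rnorm(100, sd = 0.1)),
           cbind(x = t2, y = t2 + 2 + rnorm(100, sd = 0.1)))
truth <- rep(1:2, each = 100)

ross_true <- ross(X, truth)            # points grouped by their generating line
ross_one  <- ross(X, rep(1, nrow(X)))  # all points forced onto one hyperplane
```

Comparing ross_true with ross_one shows why the criterion favours the grouping that follows the two lines: a single hyperplane leaves much larger orthogonal residuals.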

This function supports parallel computing through the nnode argument, using the package snow. Parallel computing requires either MPI (e.g. lamboot) or PVM. For further details, see the documentation for snow.

The lga function has associated print and plot methods (see the examples). The plot method also shows the fitted hyperplanes as dashed lines. When there are more than 2 dimensions, these lines represent the intersections of the fitted hyperplanes with the plane spanned by each pair of axes.

References

Van Aelst, S., Wang, X., Zamar, R. and Zhu, R. (2006) Linear Grouping Using Orthogonal Regression. Computational Statistics & Data Analysis 50, 1287-1312.

See Also

gap

Examples

## Synthetic Data
## Make a dataset with 2 clusters in 2 dimensions

library(lga)
library(MASS)  # provides mvrnorm()
set.seed(1234)
X <- rbind(mvrnorm(n=100, mu=c(1,-1), Sigma=diag(0.1,2)+0.9),
            mvrnorm(n=100, mu=c(1,1), Sigma=diag(0.1,2)+0.9))

lgaout <- lga(X,2)
plot(lgaout)
print(lgaout)


## nhl94 data set

data(nhl94)
plot(lga(nhl94, k=3, niter=30))


## Allometry data set
data(brain)
plot(lga(log(brain, base=10), k=3))


## Second Allometry data set
data(ob)
plot(lga(log(ob[,2:3]), k=3), pch=as.character(ob[,1]))


## Parallel processing case
## This example runs on 4 nodes.

set.seed(1234)
X <- rbind(mvrnorm(n=1e6, mu=c(1,-1), Sigma=diag(0.1,2)+0.9),
            mvrnorm(n=1e6, mu=c(1,1), Sigma=diag(0.1,2)+0.9))
abc <- lga(X, k=2, nnode=4)
