
rgr (version 1.1.0)

gx.mvalloc: Function for Allocation on the basis of Multivariate Data

Description

Function to allocate individuals (observations, cases or samples) into one of up to nine (9) reference groups (populations) on the basis of their Mahalanobis distances. If an individual's predicted probability of group membership (typicality) falls below a user-defined cut-off, pcrit, for every reference group, the individual is allocated to an outlier bin.
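
The idea of a typicality derived from a Mahalanobis distance can be sketched in base R. This is only an illustration, not the rgr implementation: the reference group is synthetic, the names ref, typicality and is_outlier are hypothetical, and a chi-square reference distribution with p degrees of freedom is assumed (gx.mvalloc may use a different reference distribution).

```r
# Sketch only: typicality from Mahalanobis distances, using base R (stats).
set.seed(1)
p <- 2

# Hypothetical reference group: 200 draws with two variables
ref <- cbind(rnorm(200, 40, 2), rnorm(200, 30, 1.5))
ref_mean <- colMeans(ref)
ref_cov <- cov(ref)

# Two individuals to assess: one at the group centre, one far away
x <- rbind(centre = ref_mean, remote = ref_mean + c(20, 20))

# Squared Mahalanobis distance of each individual to the reference group
md2 <- mahalanobis(x, center = ref_mean, cov = ref_cov)

# Assumption: chi-square reference distribution with p degrees of freedom
typicality <- pchisq(md2, df = p, lower.tail = FALSE)

# An individual is atypical of this group if its typicality is below pcrit
pcrit <- 0.05
is_outlier <- typicality < pcrit
print(round(typicality, 4))
print(is_outlier)
```

With several reference groups, the same calculation would be repeated per group, each with its own mean and covariance matrix.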

Usage

gx.mvalloc(pcrit = 0.05, x, ...)

Arguments

pcrit
the critical cut-off probability for group membership below which an individual will be classified as an outlier. By default the critical probability of group membership is set to pcrit = 0.05.
x
an n by p matrix containing the n individuals, with p variables determined on each, to be allocated, see Details below.
...
a list of objects, up to a maximum of nine (9), saved from any of the functions gx.md.gait, gx.mva or gx.robmva.

Value

The following are returned as an object to be saved for display with gx.mvalloc.print:

  • groups: a list of the names of the kk reference groups.
  • kk: the number of reference groups passed to the function.
  • n: the number of individuals (observations, cases or samples) allocated.
  • p: the number of variables in the reference and allocated data.
  • pcrit: the critical cut-off probability for reference group membership.
  • pgm: a vector of kk predicted probabilities of reference group memberships.
  • xalloc: the reference group, 1:kk, that the individual was allocated into. All outliers, i.e. individuals with all pgm(1:kk) < pcrit, are allocated to group zero, 0. Therefore xalloc will be in the range of 0:kk.
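
The relationship between pgm, pcrit and xalloc can be illustrated with a small base R sketch. The probabilities below are invented, and the sketch assumes that an individual that is not an outlier goes to the group with the largest predicted probability; the help page only states the outlier rule explicitly.

```r
# Sketch only: turning membership probabilities into allocations.
# Hypothetical probabilities for 4 individuals (rows) and kk = 3 groups (columns)
pgm <- rbind(c(0.80, 0.01, 0.00),
             c(0.03, 0.60, 0.10),
             c(0.00, 0.02, 0.04),
             c(0.20, 0.20, 0.90))
pcrit <- 0.05

# Column (group) holding the largest probability for each individual
best_group <- max.col(pgm, ties.method = "first")
best_prob <- apply(pgm, 1, max)

# Group 0 is the outlier bin: every probability is below pcrit
xalloc <- ifelse(best_prob < pcrit, 0L, best_group)
print(xalloc)

# Count of individuals per group, including the outlier bin
print(table(factor(xalloc, levels = 0:ncol(pgm))))
```

Tabulating xalloc in this way gives a quick check of how many individuals ended up in the outlier bin at a given pcrit.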

Details

It is imperative that the data matrix x contains no special codes and all records (individuals) with NAs have been removed, see Notes below. It is also imperative that the variables in the reference groups and in the matrix x of individuals to be classified are identical in number and in the same order. The allocations are made on the assumption that the covariance structures are inhomogeneous, i.e. that the population hyperellipsoids are of different size, shape and orientation in p-space.
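
These requirements can be checked before allocation. The following base R sketch is one way to do so; check_alloc_input() is a hypothetical helper, not part of rgr, and the variable names Cu and Zn are invented. It covers only NAs and variable order; special codes (for example, placeholder values for missing data) still need to be handled separately.

```r
# Sketch only: pre-flight checks on the data to be allocated.
# check_alloc_input() is a hypothetical helper, not an rgr function.
check_alloc_input <- function(x, ref_vars) {
  x <- as.matrix(x)
  if (anyNA(x)) {
    stop("x contains NAs; remove incomplete records first")
  }
  if (ncol(x) != length(ref_vars)) {
    stop("x and the reference groups differ in number of variables")
  }
  if (!identical(colnames(x), ref_vars)) {
    stop("variables in x are not in the same order as in the reference groups")
  }
  invisible(x)
}

# Hypothetical data with one incomplete record
raw <- data.frame(Cu = c(12, NA, 30), Zn = c(55, 60, 71))
clean <- na.omit(raw)  # drop records containing NAs

x <- check_alloc_input(clean, ref_vars = c("Cu", "Zn"))
print(nrow(x))
```

Failing early on a column mismatch is preferable to a silent misallocation, since a Mahalanobis distance computed with variables in the wrong order is still a valid number.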

References

Garrett, R.G., 1990. A robust multivariate allocation procedure with applications to geochemical data. In Proc. Colloquium on Statistical Applications in the Earth Sciences (Eds F.P. Agterberg & G.F. Bonham-Carter). Geological Survey of Canada Paper 89-9, pp. 309-318.

Reimann, C., Filzmoser, P., Garrett, R. and Dutter, R., 2008. Statistical Data Analysis Explained: Applied Environmental Statistics with R. John Wiley & Sons, Ltd., 362 p.

See Also

gx.md.gait, gx.mva, gx.robmva, gx.mvalloc.print, ltdl.fix.df, remove.na, na.omit

Examples

## Load rgr, and MASS for mvrnorm()
library(rgr)
library(MASS)
## Generate three groups of synthetic multivariate normal data
grp1 <- mvrnorm(100, mu = c(40, 30), Sigma = matrix(c(6, 3, 3, 2), 2, 2))
grp1 <- cbind(grp1, rep(1, 100))
grp2 <- mvrnorm(100, mu = c(50, 40), Sigma = matrix(c(4, -3, -3, 5), 2, 2))
grp2 <- cbind(grp2, rep(2, 100))
grp3 <- mvrnorm(100, mu = c(30, 45), Sigma = matrix(c(6, 4, 4, 5), 2, 2))
grp3 <- cbind(grp3, rep(3, 100))
## Generate a set of six (6) outliers
anom <- matrix(c(35, 40, 25, 60, 25, 60, 35, 40, 25, 60, 60, 25), 6, 2)
anom <- cbind(anom, rep(4, 6))
## Bind the test data sets together and display the test data 
test.mvalloc.mat <- rbind(grp1, grp2, grp3, anom)
test.mvalloc <- as.data.frame(test.mvalloc.mat)
dimnames(test.mvalloc)[[2]] <- c("x","y","grp")
attach(test.mvalloc)
xyplot.tags(x, y, grp, cex = 0.75)

## Generate robust reference groups 
test.save.grp1 <- gx.md.gait(grp1[, -3], mcdstart = TRUE)
test.save.grp2 <- gx.md.gait(grp2[, -3], mcdstart = TRUE)
test.save.grp3 <- gx.md.gait(grp3[, -3], mcdstart = TRUE)

## Allocate the synthetic data into the three reference groups
test.save.mvalloc <- gx.mvalloc(pcrit = 0.05, test.mvalloc.mat[, -3],
    test.save.grp1, test.save.grp2, test.save.grp3)
## Display the results of the allocation
xyplot.tags(x, y, test.save.mvalloc$xalloc, cex = 0.75)
gx.mvalloc.print(test.save.mvalloc)

## Save the allocation as a csv file
gx.mvalloc.print(test.save.mvalloc, ifprint = FALSE,
    file = "test.mvalloc.csv")

## Detach and clean up synthetic test data
detach(test.mvalloc)
rm(grp1)
rm(grp2)
rm(grp3)
rm(anom)
rm(test.mvalloc.mat)
rm(test.mvalloc)
rm(test.save.grp1)
rm(test.save.grp2)
rm(test.save.grp3)
rm(test.save.mvalloc)

## Make test data available
data(ogrady)
attach(ogrady)
ogrady.grdr <- gx.subset(ogrady, Lith == "GRDR")
ogrady.grnt <- gx.subset(ogrady, Lith == "GRNT")
## Ensure all data are in the same units (mg/kg)
ogrady.grdr.2open <- ogrady.grdr[, c(5:14)]
ogrady.grdr.2open[, 1:7] <- ogrady.grdr.2open[, 1:7] * 10000
ogrady.grnt.2open <- ogrady.grnt[, c(5:14)]
ogrady.grnt.2open[, 1:7] <- ogrady.grnt.2open[, 1:7] * 10000
ogrady.2open <- ogrady[, c(5:14)]
ogrady.2open[, 1:7] <- ogrady.2open[, 1:7] * 10000 

## Create reference data sets
ogrady.grdr.save <- gx.md.gait(ilr(as.matrix(ogrady.grdr.2open)),
    mcdstart = TRUE)
ogrady.grnt.save <- gx.md.gait(ilr(as.matrix(ogrady.grnt.2open)),
    mcdstart = TRUE)

## Allocate all O'Grady granitoids
ogrady.mvalloc <- gx.mvalloc(pcrit = 0.02, ilr(as.matrix(ogrady.2open)),
    ogrady.grdr.save, ogrady.grnt.save)

## Display list of outliers
gx.mvalloc.print(ogrady.mvalloc)

## Display allocations
ogrady.mvalloc$xalloc

## Save allocations as a csv file
gx.mvalloc.print(ogrady.mvalloc, ifprint = FALSE,
    file = "ogrady.mvalloc.csv")

## Clean-up and detach test data
rm(ogrady.grdr)
rm(ogrady.grnt)
rm(ogrady.grdr.2open)
rm(ogrady.grnt.2open)
rm(ogrady.2open)
rm(ogrady.grdr.save)
rm(ogrady.grnt.save)
rm(ogrady.mvalloc)
detach(ogrady)
