faoutlier (version 0.2.2)

forward.search: Forward search algorithm for outlier detection

Description

The forward search algorithm begins by selecting a homogeneous subset of cases based on a maximum likelihood criterion, then adds individual cases at each iteration according to an acceptance criterion. By default the function adds the cases that contribute most to the likelihood function and that have the smallest robust Mahalanobis distance; model-implied residuals may be included as a criterion as well.
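
To make the iteration concrete, here is a minimal base-R sketch of a forward search that uses only a Mahalanobis distance criterion on simulated data. It is an illustration under stated assumptions, not the package's implementation: the starting group is chosen by distance from the coordinate-wise median rather than by likelihood, and the names X, inside, entry_order and entry_dist are hypothetical.

```r
# Illustrative forward search on simulated data (base R only).
set.seed(1)
n <- 60; p <- 3
X <- matrix(rnorm(n * p), n, p)
X[1:3, ] <- X[1:3, ] + 8          # plant three gross outliers in rows 1-3

# Starting group: the 40% of cases closest to the coordinate-wise median.
# (Assumption: a simple stand-in for a likelihood-based start.)
centre <- apply(X, 2, median)
d0 <- rowSums(sweep(X, 2, centre)^2)
base_size <- floor(0.4 * n)
inside <- order(d0)[seq_len(base_size)]

entry_order <- integer(0)         # case index admitted at each step
entry_dist  <- numeric(0)         # its squared distance when admitted
while (length(inside) < n) {
  outside <- setdiff(seq_len(n), inside)
  # Squared Mahalanobis distances of the remaining cases from the current group
  d <- mahalanobis(X[outside, , drop = FALSE],
                   center = colMeans(X[inside, , drop = FALSE]),
                   cov = cov(X[inside, , drop = FALSE]))
  pick <- which.min(d)            # admit the closest remaining case
  entry_order <- c(entry_order, outside[pick])
  entry_dist  <- c(entry_dist, d[pick])
  inside <- c(inside, outside[pick])
}

tail(entry_order, 3)              # cases admitted last are the outlier candidates
```

Plotting entry_dist against the step number gives a rough analogue of the forward search plot: a sharp jump near the end of the search suggests that the cases admitted from that point on are unlike the rest of the sample.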

Usage

forward.search(data, model, criteria = c("LD", "mah"),
    n.subsets = 1000, p.base = 0.4, na.rm = TRUE,
    digits = 5, print.messages = TRUE)

  ## S3 method for class 'forward.search':
print(x, stat = "LR", ...)

  ## S3 method for class 'forward.search':
plot(x, y = NULL, stat = "LR",
    main = "Forward Search", type = c("p", "h"),
    ylab = "obs.resid", ...)

Arguments

data
matrix or data.frame
model
if a single number, the number of factors to extract in an exploratory factor analysis. If model is a sem object (or an OpenMx model, if faoutlier was installed from GitHub), a confirmatory approach is performed instead
criteria
character strings indicating the forward search method. Can contain 'LD' for log-likelihood distance, 'mah' for Mahalanobis distance, or 'res' for model-implied residuals
n.subsets
a scalar indicating how many samples to draw to find a homogeneous starting base group
p.base
proportion of sample size to use as the base group
na.rm
logical; remove cases with missing data?
digits
number of digits to round in the final result
print.messages
logical; print how many iterations are remaining?
x
an object of class forward.search
stat
type of statistic to use. Can be 'LR', 'RMR', or 'gCD' for the likelihood ratio, root mean square residual, or generalized Cook's distance, respectively
...
additional parameters to be passed
y
a null value ignored by plot
main
the main title of the plot
type
type of plot to use; the default displays points and vertical lines ('p' and 'h')
ylab
the y label of the plot
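
The n.subsets and p.base arguments together control how the starting base group is found: many candidate subsets of a fixed size are drawn, and the most homogeneous one is kept. The sketch below illustrates that idea in base R, under assumptions that are not taken from the package: homogeneity is scored with the maximised log-likelihood of a saturated multivariate normal model (not a factor model), the subset size is rounded down, and the names n_subsets, p_base, subset_loglik and base_group are hypothetical.

```r
# Illustrative choice of a starting base group from random subsets (base R only).
set.seed(2)
n <- 50; p <- 3
X <- matrix(rnorm(n * p), n, p)
X[1:2, ] <- X[1:2, ] + 10         # two planted outliers in rows 1-2

n_subsets <- 200                  # plays the role of n.subsets
p_base    <- 0.4                  # plays the role of p.base
base_size <- floor(p_base * n)    # assumption: the rounding rule is illustrative

# Maximised Gaussian log-likelihood of a subset (saturated mean and covariance).
subset_loglik <- function(rows) {
  m <- length(rows)
  S <- cov(X[rows, , drop = FALSE]) * (m - 1) / m   # ML covariance estimate
  logdet <- as.numeric(determinant(S, logarithm = TRUE)$modulus)
  -0.5 * m * (p * log(2 * pi) + logdet + p)
}

# Draw the candidate subsets and keep the one with the highest likelihood.
candidates <- replicate(n_subsets, sort(sample(n, base_size)), simplify = FALSE)
scores     <- vapply(candidates, subset_loglik, numeric(1))
base_group <- candidates[[which.max(scores)]]
```

A subset that contains an aberrant case has an inflated covariance and therefore a lower likelihood, so drawing more subsets raises the chance that the selected base group is free of outliers, at the cost of more computation.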

Details

Note that forward.search is not limited to confirmatory factor analysis and can apply to nearly any model being studied where detection of influential observations is important. With the sem package forward.search can be very slow, so using OpenMx instead is recommended (see ?faoutlier for details).
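
As an illustration of how the same idea carries over to another model class, the following base-R sketch runs a hand-rolled forward search for a simple linear regression, admitting at each step the case with the smallest absolute prediction error. This is an assumption-laden sketch, not something forward.search does: the function itself accepts only the model types listed under Arguments, and the data, the starting rule and all object names here are hypothetical.

```r
# Illustrative forward search for a linear regression (base R only).
set.seed(3)
n <- 40
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 1 + 2 * dat$x + rnorm(n)
dat$y[1:2] <- dat$y[1:2] + 25     # two planted influential cases in rows 1-2

# Starting group: the 40% of cases with the smallest absolute residuals
# from a fit to all cases (assumption: a crude stand-in for a robust start).
r0 <- abs(resid(lm(y ~ x, data = dat)))
inside <- order(r0)[seq_len(floor(0.4 * n))]

entry_order <- integer(0)
while (length(inside) < n) {
  fit <- lm(y ~ x, data = dat[inside, ])            # refit on the current group
  outside <- setdiff(seq_len(n), inside)
  pred <- predict(fit, newdata = dat[outside, , drop = FALSE])
  r <- abs(dat$y[outside] - pred)                   # prediction error of each remaining case
  nxt <- outside[which.min(r)]                      # admit the best-predicted case
  entry_order <- c(entry_order, nxt)
  inside <- c(inside, nxt)
}

tail(entry_order, 2)              # cases admitted last are the candidates for inspection
```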

See Also

gCD, LD, robustMD

Examples

library(faoutlier)
data(holzinger)
data(holzinger.outlier)

#Exploratory
nfact <- 3
(FS <- forward.search(holzinger, nfact))
(FS.outlier <- forward.search(holzinger.outlier, nfact))
plot(FS)
plot(FS.outlier)

#Confirmatory with sem
library(sem)  # provides specifyModel()
model <- specifyModel()
    F1 -> V1,    lam11
    F1 -> V2,    lam21
    F1 -> V3,    lam31
    F2 -> V4,    lam42
    F2 -> V5,    lam52
    F2 -> V6,    lam62
    F3 -> V7,    lam73
    F3 -> V8,    lam83
    F3 -> V9,    lam93
    F1 <-> F1,   NA,     1
    F2 <-> F2,   NA,     1
    F3 <-> F3,   NA,     1

(FS <- forward.search(holzinger, model))
(FS.outlier <- forward.search(holzinger.outlier, model))
plot(FS)
plot(FS.outlier)

#Confirmatory using OpenMx (requires github version, see ?faoutlier)
library(OpenMx)
manifests <- colnames(holzinger)
latents <- c("F1","F2","F3")
#specify model; mxData is not necessary but is useful for checking that mxRun works
model <- mxModel("Three Factor",
    type = "RAM",
    manifestVars = manifests,
    latentVars = latents,
    mxPath(from = "F1", to = manifests[1:3]),
    mxPath(from = "F2", to = manifests[4:6]),
    mxPath(from = "F3", to = manifests[7:9]),
    mxPath(from = manifests, arrows = 2),
    mxPath(from = latents, arrows = 2,
           free = FALSE, values = 1.0),
    mxData(cov(holzinger), type = "cov", numObs = nrow(holzinger))
)

(FS <- forward.search(holzinger, model))
(FS.outlier <- forward.search(holzinger.outlier, model))
plot(FS)
plot(FS.outlier)
