cellWise (version 2.2.2)

cellHandler: cellHandler algorithm

Description

This function flags cellwise outliers in X and imputes them, provided that robust estimates of the center mu and the scatter matrix Sigma are supplied. When these are not known, as is typically the case, one can instead use the function DDC, which requires only the data matrix X. Alternatively, mu and Sigma can be estimated robustly from X with the function DI.
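
The workflow described above, in which mu and Sigma are first estimated from X and then passed to cellHandler, might look roughly like the sketch below. The toy data are hypothetical, and the component names `center` and `cov` for the output of DI are an assumption to check against the DI help page.

```r
library(cellWise)  # provides DI() and cellHandler()

set.seed(1)
# Hypothetical toy data: 100 rows, 3 strongly correlated columns
d <- 3
Sigma_true <- diag(d) * 0.1 + 0.9
X <- matrix(rnorm(100 * d), ncol = d) %*% chol(Sigma_true)
X[1, 3] <- 10  # plant one outlying cell

# Assumption: DI() returns its robust estimates as `center` and `cov`
est <- DI(X)
out <- cellHandler(X, mu = est$center, Sigma = est$cov)

out$indcells   # indices of the cells that were flagged
```

If only flagging and imputation are needed, calling DDC(X) directly avoids the separate estimation step.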

Usage

cellHandler(X, mu, Sigma, quant = 0.99)

Arguments

X

X is the input data, and must be an \(n\) by \(d\) matrix or a data frame.

mu

An estimate of the center of the data.

Sigma

An estimate of the covariance (scatter) matrix of the data.

quant

Cutoff used in the detection of cellwise outliers. Defaults to 0.99.
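
To give a feel for the scale of quant, the sketch below converts it into a threshold on absolute standardized residuals. It assumes the threshold is the square root of the chi-square quantile with one degree of freedom, as in the referenced paper; this help page does not state how quant is used internally, so treat the formula as an illustration only.

```r
quant <- 0.99

# Assumed form of the threshold: sqrt of the chi-square quantile with 1 df
cutoff <- sqrt(qchisq(quant, df = 1))
cutoff

# Under this assumption it coincides with a two-sided normal quantile
all.equal(cutoff, qnorm(1 - (1 - quant) / 2))
```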

Value

A list with components:

  • Ximp: The imputed data matrix.

  • indcells: Indices of the cells which were flagged in the analysis.

  • indNAs: Indices of the NAs in the data.

  • Zres: Matrix with standardized cellwise residuals of the flagged cells. Contains zeroes in the unflagged cells.

  • Zres_denom: Denominator of the standardized cellwise residuals.

  • cellPaths: Matrix with the same dimensions as X, in which each row contains the path of least angle regression through the cells of that row, i.e. the order of the coordinates in the path (1 = first, 2 = second, ...).
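
As a small illustration of working with the index components, the sketch below converts flagged-cell indices into (row, column) pairs and into a logical mask. The index values are hypothetical stand-ins for a real result, and the code assumes indcells holds column-major linear indices, as returned by `which()` on a matrix.

```r
# Hypothetical values standing in for a cellHandler() result on a 2 x 3 matrix
n <- 2; d <- 3
indcells <- c(2, 5)  # assumed to be column-major linear indices

# Convert the linear indices to (row, column) pairs
cells <- arrayInd(indcells, .dim = c(n, d))
colnames(cells) <- c("row", "col")
cells

# Logical mask with TRUE at the flagged positions
flagged <- matrix(FALSE, nrow = n, ncol = d)
flagged[indcells] <- TRUE
flagged
```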

References

J. Raymaekers and P.J. Rousseeuw (2020). Handling cellwise outliers by sparse regression and robust covariance. arXiv:1912.12446.

See Also

DI

Examples

# NOT RUN {
library(cellWise)

# Known center and scatter: unit variances, all correlations 0.9
mu <- rep(0, 3)
Sigma <- diag(3) * 0.1 + 0.9
X <- rbind(c(0.5, 1.0, 5.0), c(-3.0, 0.0, 1.0))
n <- nrow(X); d <- ncol(X)
out <- cellHandler(X, mu, Sigma)
Xres <- X - out$Ximp # unstandardized residuals
# Zres * Zres_denom recovers the unstandardized residuals, so this is 0
mean(abs(as.vector(Xres - out$Zres * out$Zres_denom)))
W <- matrix(0, nrow = n, ncol = d) # weight matrix
W[out$Zres != 0] <- 1 # 1 indicates cells that were flagged

# }