DDCpredict: DDCpredict

Description

Based on a DDC fit on an initial (training) data set X, this function analyzes a new (test) data set Xnew.

Usage

DDCpredict(Xnew, InitialDDC, DDCpars = NULL)

Arguments

Xnew

The new data (test data), which must be a matrix or a data frame. It must always be provided.

InitialDDC

The output of the DDC function on the initial (training) dataset. Must be provided.

DDCpars

The input options to be used for the prediction. By default the options of InitialDDC are used.

Value

A list with components:

DDCpars

the options used in the call, see DDC.

locX

the locations of the columns, from InitialDDC.

scaleX

the scales of the columns, from InitialDDC.

Z

Xnew standardized by locX and scaleX.

nbngbrs

predictions use a combination of nbngbrs columns.

ngbrs

for each column, the list of its neighbors, from InitialDDC.

robcors

for each column, the correlations with its neighbors, from InitialDDC.

robslopes

slopes to predict each column by its neighbors, from InitialDDC.

deshrinkage

for each connected column, its deshrinkage factor used in InitialDDC.

Xest

predicted values for every cell of Xnew.

scalestres

scale estimate of the residuals (Xnew - Xest), from InitialDDC.

stdResid

columnwise standardized residuals of Xnew.

indcells

positions of cellwise outliers in Xnew.

Ti

outlyingness of rows in Xnew.

medTi

median of the Ti in InitialDDC.

madTi

mad of the Ti in InitialDDC.

indrows

row numbers of the outlying rows in Xnew.

indNAs

positions of the NA's in Xnew.

indall

positions of NA's and outlying cells in Xnew.

Ximp

Xnew where all cells in indall are imputed by their prediction.
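
As a rough illustration of how some of these components relate to each other, the base R sketch below rebuilds Z, indall and Ximp by hand on a toy matrix. It does not call DDCpredict: every value in it (locX, scaleX, Xest, indcells) is made up, and it assumes that cell positions are stored as linear, column-major indices. It only mirrors the descriptions above and is not the package's implementation.

```r
# Toy stand-ins for DDCpredict inputs and outputs (all values are made up).
Xnew <- matrix(c(1, 2, NA,
                 10, 20, 30), nrow = 3)   # 3 rows, 2 columns, filled by column
locX   <- c(2, 20)    # hypothetical column locations from the training fit
scaleX <- c(1, 10)    # hypothetical column scales from the training fit

# Z: subtract each column's location, then divide by its scale.
Z <- sweep(sweep(Xnew, 2, locX, "-"), 2, scaleX, "/")

# Hypothetical predictions and flagged cells.
# Assumption: positions are linear (column-major) indices into the matrix.
Xest     <- matrix(c(1.1, 2.1, 2.9, 11, 19, 21), nrow = 3)
indcells <- 6L                      # pretend cell [3, 2] was flagged
indNAs   <- which(is.na(Xnew))      # positions of the missing cells
indall   <- sort(union(indcells, indNAs))

# Ximp: copy of Xnew in which every cell in indall is replaced by its prediction.
Ximp <- Xnew
Ximp[indall] <- Xest[indall]
```

In this sketch the cells outside indall are left untouched in Ximp, so only the missing cell and the flagged cell change.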

References

Hubert, M., Rousseeuw, P.J., Van den Bossche, W. (2019). MacroPCA: An all-in-one PCA method allowing for missing values as well as cellwise and rowwise outliers. Technometrics, 61(4), 459-473.

Examples


# NOT RUN {
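library(MASS)      # for mvrnorm()
library(cellWise)  # for DDC(), DDCpredict() and cellMap()
set.seed(12345)
n <- 100; d <- 10
# Equicorrelated covariance matrix: 0.9 off the diagonal, 1 on it.
A <- matrix(0.9, d, d); diag(A) <- 1
# Training data: set 50 random cells to NA, then 50 random cells to 10.
x <- mvrnorm(n, rep(0, d), A)
x[sample(1:(n * d), 50, FALSE)] <- NA
x[sample(1:(n * d), 50, FALSE)] <- 10
x <- cbind(1:n, x)  # prepend a case-number column
DDCx <- DDC(x)
# Test data from the same distribution, with 50 random cells set to 10.
xnew <- mvrnorm(50, rep(0, d), A)
xnew[sample(1:(50 * d), 50, FALSE)] <- 10
predict.out <- DDCpredict(xnew, DDCx)
cellMap(xnew, predict.out$stdResid,
        columnlabels = 1:d, rowlabels = 1:50)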

# For more examples, we refer to the vignette:
vignette("DDC_examples")
# }
