
cellWise (version 2.2.2)

wrap: Wrap the data.

Description

Transforms multivariate data X using the wrapping function with b = 1.5 and c = 4. By default, it starts by calling checkDataSet to clean the data and estLocScale to estimate the location and scale of the variables in the cleaned data. Alternatively, it works with user-provided vectors of location and scale given by locX and scaleX.
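The transformation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the names psi_wrap and wrap_sketch are hypothetical, the constants q1 and q2 are the values quoted for b = 1.5 and c = 4 in the Raymaekers and Rousseeuw reference (treat them as an assumption), and the column median and MAD stand in for the estimates that estLocScale would provide.

```r
# Wrapping function: identity on [-b, b], smoothly redescending to 0 at |z| = c.
# q1 and q2 are assumed constants for b = 1.5, c = 4 (taken from the paper).
psi_wrap <- function(z, b = 1.5, c = 4, q1 = 1.540793, q2 = 0.8622731) {
  az  <- abs(z)
  out <- z                                  # |z| <= b: values are unchanged
  mid <- which(az > b & az <= c)            # which() leaves NA entries untouched
  far <- which(az > c)
  out[mid] <- q1 * tanh(q2 * (c - az[mid])) * sign(z[mid])
  out[far] <- 0                             # far outliers map to the center
  out
}

# Hypothetical helper: standardize each column, wrap, then transform back.
wrap_sketch <- function(X, loc, scale) {
  Z <- sweep(sweep(X, 2, loc, "-"), 2, scale, "/")
  sweep(sweep(psi_wrap(Z), 2, scale, "*"), 2, loc, "+")
}

set.seed(1)
X <- matrix(rnorm(50), nrow = 10, ncol = 5)
X[1, 1] <- 100                              # plant one gross outlier
loc <- apply(X, 2, median)                  # stand-in location estimates
scl <- apply(X, 2, mad)                     # stand-in scale estimates
Xw  <- wrap_sketch(X, loc, scl)
```

In this sketch the planted outlier is moved to the location estimate of its column, while cells within 1.5 scale units of the center are left as they are.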

Usage

wrap(X, locX = NULL, scaleX = NULL, precScale = 1e-12,
     imputeNA = TRUE, checkPars = list())

Arguments

X

The input data. It must be an \(n\) by \(d\) matrix or a data frame.

locX

The location estimates of the columns of the input data X. Must be a vector of length \(d\).

scaleX

The scale estimates of the columns of the input data X. Must be a vector of length \(d\).

precScale

The precision scale used throughout the algorithm. Defaults to \(1e-12\).

imputeNA

Whether or not to impute the NAs with the location estimate of the corresponding variable. Defaults to TRUE.
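As a rough illustration of what this imputation amounts to, the base R sketch below replaces each missing cell by the location estimate of its column. It is not the package's code: the column median is used here as an assumed stand-in for the location estimate, and the variable names are hypothetical.

```r
# Small matrix with one missing cell per column
X <- matrix(c(1, NA, 3,
              10, 20, NA), ncol = 2)

# Assumed stand-in for the location estimates: column medians, ignoring NAs
loc <- apply(X, 2, median, na.rm = TRUE)

# Positions of the missing cells as (row, col) pairs
na_idx <- which(is.na(X), arr.ind = TRUE)

# Fill each missing cell with the location estimate of its column
X_imp <- X
X_imp[na_idx] <- loc[na_idx[, "col"]]
```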

checkPars

Optional list of parameters used in the call to checkDataSet. The options are:

  • coreOnly If TRUE, skip the execution of checkDataSet. Defaults to FALSE.

  • numDiscrete A column that takes on numDiscrete or fewer values will be considered discrete and not retained in the cleaned data. Defaults to \(5\).

  • precScale Only consider columns whose scale is larger than precScale. Here scale is measured by the median absolute deviation. Defaults to \(1e-12\).

  • silent If TRUE, the function's progress messages are suppressed. Defaults to FALSE.
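The options above are passed as a named list. The following is a minimal sketch, assuming the cellWise package is installed (the call is skipped otherwise) and that the location and scale estimates are supplied explicitly, as in the Examples section. With coreOnly = TRUE the data are assumed to be clean already, so checkDataSet is not run.

```r
set.seed(12345)
n <- 100; d <- 10
X <- matrix(rnorm(n * d), nrow = n, ncol = d)   # clean numeric data

if (requireNamespace("cellWise", quietly = TRUE)) {
  # Location and scale estimates for each column
  locScale <- cellWise::estLocScale(X)
  # Skip the call to checkDataSet via the checkPars list
  res <- cellWise::wrap(X, locScale$loc, locScale$scale,
                        checkPars = list(coreOnly = TRUE))
  Xw <- res$Xw
}
```

Leaving checkPars at its default, list(), keeps the cleaning step and its default settings.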

Value

A list with components:

  • Xw The wrapped data.

  • colInWrap The column numbers of the variables that were wrapped. Variables that were filtered out by checkDataSet (for example, because of a (near) zero scale) will not appear in this output.

  • loc The location estimates for all variables used for wrapping.

  • scale The scale estimates for all variables used for wrapping.

References

Raymaekers, J., Rousseeuw, P.J. (2019). Fast robust correlation for high dimensional data. Technometrics, published online.

See Also

estLocScale

Examples

library(MASS)      # for mvrnorm
library(cellWise)  # for estLocScale and wrap
set.seed(12345)
n <- 100; d <- 10
X <- mvrnorm(n, rep(0, d), diag(d))
# Estimate the location and scale of each column, then wrap with those estimates
locScale <- estLocScale(X)
Xw <- wrap(X, locScale$loc, locScale$scale)$Xw
# For more examples, we refer to the vignette:
vignette("wrap_examples")
