mrfDepth (version 1.0.12)

fOutl: Functional outlyingness measures for functional data

Description

Computes several measures of functional outlyingness for multivariate functional data.

Usage

fOutl(x, z = NULL, type = "fAO", alpha = 0, time = NULL, 
        diagnostic = FALSE, distOptions = NULL)

Arguments

x

A three dimensional \(t\) by \(n\) by \(p\) array, with \(t\) the number of observed time points, \(n\) the number of functional observations and \(p\) the number of measurements for every functional observation at every time point.

z

An optional three-dimensional \(t\) by \(m\) by \(p\) array, containing the observations for which to compute the functional outlyingness with respect to x. If z is not specified, it is set equal to x. The time points of z should correspond to those of x.

type

The outlyingness measure used in the computations. One of the following options: "fSDO", "fAO", "fDO" or "fbd". Defaults to "fAO".

alpha

Specifies the weights at every cross-section. When alpha = 0, uniform weights are used. Otherwise alpha should be a weight vector of length \(t\). Defaults to 0.

time

If the measurements are not equidistant, a sorted numeric vector containing a set of time points. Defaults to 1:t.

diagnostic

If set to TRUE, the output contains some additional components:
  crossDists: an \(n\) by \(t\) matrix containing the multivariate outlyingness of each observation at each time point.
  locOutl: output containing flags for local outlyingness (see "Value" for more details).
Defaults to FALSE.

distOptions

A list of options to pass to the function calculating the cross-sectional distances. See outlyingness, adjOutl, or bagdistance.
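
The shape requirements on x, z and time can be illustrated with simulated data. This is only a sketch: the sizes, the random curves and the variable names (nTime, timePoints) are made up for illustration, and the call to fOutl is executed only when mrfDepth is installed.

```r
set.seed(1)
nTime <- 20; n <- 30; m <- 5; p <- 2   # hypothetical sizes

# x: reference curves (nTime by n by p); z: new curves (nTime by m by p)
x <- array(rnorm(nTime * n * p), dim = c(nTime, n, p))
z <- array(rnorm(nTime * m * p), dim = c(nTime, m, p))

# Non-equidistant, sorted time points, one per cross-section
timePoints <- cumsum(runif(nTime, min = 0.5, max = 1.5))

# z must share the time grid and the number of measurements with x
stopifnot(dim(x)[1] == dim(z)[1], dim(x)[3] == dim(z)[3],
          length(timePoints) == dim(x)[1], !is.unsorted(timePoints))

# The call itself only runs when mrfDepth is installed
if (requireNamespace("mrfDepth", quietly = TRUE)) {
  res <- mrfDepth::fOutl(x = x, z = z, type = "fSDO", time = timePoints)
  str(res$fOutlyingnessZ)   # documented as a vector of length m
}
```

Checking the dimensions before the call makes a mismatch between x and z visible early, rather than inside the cross-sectional computations.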

Value

A list with the following components:

fOutlyingnessX

Vector of length \(n\) containing the functional outlyingness of every curve from x.

fOutlyingnessZ

Vector of length \(m\) containing the functional outlyingness of every curve from z.

weights

Vector of weights according to the input parameter alpha.

crossDistsX

An \(n\) by \(t\) matrix containing the multivariate outlyingness of each observation of x at each time point. Only provided if the input parameter diagnostic is set to TRUE.

crossDistsZ

An \(m\) by \(t\) matrix containing the multivariate outlyingness of each observation of z at each time point. Only provided if the input parameter diagnostic is set to TRUE.

locOutlX

An \(n\) by \(t\) matrix flagging local outlyingness for x. Only provided if the input parameter diagnostic is set to TRUE. The \((i,j)\)th element takes value 1 if curve \(x_i\) is outlying at time point \(j\).

locOutlZ

An \(m\) by \(t\) matrix flagging local outlyingness for z. Only provided if the input parameter diagnostic is set to TRUE. The \((i,j)\)th element takes value 1 if curve \(z_i\) is outlying at time point \(j\).

IndFlagExactFit

Vector containing the indices of the time points for which an exact fit is detected.

Details

The functional outlyingness of a multivariate curve with respect to a given set of multivariate curves is defined as the weighted average of its multivariate outlyingness at each time point (Hubert et al., 2015). The functional outlyingness can be computed in all dimensions \(p\) using bagdistance, projection depth and skewness-adjusted projection depth.
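
Written out, the weighted average described above takes roughly the following form. The notation is an assumption made here for illustration: \(s_j\) denotes the \(j\)th time point, \(X(s_j)\) the \(n\) observations of x at that time point, \(O\) the cross-sectional outlyingness selected by type, and \(w_j\) the weights. That the weights are normalised to sum to one is also an assumption.

```latex
\[
  \mathrm{fO}(x_i) \;=\; \sum_{j=1}^{t} w_j \, O\bigl(x_i(s_j);\, X(s_j)\bigr),
  \qquad w_j \ge 0, \quad \sum_{j=1}^{t} w_j = 1 .
\]
```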

When the data array z is specified, the functional outlyingness and diagnostic information for the data array x are also returned whenever the underlying outlyingness routine allows it. For more information see the specific routines listed in the section "See Also".

In some situations, additional diagnostics are available to flag outlying time points. At each time point, an observation from the data array x is flagged as an outlier if its scaled outlyingness is larger than a prescribed cut-off value from the chi-square distribution. For more details see the respective outlyingness routines.

It is possible that at certain time points a part of the algorithm cannot be executed, for example because of an exact fit. In that case the weight of that particular time point is set to zero. A warning is issued at the end of the algorithm to signal these time points. Furthermore, the output contains an extra component, IndFlagExactFit, giving the indices of the time points where problems occurred.
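
The effect of a zero weight on the weighted average can be sketched in base R. This is not the package's implementation: the matrix crossDists, the index vector exactFit and the renormalisation of the remaining weights are all assumptions made for illustration.

```r
# Hypothetical n = 3 by t = 4 matrix of cross-sectional outlyingness values
crossDists <- matrix(c(1, 2, 3, 4,
                       2, 2, 2, 2,
                       0, 1, 0, 9), nrow = 3, byrow = TRUE)

# Hypothetical time point at which the computation could not be carried out
exactFit <- 4

# Start from uniform weights and drop the problematic time point
weights <- rep(1, ncol(crossDists))
weights[exactFit] <- 0
weights <- weights / sum(weights)   # assumption: remaining weights sum to one

# Weighted average of the cross-sectional values, one value per curve
fOutlSketch <- as.vector(crossDists %*% weights)
```

With these values the fourth time point no longer contributes, so the large value 9 in the third curve does not affect its average.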

References

Hubert M., Rousseeuw P.J., Segaert P. (2015). Multivariate functional outlier detection. Statistical Methods and Applications, 24(2), 177–202.

Hubert M., Rousseeuw P.J., Segaert P. (2017). Multivariate and functional classification using depth and distance. Advances in Data Analysis and Classification, 11, 445–466.

See Also

bagdistance, outlyingness, adjOutl

Examples

# NOT RUN {
# We will illustrate the function using a univariate functional sample.
data(octane)
Data <- octane

# When the option diagnostic is set to TRUE, a crude diagnostic
# to detect outliers can be extracted from the local outlyingness
# indicators. 
Result <- fOutl(x = Data, type = "fAO", diagnostic = TRUE)
matplot(Data[,,1], type = "l", col = "black", lty = 1)
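for (i in seq_len(dim(Data)[2])) {
  # Only overlay points for curves flagged at one or more time points.
  # z is not supplied here, so the flags for x (locOutlX) are used.
  if (sum(Result$locOutlX[i, ]) > 0) {
    obsData <- matrix(Data[, i, 1], nrow = 1)
    # Blank out the time points at which curve i is not flagged
    obsData[!Result$locOutlX[i, ]] <- NA
    obsData <- rbind(obsData, obsData)
    matpoints(t(obsData), col = "red", pch = 15)
  }
}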
# For more advanced outlier detection techniques, see the 
# fom routine.
# }
