sicure (version 0.1.1)

kNN.Mahalanobis: K Nearest Neighbors with Mahalanobis Distance

Description

This function computes the k nearest neighbors for a given set of data points, where each observation is a pair of the form \((X, T)\), with \(X\) representing a covariate and \(T\) the observed time. The distance between each pair of points is computed using the Mahalanobis distance: $$ d_M((X_i, T_i), (X_j, T_j)) = \sqrt{ \left( \begin{pmatrix} X_i \\ T_i \end{pmatrix} - \begin{pmatrix} X_j \\ T_j \end{pmatrix} \right)^t \Sigma^{-1} \left( \begin{pmatrix} X_i \\ T_i \end{pmatrix} - \begin{pmatrix} X_j \\ T_j \end{pmatrix} \right) }, $$ where \(\Sigma\) is the variance-covariance matrix of the joint distribution of \((X, T)\).
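The formula above can be sketched in base R. This is only an illustration, not the code used by sicure: the helper name `mahalanobis_matrix` is hypothetical, and the sketch assumes that \(\Sigma\) is estimated by the sample covariance matrix of the observed pairs, which this page does not state.

```r
# Hypothetical helper, not part of sicure: pairwise Mahalanobis distances
# between the n observed pairs (x_i, time_i).
mahalanobis_matrix <- function(x, time) {
  z <- cbind(x, time)                  # n x 2 matrix, one row per pair
  sigma_inv <- solve(cov(z))           # assumption: Sigma = sample covariance
  n <- nrow(z)
  d <- matrix(0, n, n)
  for (i in seq_len(n)) {
    diff <- sweep(z, 2, z[i, ])        # rows are z_j - z_i
    q <- rowSums((diff %*% sigma_inv) * diff)  # quadratic form for every j
    d[i, ] <- sqrt(pmax(q, 0))         # guard against tiny negative rounding
  }
  d
}

set.seed(1)
x <- runif(10)
time <- rexp(10)
d <- mahalanobis_matrix(x, time)       # d[i, j] is the distance between pairs i and j
```

Base R also provides `stats::mahalanobis()`, which returns the squared distance of each row to a single center, so `sqrt(mahalanobis(cbind(x, time), c(x[1], time[1]), cov(cbind(x, time))))` should agree with `d[1, ]`.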

Usage

kNN.Mahalanobis(x, time, k)

Value

A matrix with \(n\) rows and \(k\) columns. Row \(i\) corresponds to the pair \((X_i, T_i)\) and contains the indices of its \(k\) nearest neighbors under the Mahalanobis distance.
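To make the shape of the returned value concrete, here is a minimal base R sketch that builds an \(n \times k\) index matrix of the same form. It is not the sicure implementation: the function name `knn_mahalanobis_sketch` is hypothetical, \(\Sigma\) is assumed to be the sample covariance, and a point is assumed not to count as its own neighbor. The package may differ on either point.

```r
# Hypothetical sketch, not the sicure implementation.
knn_mahalanobis_sketch <- function(x, time, k) {
  z <- cbind(x, time)                  # n x 2 matrix of (covariate, time) pairs
  n <- nrow(z)
  stopifnot(k >= 1, k < n)
  s <- cov(z)                          # assumption: Sigma = sample covariance
  idx <- matrix(NA_integer_, n, k)
  for (i in seq_len(n)) {
    # Squared distances from pair i to every pair; the ordering is the
    # same as for the unsquared distances.
    d2 <- mahalanobis(z, center = z[i, ], cov = s)
    d2[i] <- Inf                       # assumption: exclude the point itself
    idx[i, ] <- order(d2)[seq_len(k)]  # indices of the k closest pairs
  }
  idx
}

set.seed(123)
x <- runif(20, -2, 2)
time <- rexp(20)
nn <- knn_mahalanobis_sketch(x, time, k = 5)
dim(nn)                                # 20 rows, 5 columns
```

Row `i` of `nn` lists neighbor indices from nearest to farthest, so `x[nn[i, ]]` and `time[nn[i, ]]` recover the neighboring observations.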

Arguments

x

A numeric vector of length \(n\) giving the covariate values.

time

A numeric vector of length \(n\) giving the observed times.

k

The number of nearest neighbors to find for each observation.

References

Mahalanobis, P. C. (1936). On the generalised distance in statistics. Proceedings of the National Institute of Sciences of India, 2, 49-55.

Examples

library(sicure)

# Some artificial data
set.seed(123)
n <- 50
x <- runif(n, -2, 2) # Covariate values
y <- rweibull(n, shape = 0.5 * (x + 4)) # True lifetimes
c <- rexp(n) # Censoring values
p <- exp(2*x)/(1 + exp(2*x)) # Probability of being susceptible
u <- runif(n)
t  <- ifelse(u < p, pmin(y, c), c) # Observed times
d  <- ifelse(u < p, ifelse(y < c, 1, 0), 0) # Uncensoring indicator
kNN.Mahalanobis(x=x, time=t, k=5)