quickOutlier (version 0.1.0)

detect_multivariate: Detect Multivariate Anomalies (Mahalanobis Distance)

Description

Identifies outliers from the joint behavior of several variables using the Mahalanobis distance. This is useful when each value looks unremarkable on its own but the combination is anomalous (e.g., high weight for low height).
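The idea can be sketched in base R. This is a minimal illustration, not the quickOutlier implementation: the helper name `sketch_mahalanobis_outliers`, the output column `mahalanobis_sq`, and the choice of a chi-square cutoff with one degree of freedom per column are all assumptions made for the example. Only `stats::mahalanobis()`, `stats::cov()`, and `stats::qchisq()` are real functions.

```r
# Hypothetical helper (not part of quickOutlier): flag rows whose squared
# Mahalanobis distance exceeds a chi-square quantile.
sketch_mahalanobis_outliers <- function(data, columns, confidence_level = 0.99) {
  x <- as.matrix(data[, columns, drop = FALSE])
  stopifnot(is.numeric(x))

  center <- colMeans(x)        # per-column means
  covariance <- stats::cov(x)  # sample covariance matrix

  # Note: stats::mahalanobis() returns the *squared* distance.
  d2 <- stats::mahalanobis(x, center, covariance)

  # Assumed cutoff: chi-square quantile, one degree of freedom per column.
  cutoff <- stats::qchisq(confidence_level, df = length(columns))

  flagged <- d2 > cutoff
  out <- data[flagged, , drop = FALSE]
  out$mahalanobis_sq <- d2[flagged]
  out
}

# Simulated data: weight rises with height, plus one short, heavy person.
set.seed(42)
people <- data.frame(height = rnorm(100, mean = 170, sd = 10))
people$weight <- 0.9 * people$height - 85 + rnorm(100, sd = 4)
people <- rbind(people, data.frame(height = 150, weight = 95))

flagged_rows <- sketch_mahalanobis_outliers(people, c("height", "weight"))
print(flagged_rows)
```

The appended row has a height and weight that are each within the observed ranges, but it sits far from the height-weight trend, which is what the distance picks up.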

Usage

detect_multivariate(data, columns, confidence_level = 0.99)

Value

A data frame containing the rows flagged as multivariate outliers, along with their Mahalanobis distances.

Arguments

data

A data frame.

columns

Character vector naming the columns to analyze. The columns themselves must be numeric.

confidence_level

Numeric between 0 and 1. The confidence level that sets the chi-square cutoff for flagging outliers. Defaults to 0.99 (99%).
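To get a feel for how confidence_level moves the threshold, the snippet below tabulates chi-square quantiles for a two-column analysis. It assumes the cutoff is `qchisq(confidence_level, df = number of columns)` applied to the squared distance, which is the usual convention but is not spelled out on this page.

```r
# Assumption: one degree of freedom per analyzed column.
n_columns <- 2
confidence_levels <- c(0.95, 0.99, 0.999)

# Chi-square quantile for each confidence level.
cutoffs <- stats::qchisq(confidence_levels, df = n_columns)

cutoff_table <- data.frame(
  confidence_level = confidence_levels,
  cutoff = cutoffs,
  # Share of clean multivariate-normal rows expected above the cutoff.
  expected_flag_rate = 1 - confidence_levels
)
print(cutoff_table)
```

Under that assumption, a higher confidence_level raises the cutoff and flags fewer rows, while a lower one is more aggressive and will also flag more ordinary observations.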

Examples

library(quickOutlier)

# Generate a dataset (n = 50) in which y is strongly correlated with x
set.seed(123) # arbitrary seed, added so the example is reproducible
df <- data.frame(x = rnorm(50))
df$y <- df$x * 2 + rnorm(50, sd = 0.5) # y depends on x

# Add an anomaly: a typical x paired with a y far from the x-y trend
anomaly <- data.frame(x = 0, y = 10)
df <- rbind(df, anomaly)

# Detect
detect_multivariate(df, columns = c("x", "y"))
