Qploidy (version 1.0.1)

rm_outlier: Identify and Remove Outliers Based on Bonferroni-Holm Adjusted P-values

Description

This function detects and removes outlier observations based on the `theta` column of the input, using externally studentized residuals and the Bonferroni-Holm adjustment for multiple testing. It is typically used during genotype cluster center estimation to clean noisy values.

Usage

rm_outlier(data, alpha = 0.05)

Value

A data.frame containing only the non-outlier observations from the input. If fewer than two non-NA `theta` values are present or if all values are identical, the input is returned unmodified.

Arguments

data

A data.frame containing a `theta` column. This is usually a subset of the full dataset, representing samples within a single genotype class.

alpha

Significance level for identifying outliers (default is `0.05`). Observations with adjusted p-values below this threshold will be removed.
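
To make the role of `alpha` concrete, the short sketch below applies the Bonferroni-Holm adjustment to a made-up vector of raw p-values using `p.adjust()` from base R's `stats` package. The p-values are illustrative only and are not taken from Qploidy output.

```r
# Hypothetical raw p-values for four observations (illustration only)
p_raw <- c(0.001, 0.02, 0.30, 0.04)
alpha <- 0.05

# Holm: sort ascending, multiply the k-th smallest of n by (n - k + 1),
# then enforce monotonicity and cap at 1
p_adj <- p.adjust(p_raw, method = "holm")

# Indices of observations that would count as outliers at this alpha
flagged <- which(p_adj < alpha)
```

Three of the four raw p-values are below 0.05, but after adjustment only the first one remains below the threshold, which is why the adjusted rather than the raw p-values are compared against `alpha`.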

Author

Kaio Olympio

Details

The method fits a constant model (`theta ~ 1`) and computes externally studentized residuals. A p-value is derived for each residual and adjusted with the Bonferroni-Holm procedure; observations whose adjusted p-value is below `alpha` are removed.
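
The procedure described above can be sketched in base R as follows. This is a minimal illustration, not the Qploidy source: the function name `rm_outlier_sketch` and the toy data are hypothetical, `rstudent()` is used for the externally studentized residuals named in the Description, and the two-sided t-distribution p-values and the choice to keep rows with `NA` in `theta` are assumptions.

```r
# Illustrative sketch only; NOT the Qploidy source code.
rm_outlier_sketch <- function(data, alpha = 0.05) {
  idx <- which(!is.na(data$theta))   # rows with usable theta values
  y <- data$theta[idx]

  # Guard described under Value: nothing to test, return input as is
  if (length(y) < 2 || length(unique(y)) == 1) return(data)

  fit <- lm(y ~ 1)                   # constant (intercept-only) model
  res <- rstudent(fit)               # externally studentized residuals

  # Assumption: two-sided p-values from a t distribution with n - 2 df
  p_raw <- 2 * pt(abs(res), df = length(y) - 2, lower.tail = FALSE)
  p_adj <- p.adjust(p_raw, method = "holm")

  # which() skips NA/NaN, so undefined p-values never trigger removal
  drop <- idx[which(p_adj < alpha)]
  if (length(drop) == 0) return(data)
  data[-drop, , drop = FALSE]
}

# Toy genotype class: nine tightly clustered values and one stray sample
toy <- data.frame(
  id = paste0("s", 1:10),
  theta = c(0.50, 0.51, 0.49, 0.52, 0.48, 0.50, 0.51, 0.49, 0.50, 0.95)
)
cleaned <- rm_outlier_sketch(toy)
```

In this toy case the stray sample `s10` is the only row dropped. Because the externally studentized residual for each observation uses a variance estimate that excludes that observation, a single extreme value does not inflate its own scale and is easier to detect.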

This function was originally developed by **Kaio Olympio** and incorporated into the Qploidy workflow.