This function estimates the cluster center of each genotype dosage class from the `theta` values (e.g., allelic ratios or normalized signal intensities). It can impute missing clusters and, optionally, remove outlier samples before the centers are calculated.
get_centers(
ratio_geno,
ploidy,
n.clusters.thr = NULL,
type = c("intensities", "counts"),
rm_outlier = TRUE,
cluster_median = TRUE
)

A named list with the following elements:

- `rm`: Integer flag: `0` (retained), `1` (no clusters found), or `2` (too few clusters).
- `centers_theta`: A numeric vector of cluster center positions on the theta scale.
- `MarkerName`: Marker identifier.
- `n.clusters`: Number of clusters (including imputed ones if applicable).

`ratio_geno`: A data.frame containing the following columns:

- `MarkerName`: Identifier for each marker.
- `SampleName`: Identifier for each sample.
- `theta`: Numeric variable representing allelic ratio or signal intensity.
- `geno`: Integer dosage (e.g., 0, 1, 2 for diploids).

`ploidy`: Integer specifying the organism ploidy (e.g., 2 for diploid).

`n.clusters.thr`: Integer specifying the minimum number of genotype clusters required for a marker to be retained. If fewer clusters are found, missing ones can be imputed depending on the `type`. Defaults to `ploidy + 1` if `NULL`.

`type`: Character string indicating the data source type:

- `"intensities"`: For array-based allele intensities.
- `"counts"`: For sequencing read counts.

Default is `"intensities"`.

`rm_outlier`: Logical; if `TRUE`, outlier samples within genotype clusters are identified and removed prior to center calculation (default: `TRUE`).

`cluster_median`: Logical; if `TRUE`, cluster centers are calculated using the median of `theta` values. If `FALSE`, the mean is used (default: `TRUE`).
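
To make the idea of a per-dosage center concrete, the following base R sketch computes one center per `geno` class from a toy data.frame with the documented columns. It is only an illustration under stated assumptions: the data values are made up, `sketch_centers()` is a hypothetical helper rather than the function documented here, and it does not reproduce outlier removal or cluster imputation.

```r
# Toy input with the documented columns (values are invented for illustration).
ratio_geno <- data.frame(
  MarkerName = "m1",
  SampleName = paste0("s", 1:9),
  theta      = c(0.02, 0.05, 0.03, 0.48, 0.52, 0.50, 0.95, 0.97, 0.99),
  geno       = rep(0:2, each = 3)
)

ploidy <- 2
n.clusters.thr <- ploidy + 1  # the documented default when the argument is NULL

# Hypothetical helper (not part of the package): one center per dosage class,
# using the median or the mean depending on `cluster_median`.
sketch_centers <- function(df, cluster_median = TRUE) {
  center_fun <- if (cluster_median) stats::median else mean
  tapply(df$theta, df$geno, center_fun)
}

centers_median <- sketch_centers(ratio_geno, cluster_median = TRUE)
centers_mean   <- sketch_centers(ratio_geno, cluster_median = FALSE)

# Number of dosage classes observed for this marker, compared with the threshold.
n_found <- length(centers_median)
enough_clusters <- n_found >= n.clusters.thr
```

With a median, a single extreme `theta` value inside a class shifts the center less than it would with a mean, which is one plausible reason for `cluster_median = TRUE` being the default.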