MicrobTiSDA (version 0.1.0)

mclr.transform: Modified Centered Log-Ratio (MCLR) Transformation

Description

Applies a modified centered log-ratio (MCLR) transformation to compositional data. This transformation is particularly useful in microbiome and compositional data analysis, as it normalizes the data by comparing each value to the geometric mean of the positive values in its row.

Usage

mclr.transform(Z, base = exp(1), eps = 0.1)

Value

A data matrix of the same size as Z after the modified centered log-ratio transformation.

Arguments

Z

A numeric matrix or data frame containing the compositional data to be transformed.

base

A numeric value specifying the logarithmic base to use (default is exp(1), i.e., the natural logarithm).

eps

A small positive constant added to the transformed non-zero values to ensure positivity (default is 0.1). Zero entries remain zero.

Details

The MCLR method computes the geometric mean of each sample from its positive proportions only, then normalizes and log-transforms the non-zero components. Specifically, let \(x_{nt} \in \Omega^I\) denote the compositional vector for the sample from subject \(n\) at time point \(t\), where \(\Omega^I\) represents the collection of \(I\) microbial features. For simplicity of illustration, assume that the first \(q\) elements of \(x_{nt}\) are zero and the remaining \(I - q\) elements are non-zero. The transformation can then be expressed as:

$$\text{mclr}_\epsilon(x_{nt}) = \left[0, \dots, 0, \ln\left(\frac{x_{nt(q+1)}}{\tilde{g}(x_{nt})}\right) + \epsilon, \dots, \ln\left(\frac{x_{ntI}}{\tilde{g}(x_{nt})}\right) + \epsilon\right]$$

where \(\tilde{g}(x_{nt}) = \left(\prod_{i=q+1}^{I} x_{nti}\right)^{\frac{1}{I-q}}\) is the geometric mean of the non-zero elements of \(x_{nt}\).

When \(\epsilon = 0\), \(\text{mclr}_0\) corresponds to the centered log-ratio transform applied to the non-zero proportions only. When \(\epsilon > 0\), \(\text{mclr}_\epsilon\) applies a positive shift to all non-zero components. By default, \(\epsilon = 0.1\), which is intended to make all non-zero values strictly positive.

The MCLR transformation is invariant to the addition of extra zero components, preserves the original zero measurements, and is overall rank preserving. For more details, see Yoon et al. (2019).
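The formula above can be sketched in a few lines of base R. This is an illustration only, not the MicrobTiSDA source: `mclr_sketch` is a hypothetical helper that applies the formula literally, row by row, and it assumes non-negative input with no missing values and that `base` is simply the base of the logarithm.

```r
# Hypothetical helper, NOT the MicrobTiSDA implementation: a literal,
# row-wise reading of the MCLR formula in the Details section.
mclr_sketch <- function(Z, base = exp(1), eps = 0.1) {
  Z <- as.matrix(Z)
  # Assumption of this sketch: non-negative data without missing values
  stopifnot(is.numeric(Z), !anyNA(Z), all(Z >= 0))

  # Start from zeros so that zero entries are preserved
  out <- matrix(0, nrow = nrow(Z), ncol = ncol(Z), dimnames = dimnames(Z))

  for (r in seq_len(nrow(Z))) {
    nz <- Z[r, ] > 0               # non-zero components of this sample
    if (!any(nz)) next             # an all-zero row stays all zero
    log_x <- log(Z[r, nz], base = base)
    # log(x / g) = log(x) - mean(log(x)), with g the geometric mean
    # of the non-zero components
    out[r, nz] <- log_x - mean(log_x) + eps
  }
  out
}

Z <- matrix(c(1, 2, 0, 4, 5, 6, 0, 8, 9), nrow = 3, byrow = TRUE)

# eps = 0: centered log-ratio restricted to the non-zero entries
res0 <- mclr_sketch(Z, eps = 0)

# Appending an all-zero feature leaves the original columns unchanged
res_padded <- mclr_sketch(cbind(Z, 0), eps = 0)
same_after_padding <- isTRUE(all.equal(res_padded[, 1:3], res0))
```

With `eps = 0`, the non-zero entries of each row sum to zero, as expected for a centered log-ratio. Under this literal reading, an entry below its row's geometric mean stays negative unless `eps` exceeds the magnitude of its log-ratio; the package's actual handling of the shift may differ.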

References

Yoon, Grace, Irina Gaynanova, and Christian L. Müller. "Microbial networks in SPRING - Semi-parametric rank-based correlation and partial correlation estimation for quantitative microbiome data." Frontiers in Genetics 10 (2019).

Examples

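library(MicrobTiSDA)

# Example data matrix: rows are samples, columns are features
Z <- matrix(c(1, 2, 0, 4, 5, 6, 0, 8, 9), nrow = 3, byrow = TRUE)

# Apply the MCLR transformation with a base-10 logarithm
transformed_Z <- mclr.transform(Z, base = 10, eps = 0.1)

# Inspect the result (same dimensions as Z)
transformed_Z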