
metamorphr (version 0.2.0)

impute_ppca: Impute missing values using Probabilistic PCA

Description

One of several PCA-based imputation methods. It is a thin wrapper around pcaMethods::pca(method = "ppca"). For a detailed discussion, see vignette("pcaMethods") and vignette("missingValues", "pcaMethods") as well as the References section.
In the underlying function (pcaMethods::pca(method = "ppca")), the order of the columns influences the outcome. Therefore, calling pcaMethods::pca(method = "ppca") on a matrix and calling metamorphr::impute_ppca() on a tidy tibble might give different results, even though both contain the same data. This is because the tibble is converted to a matrix before pcaMethods::pca(method = "ppca") is called, and you have limited control over the column order of the resulting matrix.
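The column-order effect described above can be checked directly with pcaMethods. The following is a minimal sketch, assuming pcaMethods is installed; the matrix m, its dimensions and the positions of the missing values are made up for illustration and are not part of metamorphr.

```r
library(pcaMethods)

# Hypothetical 10 x 6 matrix (samples in rows, features in columns)
set.seed(42)
m <- matrix(
  rnorm(60), nrow = 10,
  dimnames = list(paste0("sample", 1:10), paste0("feature", 1:6))
)
# Knock out one value in each column
m[cbind(c(1, 3, 5, 7, 9, 2), 1:6)] <- NA

# Same data, same seed, original vs. reversed column order
fit_a <- pca(m, method = "ppca", nPcs = 2, seed = 1)
fit_b <- pca(m[, ncol(m):1], method = "ppca", nPcs = 2, seed = 1)

imp_a <- completeObs(fit_a)
imp_b <- completeObs(fit_b)[, ncol(m):1]  # restore the original column order

# Largest absolute difference between the two imputed matrices;
# a value above zero means the column order changed the result
max(abs(imp_a - imp_b))
```

If the difference is not zero, the imputed values depend on the column order, which is why results from impute_ppca() and from a hand-built matrix may not match exactly.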

Important Note

impute_ppca() depends on the pcaMethods package from Bioconductor. If metamorphr was installed via install.packages(), dependencies from Bioconductor were not installed automatically. When impute_ppca() is called without pcaMethods installed, you should be asked whether you want to install pak and pcaMethods. Both are required to use impute_ppca(). If the automatic installation fails, please install pcaMethods manually. See the Bioconductor page "pcaMethods – a Bioconductor package providing PCA methods for incomplete data" for instructions on manual installation.
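A manual installation typically goes through BiocManager, the standard installer for Bioconductor packages. This is a sketch of that route, not the procedure metamorphr runs internally (which, per the note above, uses pak).

```r
# Install pcaMethods from Bioconductor only if it is not available yet
if (!requireNamespace("pcaMethods", quietly = TRUE)) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("pcaMethods")
}

# TRUE once the package can be loaded
pca_available <- requireNamespace("pcaMethods", quietly = TRUE)
pca_available
```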

Usage

impute_ppca(
  data,
  n_pcs = 2,
  center = TRUE,
  scale = "none",
  direction = 2,
  random_seed = 1L
)

Value

A tibble with imputed missing values.

Arguments

data

A tidy tibble created by read_featuretable.

n_pcs

The number of PCs to calculate.

center

Should the data be mean-centered? See pcaMethods::prep for details.

scale

Should the data be scaled? See pcaMethods::prep for details.

direction

Either 1 or 2. With 1, the PCA is run on a matrix with samples in columns and features in rows; with 2, it is run on a matrix with features in columns and samples in rows. Both are valid according to a discussion on GitHub, but they give different results.

random_seed

An integer used as seed for the random number generator.
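To illustrate the direction and random_seed arguments, the sketch below imputes the bundled toy_metaboscape data once per direction with a fixed seed. It assumes metamorphr and pcaMethods are installed; the object names by_feature and by_sample are made up for this example.

```r
library(metamorphr)

# direction = 2 (default): features in columns, samples in rows
by_feature <- impute_ppca(toy_metaboscape, direction = 2, random_seed = 1L)

# direction = 1: samples in columns, features in rows
by_sample <- impute_ppca(toy_metaboscape, direction = 1, random_seed = 1L)

# Both results are tibbles of the same shape as the input
dim(by_feature)
dim(by_sample)

# Check whether the two directions produced the same imputed values
isTRUE(all.equal(by_feature, by_sample))
```

Fixing random_seed keeps each call reproducible, so any difference between the two results comes from the orientation of the matrix rather than from the random number generator.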

References

  • W. Stacklies, H. Redestig, pcaMethods, 2017, DOI 10.18129/B9.BIOC.PCAMETHODS.

  • W. Stacklies, H. Redestig, M. Scholz, D. Walther, J. Selbig, Bioinformatics 2007, 23, 1164–1167, DOI 10.1093/bioinformatics/btm069.

Examples

library(metamorphr)  # provides toy_metaboscape and impute_ppca()
toy_metaboscape %>%
  impute_ppca()
