Learn R Programming

readyomics (version 0.1.1)

build_phyloseq: Build phyloseq objects for all taxonomy ranks

Description

Constructs a list of phyloseq objects from a feature matrix (X), sample data, taxonomy and (optionally) phylogenetic tree data.

Usage

build_phyloseq(
  X,
  sample_data,
  taxa_table = NULL,
  phylo_tree = NULL,
  taxa_in_rows,
  verbose = TRUE
)

Value

A named list of phyloseq objects and related output:

asv

Phyloseq object with the raw feature counts (usually ASVs).

<tax_rank>

Phyloseq objects of higher taxonomy ranks from taxa_table.

Arguments

X

A numeric matrix of NGS features (e.g., ASVs), with samples in rows and features in columns (recommended) or vice versa.

sample_data

A data.frame containing sample data. Row names must match sample identifiers in X.

taxa_table

(Optional) A taxonomy table with row names corresponding to feature names in X, and taxonomic ranks as columns.

phylo_tree

(Optional) A phylogenetic tree.

taxa_in_rows

Logical. If TRUE, X is assumed to have taxa as rows and samples as columns.

verbose

Logical. If TRUE, diagnostic messages will be printed.

Details

Phyloseq objects for higher taxonomic ranks are also generated when taxa_table is provided. Higher rank taxa with labels matching "unclass" or "unknown" are excluded after aggregation.

If very long strings are detected as feature IDs in X matrix or taxa_table, (for example when actual DNA sequence is used as ID), it will issue a warning, as this could significantly slow down computation and increase memory usage.

See Also

phyloseq::phyloseq() for further details on phyloseq objects.

Examples

Run this code
if (requireNamespace("phyloseq", quietly = TRUE)) {
mock_X <- matrix(c(10, 0, 5, 3, 1, 7),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("sample1", "sample2"),
                                 c("ASV1", "ASV2", "ASV3"))
                 )

mock_sample_data <- data.frame(sample_id = c("sample1", "sample2"),
                               group = c("A", "B"),
                               row.names = c("sample1", "sample2")
                               )

mock_taxa_table <- data.frame(Domain = c("Bacteria", "Bacteria", "Bacteria"),
                              Genus = c("GenusA", "GenusB", "Unknown"),
                              row.names = c("ASV1", "ASV2", "ASV3")
                              )

phyloseq_ready <- build_phyloseq(X = mock_X,
                                 sample_data = mock_sample_data,
                                 taxa_table = mock_taxa_table,
                                 taxa_in_rows = FALSE,
                                 verbose = FALSE)
}

Run the code above in your browser using DataLab