microtable: Create `microtable` object to store and manage all the basic files.

Description

This class is a wrapper for a series of operations on the input files and basic manipulations, including microtable object creation, data trimming, data filtering, rarefaction based on Paul et al. (2013) <doi:10.1371/journal.pone.0061217>, taxonomic abundance calculation, alpha and beta diversity calculation based on the An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035> and Lozupone et al. (2005) <doi:10.1128/AEM.71.12.8228-8235.2005> and other basic operations.

Online tutorial: https://chiliubio.github.io/microeco_tutorial/
Download tutorial: https://github.com/ChiLiubio/microeco_tutorial/releases

Arguments

Format

microtable.

Methods

Method `new()`

Usage

microtable$new(
  otu_table,
  sample_table = NULL,
  tax_table = NULL,
  phylo_tree = NULL,
  rep_fasta = NULL,
  auto_tidy = FALSE
)

Arguments

otu_table: data.frame; The feature abundance table; rownames are features (e.g. OTUs/ASVs/species/genes); column names are samples.

sample_table

data.frame; default NULL; The sample information table; rownames are samples; columns are sample metadata; If not provided, the function can generate a table automatically according to the sample names in otu_table.

tax_table

data.frame; default NULL; The taxonomic information table; rownames are features; column names are taxonomic classes.

phylo_tree

phylo; default NULL; The phylogenetic tree; use read.tree function in ape package for input.

rep_fasta

DNAStringSet or list format; default NULL; The representative sequences; use read.fasta function in seqinr package or readDNAStringSet function in Biostrings package for input.

auto_tidy

default FALSE; Whether trim the files in the microtable object automatically; If TRUE, running the functions in microtable class can invoke the tidy_dataset function automatically.

Returns

an object of class microtable with the following components:

sample_table: The sample information table.
otu_table: The feature table.
tax_table: The taxonomic table.
phylo_tree: The phylogenetic tree.
rep_fasta: The representative sequence.
taxa_abund: default NULL; use cal_abund function to calculate.
alpha_diversity: default NULL; use cal_alphadiv function to calculate.
beta_diversity: default NULL; use cal_betadiv function to calculate.

Examples

data(otu_table_16S)
data(taxonomy_table_16S)
data(sample_info_16S)
data(phylo_tree_16S)
m1 <- microtable$new(otu_table = otu_table_16S)
m1 <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S, 
  tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S)
# trim the files in the dataset
m1$tidy_dataset()

Method `filter_pollution()`

Filter the features considered pollution in microtable$tax_table. This operation will remove any line of the microtable$tax_table containing any the word in taxa parameter regardless of word case.

Usage

microtable$filter_pollution(taxa = c("mitochondria", "chloroplast"))

Arguments

taxa: default c("mitochondria", "chloroplast"); filter mitochondria and chloroplast, or others as needed.

Returns

None

Examples

m1$filter_pollution(taxa = c("mitochondria", "chloroplast"))

Method `filter_taxa()`

Filter the feature with low abundance and/or low occurrence frequency.

Usage

microtable$filter_taxa(rel_abund = 0, freq = 1, include_lowest = TRUE)

Arguments

rel_abund: default 0; the relative abundance threshold, such as 0.0001.

freq

default 1; the occurrence frequency threshold. For example, the number 2 represents filtering the feature that occurs less than 2 times. A number smaller than 1 is also allowable. For instance, the number 0.1 represents filtering the feature that occurs in less than 10% samples.

include_lowest

default TRUE; whether include the feature with the threshold.

Returns

None

Examples

\donttest{
d1 <- clone(m1)
d1$filter_taxa(rel_abund = 0.0001, freq = 0.2)
}

Method `rarefy_samples()`

Rarefy communities to make all samples have same feature number.

Usage

microtable$rarefy_samples(
  method = c("rarefying", "SRS")[1],
  sample.size = NULL,
  rngseed = 123,
  replace = TRUE
)

Arguments

method: default c("rarefying", "SRS")[1]; "rarefying" represents the classical resampling like rrarefy function of vegan package. "SRS" is scaling with ranked subsampling method based on the SRS package provided by Lukas Beule and Petr Karlovsky (2020) <DOI:10.7717/peerj.9593>.

sample.size

default NULL; feature number, If not provided, use minimum number of all samples.

rngseed

default 123; random seed.

replace

default TRUE; see sample for the random sampling; Available when method = "rarefying".

Returns

None; rarefied dataset.

Examples

\donttest{
m1$rarefy_samples(sample.size = min(m1$sample_sums()), replace = TRUE)
}

Method `tidy_dataset()`

Trim all the data in the microtable object to make taxa and samples consistent. So the results are intersections.

Usage

microtable$tidy_dataset(main_data = FALSE)

Arguments

main_data: default FALSE; if TRUE, only basic data in microtable object is trimmed. Otherwise, all data, including taxa_abund, alpha_diversity and beta_diversity, are all trimed.

Returns

None, object of microtable itself cleaned up.

Examples

m1$tidy_dataset(main_data = TRUE)

Method `add_rownames2taxonomy()`

Add the rownames of microtable$tax_table as its last column. This is especially useful when the rownames of microtable$tax_table are required as a taxonomic level for the taxonomic abundance calculation and biomarker idenfification.

Usage

microtable$add_rownames2taxonomy(use_name = "OTU")

Arguments

use_name: default "OTU"; The column name used in the tax_table.

Returns

NULL, a new tax_table stored in the object.

Examples

\donttest{
m1$add_rownames2taxonomy()
}

Method `sample_sums()`

Sum the species number for each sample.

Usage

microtable$sample_sums()

Returns

species number of samples.

Examples

\donttest{
m1$sample_sums()
}

Method `taxa_sums()`

Sum the species number for each taxa.

Usage

microtable$taxa_sums()

Returns

species number of taxa.

Examples

\donttest{
m1$taxa_sums()
}

Method `sample_names()`

Show sample names.

Usage

microtable$sample_names()

Returns

sample names.

Examples

\donttest{
m1$sample_names()
}

Method `taxa_names()`

Show taxa names of tax_table.

Usage

microtable$taxa_names()

Returns

taxa names.

Examples

\donttest{
m1$taxa_names()
}

Method `rename_taxa()`

Rename the features, including the rownames of otu_table, rownames of tax_table, tip labels of phylo_tree and rep_fasta.

Usage

microtable$rename_taxa(newname_prefix = "ASV_")

Arguments

newname_prefix: default "ASV_"; the prefix of new names; new names will be newname_prefix + numbers according to the rownames order of otu_table.

Returns

None; renamed dataset.

Examples

\donttest{
m1$rename_taxa()
}

Method `merge_samples()`

Merge samples according to specific group to generate a new microtable.

Usage

microtable$merge_samples(use_group)

Arguments

use_group: the group column in sample_table.

Returns

a new merged microtable object.

Examples

\donttest{
m1$merge_samples(use_group = "Group")
}

Method `merge_taxa()`

Merge taxa according to specific taxonomic rank to generate a new microtable.

Usage

microtable$merge_taxa(taxa = "Genus")

Arguments

taxa: default "Genus"; the specific rank in tax_table.

Returns

a new merged microtable object.

Examples

\donttest{
m1$merge_taxa(taxa = "Genus")
}

Method `save_table()`

Save each basic data in microtable object as local file.

Usage

microtable$save_table(dirpath = "basic_files", sep = ",", ...)

Arguments

dirpath: default "basic_files"; directory to save the tables, phylogenetic tree and sequences in microtable object. It will be created if not found.

sep

default ","; the field separator string, used to save tables. Same with sep parameter in write.table function. default ',' correspond to the file name suffix 'csv'. The option '\t' correspond to the file name suffix 'tsv'. For other options, suffix are all 'txt'.

...

parameters passed to write.table.

Examples

\dontrun{
m1$save_table()
}

Method `cal_abund()`

Calculate the taxonomic abundance at each taxonomic level or selected levels.

Usage

microtable$cal_abund(
  select_cols = NULL,
  rel = TRUE,
  merge_by = "|",
  split_group = FALSE,
  split_by = "&&",
  split_column = NULL
)

Arguments

select_cols: default NULL; numeric vector or character vector of colnames of microtable$tax_table; applied to select columns to merge and calculate abundances according to ordered hierarchical levels. This is very useful if there are commented columns or some columns with multiple structure that cannot be used directly.

rel

default TRUE; if TRUE, relative abundance is used; if FALSE, absolute abundance (i.e. raw values) will be summed.

merge_by

default "|"; the symbol to merge and concatenate taxonomic names of different levels.

split_group

default FALSE; if TRUE, split the rows to multiple rows according to one or more columns in tax_table. Very useful when multiple mapping information exist.

split_by

default "&&"; Separator delimiting collapsed values; only useful when split_group = TRUE; see sep parameter in separate_rows function of tidyr package.

split_column

default NULL; character vector or list; only useful when split_group = TRUE; character vector: fixed column or columns used for the splitting in tax_table for each abundance calculation; list: containing more character vectors to assign the column names to each calculation, such as list(c("Phylum"), c("Phylum", "Class")).

Returns

taxa_abund list in object.

Examples

\donttest{
m1$cal_abund()
}

Method `save_abund()`

Save taxonomic abundance as local file.

Usage

microtable$save_abund(
  dirpath = "taxa_abund",
  merge_all = FALSE,
  rm_un = FALSE,
  rm_pattern = "__$",
  sep = ",",
  ...
)

Arguments

dirpath: default "taxa_abund"; directory to save the taxonomic abundance files. It will be created if not found.

merge_all

default FALSE; Whether merge all tables into one. The merged file format is generally called 'mpa' style.

rm_un

default FALSE; Whether remove unclassified taxa in which the name ends with '__' generally.

rm_pattern

default "__$"; The pattern searched through the merged taxonomic names. See also pattern parameter in grepl function. Only available when rm_un = TRUE. The default "__$" means removing the names end with '__'.

sep

default ","; the field separator string. Same with sep parameter in write.table function. default ',' correspond to the file name suffix 'csv'. The option '\t' correspond to the file name suffix 'tsv'. For other options, suffix are all 'txt'.

...

parameters passed to write.table.

Examples

\dontrun{
m1$save_abund(dirpath = "taxa_abund")
m1$save_abund(merge_all = TRUE, rm_un = TRUE, sep = "\t")
}

Method `cal_alphadiv()`

Calculate alpha diversity.

Usage

microtable$cal_alphadiv(measures = NULL, PD = FALSE)

Arguments

measures: default NULL; one or more indexes of c("Observed", "Coverage", "Chao1", "ACE", "Shannon", "Simpson", "InvSimpson", "Fisher", "PD"); If null, use all those measures. 'Shannon', 'Simpson' and 'InvSimpson' are calculated based on vegan::diversity function; 'Chao1' and 'ACE' depend on the function vegan::estimateR; 'PD' depends on the function picante::pd.

PD

default FALSE; whether Faith's phylogenetic diversity should be calculated.

Returns

alpha_diversity stored in object.

Examples

\donttest{
m1$cal_alphadiv(measures = NULL, PD = FALSE)
class(m1$alpha_diversity)
}

Method `save_alphadiv()`

Save alpha diversity table to the computer.

Usage

microtable$save_alphadiv(dirpath = "alpha_diversity")

Arguments

dirpath: default "alpha_diversity"; directory name to save the alpha_diversity.csv file.

Method `cal_betadiv()`

Calculate beta diversity, including Bray-Curtis, Jaccard, and UniFrac. See An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035> and Lozupone et al. (2005) <doi:10.1128/AEM.71.12.8228–8235.2005>.

Usage

microtable$cal_betadiv(method = NULL, unifrac = FALSE, binary = FALSE, ...)

Arguments

method: default NULL; a character vector with one or more elements; If default, "bray" and "jaccard" will be used; see vegdist function and method parameter in vegan package.

unifrac

default FALSE; whether UniFrac index should be calculated.

binary

default FALSE; TRUE is used for jaccard and unweighted unifrac; optional for other indexes.

...

parameters passed to vegdist function.

Returns

beta_diversity list stored in object.

Examples

\donttest{
m1$cal_betadiv(unifrac = FALSE)
class(m1$beta_diversity)
}

Method `save_betadiv()`

Save beta diversity matrix to the computer.

Usage

microtable$save_betadiv(dirpath = "beta_diversity")

Arguments

dirpath: default "beta_diversity"; directory name to save the beta diversity matrix files.

Method `print()`

Print the microtable object.

Usage

microtable$print()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

microtable$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

Run this code


## ------------------------------------------------
## Method `microtable$new`
## ------------------------------------------------

data(otu_table_16S)
data(taxonomy_table_16S)
data(sample_info_16S)
data(phylo_tree_16S)
m1 <- microtable$new(otu_table = otu_table_16S)
m1 <- microtable$new(sample_table = sample_info_16S, otu_table = otu_table_16S, 
  tax_table = taxonomy_table_16S, phylo_tree = phylo_tree_16S)
# trim the files in the dataset
m1$tidy_dataset()

## ------------------------------------------------
## Method `microtable$filter_pollution`
## ------------------------------------------------

m1$filter_pollution(taxa = c("mitochondria", "chloroplast"))

## ------------------------------------------------
## Method `microtable$filter_taxa`
## ------------------------------------------------

# \donttest{
d1 <- clone(m1)
d1$filter_taxa(rel_abund = 0.0001, freq = 0.2)
# }

## ------------------------------------------------
## Method `microtable$rarefy_samples`
## ------------------------------------------------

# \donttest{
m1$rarefy_samples(sample.size = min(m1$sample_sums()), replace = TRUE)
# }

## ------------------------------------------------
## Method `microtable$tidy_dataset`
## ------------------------------------------------

m1$tidy_dataset(main_data = TRUE)

## ------------------------------------------------
## Method `microtable$add_rownames2taxonomy`
## ------------------------------------------------

# \donttest{
m1$add_rownames2taxonomy()
# }

## ------------------------------------------------
## Method `microtable$sample_sums`
## ------------------------------------------------

# \donttest{
m1$sample_sums()
# }

## ------------------------------------------------
## Method `microtable$taxa_sums`
## ------------------------------------------------

# \donttest{
m1$taxa_sums()
# }

## ------------------------------------------------
## Method `microtable$sample_names`
## ------------------------------------------------

# \donttest{
m1$sample_names()
# }

## ------------------------------------------------
## Method `microtable$taxa_names`
## ------------------------------------------------

# \donttest{
m1$taxa_names()
# }

## ------------------------------------------------
## Method `microtable$rename_taxa`
## ------------------------------------------------

# \donttest{
m1$rename_taxa()
# }

## ------------------------------------------------
## Method `microtable$merge_samples`
## ------------------------------------------------

# \donttest{
m1$merge_samples(use_group = "Group")
# }

## ------------------------------------------------
## Method `microtable$merge_taxa`
## ------------------------------------------------

# \donttest{
m1$merge_taxa(taxa = "Genus")
# }

## ------------------------------------------------
## Method `microtable$save_table`
## ------------------------------------------------

if (FALSE) {
m1$save_table()
}

## ------------------------------------------------
## Method `microtable$cal_abund`
## ------------------------------------------------

# \donttest{
m1$cal_abund()
# }

## ------------------------------------------------
## Method `microtable$save_abund`
## ------------------------------------------------

if (FALSE) {
m1$save_abund(dirpath = "taxa_abund")
m1$save_abund(merge_all = TRUE, rm_un = TRUE, sep = "\t")
}

## ------------------------------------------------
## Method `microtable$cal_alphadiv`
## ------------------------------------------------

# \donttest{
m1$cal_alphadiv(measures = NULL, PD = FALSE)
class(m1$alpha_diversity)
# }

## ------------------------------------------------
## Method `microtable$cal_betadiv`
## ------------------------------------------------

# \donttest{
m1$cal_betadiv(unifrac = FALSE)
class(m1$beta_diversity)
# }

Run the code above in your browser using DataLab

Description

Arguments

Format

Methods

Public methods

Method new()

Usage

Arguments

Returns

Examples

Method filter_pollution()

Usage

Arguments

Returns

Examples

Method filter_taxa()

Usage

Arguments

Returns

Examples

Method rarefy_samples()

Usage

Arguments

Returns

Examples

Method tidy_dataset()

Usage

Arguments

Returns

Examples

Method add_rownames2taxonomy()

Usage

Arguments

Returns

Examples

Method sample_sums()

Usage

Returns

Examples

Method taxa_sums()

Usage

Returns

Examples

Method sample_names()

Usage

Returns

Examples

Method taxa_names()

Usage

Returns

Examples

Method rename_taxa()

Usage

Arguments

Returns

Examples

Method merge_samples()

Usage

Arguments

Returns

Examples

Method merge_taxa()

Usage

Arguments

Returns

Examples

Method save_table()

Usage

Arguments

Examples

Method cal_abund()

Usage

Arguments

Returns

Examples

Method save_abund()

Usage

Arguments

Examples

Method cal_alphadiv()

Method `new()`

Method `filter_pollution()`

Method `filter_taxa()`

Method `rarefy_samples()`

Method `tidy_dataset()`

Method `add_rownames2taxonomy()`

Method `sample_sums()`

Method `taxa_sums()`

Method `sample_names()`

Method `taxa_names()`

Method `rename_taxa()`

Method `merge_samples()`

Method `merge_taxa()`

Method `save_table()`

Method `cal_abund()`

Method `save_abund()`

Method `cal_alphadiv()`

Method `save_alphadiv()`

Method `cal_betadiv()`

Method `save_betadiv()`

Method `print()`

Method `clone()`