This class is a wrapper for a series of differential abundance tests and indicator analysis methods, including the non-parametric Kruskal-Wallis Rank Sum Test, Dunn's Kruskal-Wallis Multiple Comparisons based on the FSA package, LEfSe based on Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>, random forest <doi:10.1016/j.geoderma.2018.09.035>, metastat based on White et al. (2009) <doi:10.1371/journal.pcbi.1000352>, and the method in the R package metagenomeSeq, Paulson et al. (2013) <doi:10.1038/nmeth.2658>.
Authors: Chi Liu, Yang Cao, Chenhao Li
new()
trans_diff$new( dataset = NULL, method = c("lefse", "rf", "KW", "KW_dunn", "metastat", "mseq")[1], group = NULL, taxa_level = "all", filter_thres = 0, lefse_subgroup = NULL, alpha = 0.05, lefse_min_subsam = 10, lefse_norm = 1e+06, nresam = 0.6667, boots = 30, rf_ntree = 1000, metastat_taxa_level = "Genus", group_choose_paired = NULL, mseq_adjustMethod = "fdr", mseq_count = 1, ... )
dataset
the object of the microtable class.
method
default "lefse"; one of "lefse", "rf", "KW", "KW_dunn", "metastat" or "mseq"; details:
'lefse': LEfSe, from Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>
'rf': random forest, from An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035>
'KW': Kruskal-Wallis Rank Sum Test (groups > 2) or Wilcoxon Rank Sum Test (groups = 2) for a specific taxonomic level or all levels of microtable$taxa_abund
'KW_dunn': Dunn's Kruskal-Wallis Multiple Comparisons based on the FSA package
'metastat': from White et al. (2009) <doi:10.1371/journal.pcbi.1000352>
'mseq': the method based on the zero-inflated log-normal model in the metagenomeSeq package.
group
default NULL; sample group used for the main comparison.
taxa_level
default "all"; use abundance data at all taxonomic ranks; to test at a specific rank, provide the taxonomic rank name, such as "Genus".
filter_thres
default 0; the relative abundance threshold used for method = "lefse" or "rf".
lefse_subgroup
default NULL; sample subgroup used for the sub-comparison in lefse; see Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>.
alpha
default .05; significance threshold.
lefse_min_subsam
default 10; minimum number of samples required in the subgroup test.
lefse_norm
default 1000000; scale value in lefse.
nresam
default 0.6667; proportion of samples used in each bootstrap for LEfSe or random forest.
boots
default 30; bootstrap test number for lefse or rf.
rf_ntree
default 1000; see ntree in randomForest function of randomForest package.
metastat_taxa_level
default "Genus"; taxonomic rank level used in metastat test; White et al. (2009) <doi:10.1371/journal.pcbi.1000352>.
group_choose_paired
default NULL; a vector used for selecting the required groups for paired testing, only used for metastat or mseq.
mseq_adjustMethod
default "fdr"; method used to adjust p-values; options include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
mseq_count
default 1; filter features to have at least 'count' counts; see the count parameter in the MRcoefs function of the metagenomeSeq package.
...
parameters passed to kruskal.test function or wilcox.test function (method = "KW") or dunnTest function of FSA package (method = "KW_dunn").
res_rf, res_lefse, res_abund, res_metastat, or res_mseq in trans_diff object, depending on the method.
\donttest{
data(dataset)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")
}
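For method = "KW", the test applied depends on the number of groups. The rule can be sketched in base R as follows. This is only an illustration on made-up data, not the package's implementation: the helper test_one_taxon and the toy table are hypothetical. The last step relies on the fact that the options listed for mseq_adjustMethod are the same names accepted by base R's p.adjust.

```r
# Toy abundance table: rows are samples, columns are taxa (made-up data).
set.seed(1)
abund <- data.frame(
  g__A = c(rnorm(5, 10), rnorm(5, 20), rnorm(5, 30)),
  g__B = rnorm(15, 5)
)
group <- factor(rep(c("CW", "IW", "TW"), each = 5))

# Hypothetical helper: choose the test from the number of groups.
test_one_taxon <- function(x, group) {
  group <- droplevels(group)
  lv <- levels(group)
  if (length(lv) > 2) {
    # More than two groups: Kruskal-Wallis Rank Sum Test
    kruskal.test(x, group)$p.value
  } else {
    # Exactly two groups: Wilcoxon Rank Sum Test
    wilcox.test(x[group == lv[1]], x[group == lv[2]])$p.value
  }
}

# Three groups: one p-value per taxon
p_raw <- vapply(abund, test_one_taxon, numeric(1), group = group)

# Two groups only: the Wilcoxon branch is used
keep <- group %in% c("CW", "IW")
p_two <- test_one_taxon(abund$g__A[keep], group[keep])

# Multiple-testing adjustment; "fdr" is an alias of "BH" in p.adjust
p_adj <- p.adjust(p_raw, method = "fdr")
```

Comparing p_raw against alpha (default 0.05) then mirrors the significance threshold described above.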
plot_diff_abund()
Plotting the abundance of differential taxa.
trans_diff$plot_diff_abund( method = NULL, only_abund_plot = TRUE, use_number = 1:10, color_values = RColorBrewer::brewer.pal(8, "Dark2"), plot1_bar_color = "grey50", plot2_sig_color = "red", plot2_sig_size = 1.2, axis_text_y = 10, simplify_names = TRUE, keep_prefix = TRUE, group_order = NULL, plot2_barwidth = 0.9, add_significance = TRUE, use_se = TRUE )
method
default NULL; "rf" or "lefse"; if NULL, the method is detected automatically from the result.
only_abund_plot
default TRUE; if TRUE, return only the abundance plot; if FALSE, return both the indicator plot and the abundance plot.
use_number
default 1:10; vector, the taxa numbers used in the plot, 1:n.
color_values
colors for presentation.
plot1_bar_color
default "grey50"; the color for plot 1.
plot2_sig_color
default "red"; the color for the significance in plot 2.
plot2_sig_size
default 1.2; the size for the significance in plot 2.
axis_text_y
default 10; the size for the y axis text.
simplify_names
default TRUE; whether to use the simplified taxonomic name.
keep_prefix
default TRUE; whether to retain the taxonomic prefix.
group_order
default NULL; a vector to order the legend in plot.
plot2_barwidth
default .9; the bar width in plot 2.
add_significance
default TRUE; whether to add the significance asterisk; only available when only_abund_plot is FALSE.
use_se
default TRUE; whether to use SE in plot 2; if FALSE, use SD.
ggplot.
\donttest{ t1$plot_diff_abund(use_number = 1:10) }
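The use_se switch selects between standard error and standard deviation for the spread shown in plot 2. A minimal base R sketch of the difference, on made-up abundances; the variable names are illustrative and this is not the package's plotting code:

```r
# Made-up abundances of one taxon in two groups of five samples
abund <- c(12.1, 9.8, 11.4, 10.3, 13.0, 20.5, 19.2, 22.8, 21.1, 18.9)
group <- rep(c("CW", "TW"), each = 5)

group_mean <- tapply(abund, group, mean)
group_sd   <- tapply(abund, group, sd)
group_n    <- tapply(abund, group, length)

# Standard error of the mean: SD divided by the square root of the sample size
group_se <- group_sd / sqrt(group_n)

use_se <- TRUE
err <- if (use_se) group_se else group_sd

# Error-bar limits per group
bars <- data.frame(
  group = names(group_mean),
  mean  = as.vector(group_mean),
  lower = as.vector(group_mean - err),
  upper = as.vector(group_mean + err)
)
```

With more than one sample per group the SE is always smaller than the SD, so use_se = FALSE gives wider error bars.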
plot_lefse_bar()
Bar plot for LDA score.
trans_diff$plot_lefse_bar( use_number = 1:10, color_values = RColorBrewer::brewer.pal(8, "Dark2"), LDA_score = NULL, simplify_names = TRUE, keep_prefix = TRUE, group_order = NULL, axis_text_y = 12, plot_vertical = TRUE, ... )
use_number
default 1:10; vector, the taxa numbers used in the plot, 1:n.
color_values
colors for presentation.
LDA_score
default NULL; numeric threshold for the LDA score, such as 2; the selection is still limited by use_number.
simplify_names
default TRUE; whether to use the simplified taxonomic name.
keep_prefix
default TRUE; whether to retain the taxonomic prefix.
group_order
default NULL; a vector to order the legend in plot.
axis_text_y
default 12; the size for the y axis text.
plot_vertical
default TRUE; whether to use a vertical bar plot; if FALSE, use a horizontal one.
...
parameters passed to geom_bar.
ggplot.
\donttest{ t1$plot_lefse_bar(LDA_score = 4) }
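The interaction between LDA_score and use_number can be pictured as a threshold followed by a positional selection. The sketch below assumes a result table sorted by decreasing LDA score. The data frame, its column names (Taxa, Group, LDA) and the helper select_taxa are hypothetical and may differ from the actual structure of res_lefse.

```r
# Hypothetical LEfSe-like result table (made-up values)
res <- data.frame(
  Taxa  = paste0("g__Taxon", 1:8),
  Group = rep(c("CW", "IW"), 4),
  LDA   = c(4.8, 4.5, 4.2, 3.9, 3.6, 3.1, 2.7, 2.2)
)

# Hypothetical helper: apply the LDA threshold, then keep the rows in use_number
select_taxa <- function(res, use_number = 1:10, LDA_score = NULL) {
  res <- res[order(res$LDA, decreasing = TRUE), , drop = FALSE]
  if (!is.null(LDA_score)) {
    res <- res[res$LDA >= LDA_score, , drop = FALSE]
  }
  # Drop positions beyond the number of remaining rows
  res[intersect(use_number, seq_len(nrow(res))), , drop = FALSE]
}

top_all   <- select_taxa(res)                                  # no threshold
top_lda4  <- select_taxa(res, LDA_score = 4)                   # threshold only
top_two   <- select_taxa(res, use_number = 1:2, LDA_score = 4) # threshold, then first two
```

Under these assumptions, a threshold that leaves fewer taxa than use_number requests simply returns all remaining taxa.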
plot_lefse_cladogram()
Plot the cladogram for the LEfSe result, similar to the Python version. The code is modified from microbiomeMarker.
trans_diff$plot_lefse_cladogram( color = RColorBrewer::brewer.pal(8, "Dark2"), use_taxa_num = 200, filter_taxa = NULL, use_feature_num = NULL, group_order = NULL, clade_label_level = 4, select_show_labels = NULL, only_select_show = FALSE, sep = "|", branch_size = 0.2, alpha = 0.2, clade_label_size = 2, clade_label_size_add = 5, clade_label_size_log = exp(1), node_size_scale = 1, node_size_offset = 1, annotation_shape = 22, annotation_shape_size = 5 )
color
default RColorBrewer::brewer.pal(8, "Dark2"); color used in the plot.
use_taxa_num
default 200; integer; the number of taxa used in the background tree plot; taxa are selected according to mean abundance.
filter_taxa
default NULL; the mean relative abundance threshold used to filter out taxa with low abundance.
use_feature_num
default NULL; integer; the number of features used in the plot; features are selected according to the LDA score.
group_order
default NULL; a vector to order the legend in plot.
clade_label_level
default 4; the taxonomic level for marking the label with letters, root is the largest
select_show_labels
default NULL; character vector; the features to show in the plot with full label names instead of letters.
only_select_show
default FALSE; whether to use only the features selected in the parameter select_show_labels.
sep
default "|"; the separator character in the taxonomic information.
branch_size
default 0.2; numeric, size of branch.
alpha
default 0.2; shading of the color
clade_label_size
default 2; basic size for the clade label; please also see clade_label_size_add and clade_label_size_log
clade_label_size_add
default 5; added basic size for the clade label; see the formula in clade_label_size_log parameter.
clade_label_size_log
default exp(1); the base of the log function for the added size of the clade label; the size formula is clade_label_size + log(clade_label_level + clade_label_size_add, base = clade_label_size_log); together, clade_label_size_log, clade_label_size_add and clade_label_size fully control the label size for different taxonomic levels.
node_size_scale
default 1; scale for the node size
node_size_offset
default 1; offset for the node size
annotation_shape
default 22; shape used in the annotation legend
annotation_shape_size
default 5; size used in the annotation legend
ggplot.
\donttest{ t1$plot_lefse_cladogram(use_taxa_num = 100, use_feature_num = 30, select_show_labels = NULL) }
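The size formula given for clade_label_size_log can be evaluated directly to see how the three parameters interact. The sketch below plugs numeric level values into the documented formula with the default parameter values. The function label_size is hypothetical, and how the package numbers the taxonomic levels internally is not specified here.

```r
# Documented formula:
# clade_label_size + log(clade_label_level + clade_label_size_add, base = clade_label_size_log)
label_size <- function(level,
                       clade_label_size = 2,
                       clade_label_size_add = 5,
                       clade_label_size_log = exp(1)) {
  clade_label_size + log(level + clade_label_size_add, base = clade_label_size_log)
}

levels_to_check <- 1:6

# Default parameters
size_default <- label_size(levels_to_check)

# A larger log base flattens the differences between levels
size_flat <- label_size(levels_to_check, clade_label_size_log = 10)

round(data.frame(level = levels_to_check, size_default, size_flat), 3)
```

Raising clade_label_size shifts all labels by the same amount, while clade_label_size_add and clade_label_size_log change how strongly the size varies between levels.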
plot_metastat()
Bar plot for metastat.
trans_diff$plot_metastat( use_number = 1:10, color_values = RColorBrewer::brewer.pal(8, "Dark2"), qvalue = 0.05, choose_group = 1 )
use_number
default 1:10; vector, the taxa numbers used in the plot, 1:n.
color_values
colors for presentation.
qvalue
default .05; numeric value as the threshold of q value.
choose_group
default 1; which column in res_metastat_group_matrix will be used.
ggplot.
\donttest{
t1 <- trans_diff$new(dataset = dataset, method = "metastat", group = "Group")
t1$plot_metastat(use_number = 1:10, qvalue = 0.05, choose_group = 1)
}
print()
Print the trans_diff object.
trans_diff$print()
clone()
The objects of this class are cloneable with this method.
trans_diff$clone(deep = FALSE)
deep
Whether to make a deep clone.
## ------------------------------------------------
## Method `trans_diff$new`
## ------------------------------------------------
data(dataset)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")

## ------------------------------------------------
## Method `trans_diff$plot_diff_abund`
## ------------------------------------------------
t1$plot_diff_abund(use_number = 1:10)

## ------------------------------------------------
## Method `trans_diff$plot_lefse_bar`
## ------------------------------------------------
t1$plot_lefse_bar(LDA_score = 4)

## ------------------------------------------------
## Method `trans_diff$plot_lefse_cladogram`
## ------------------------------------------------
t1$plot_lefse_cladogram(use_taxa_num = 100, use_feature_num = 30, select_show_labels = NULL)

## ------------------------------------------------
## Method `trans_diff$plot_metastat`
## ------------------------------------------------
t1 <- trans_diff$new(dataset = dataset, method = "metastat", group = "Group")
t1$plot_metastat(use_number = 1:10, qvalue = 0.05, choose_group = 1)