This class is a wrapper for a series of differential abundance tests and indicator analysis methods, including the non-parametric Kruskal-Wallis Rank Sum Test, Dunn's Kruskal-Wallis Multiple Comparisons based on the FSA package, LEfSe based on Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>, random forest <doi:10.1016/j.geoderma.2018.09.035>, metastat based on White et al. (2009) <doi:10.1371/journal.pcbi.1000352> and the method in the R package metagenomeSeq, Paulson et al. (2013) <doi:10.1038/nmeth.2658>.
Authors: Chi Liu, Yang Cao, Chenhao Li
new()
trans_diff$new(
dataset = NULL,
method = c("lefse", "rf", "KW", "KW_dunn", "metastat", "mseq")[1],
group = NULL,
taxa_level = "all",
filter_thres = 0,
lefse_subgroup = NULL,
alpha = 0.05,
lefse_min_subsam = 10,
lefse_norm = 1e+06,
nresam = 0.6667,
boots = 30,
rf_ntree = 1000,
metastat_taxa_level = "Genus",
group_choose_paired = NULL,
mseq_adjustMethod = "fdr",
mseq_count = 1,
...
)
dataset: the object of microtable class.
method: default "lefse"; one of "lefse", "rf", "KW", "KW_dunn", "metastat" or "mseq"; details:
  "lefse": LEfSe, from Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>
  "rf": random forest, from An et al. (2019) <doi:10.1016/j.geoderma.2018.09.035>
  "KW": Kruskal-Wallis Rank Sum Test (groups > 2) or Wilcoxon Rank Sum Test (groups = 2) for a specific taxonomic level or all levels of microtable$taxa_abund
  "KW_dunn": Dunn's Kruskal-Wallis Multiple Comparisons based on the FSA package
  "metastat": metastat, from White et al. (2009) <doi:10.1371/journal.pcbi.1000352>
  "mseq": the method based on the zero-inflated log-normal model in the metagenomeSeq package
group: default NULL; sample group used for the main comparison.
taxa_level: default "all"; use abundance data at all taxonomic ranks. To test at a specific rank, provide the taxonomic rank name, such as "Genus".
filter_thres: default 0; the relative abundance threshold used for method = "lefse" or "rf".
lefse_subgroup: default NULL; sample subgroup used for the sub-comparison in lefse; see Segata et al. (2011) <doi:10.1186/gb-2011-12-6-r60>.
alpha: default 0.05; significance threshold.
lefse_min_subsam: default 10; the number of samples required in the subgroup test.
lefse_norm: default 1000000; scale value in lefse.
nresam: default 0.6667; the ratio of samples used in each bootstrap for method = "lefse" or "rf".
boots: default 30; the number of bootstrap tests for method = "lefse" or "rf".
rf_ntree: default 1000; see ntree in the randomForest function of the randomForest package.
metastat_taxa_level: default "Genus"; taxonomic rank level used in the metastat test; see White et al. (2009) <doi:10.1371/journal.pcbi.1000352>.
group_choose_paired: default NULL; a vector used to select the groups for paired testing; only used for method = "metastat" or "mseq".
mseq_adjustMethod: default "fdr"; the method used to adjust p-values. Options include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
mseq_count: default 1; keep only features with at least this many counts; see the count parameter in the MRcoefs function of the metagenomeSeq package.
...: parameters passed to the kruskal.test or wilcox.test function (method = "KW") or to the dunnTest function of the FSA package (method = "KW_dunn").
Returns: res_rf, res_lefse, res_abund, res_metastat, or res_mseq stored in the trans_diff object, depending on the method.
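The option names listed for mseq_adjustMethod coincide with the correction methods of base R's p.adjust function. The following is a minimal sketch of how those corrections differ, using made-up p-values; it only calls stats::p.adjust and is not the internal code of this class or of metagenomeSeq.

```r
# Hypothetical raw p-values for five taxa (illustrative data only)
p_raw <- c(0.001, 0.012, 0.030, 0.041, 0.200)
names(p_raw) <- paste0("taxon_", seq_along(p_raw))

# The option names from the mseq_adjustMethod description
adjust_methods <- c("holm", "hochberg", "hommel", "bonferroni",
                    "BH", "BY", "fdr", "none")

# One column of adjusted p-values per correction method
p_adj <- sapply(adjust_methods, function(m) p.adjust(p_raw, method = m))
print(round(p_adj, 4))
```

In p.adjust, "fdr" is an alias for "BH", so those two columns are identical, and "none" returns the raw p-values unchanged.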
\donttest{
data(dataset)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")
}
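For method = "KW", the description above states that the Kruskal-Wallis test is used for more than two groups and the Wilcoxon rank sum test for exactly two. A minimal base R sketch of that choice follows; the helper name rank_test and the simulated data are assumptions made for illustration, not part of the package.

```r
set.seed(1)
# Hypothetical abundance of a single taxon in 18 samples from three groups
abund <- c(rnorm(6, mean = 10), rnorm(6, mean = 12), rnorm(6, mean = 15))
group <- factor(rep(c("A", "B", "C"), each = 6))

# rank_test is a hypothetical helper: it picks the test from the group count
# and forwards extra arguments, similar in spirit to the `...` parameter
rank_test <- function(x, g, ...) {
  g <- droplevels(factor(g))
  if (nlevels(g) > 2) {
    kruskal.test(x, g, ...)
  } else {
    wilcox.test(x ~ g, ...)
  }
}

# Three groups: Kruskal-Wallis
res3 <- rank_test(abund, group)
# Two groups: Wilcoxon rank sum
keep <- group != "C"
res2 <- rank_test(abund[keep], group[keep])

print(res3$method)
print(res2$method)
```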
plot_diff_abund()
Plot the abundance of differential taxa.
trans_diff$plot_diff_abund(
method = NULL,
only_abund_plot = TRUE,
use_number = 1:10,
color_values = RColorBrewer::brewer.pal(8, "Dark2"),
plot1_bar_color = "grey50",
plot2_sig_color = "red",
plot2_sig_size = 1.2,
axis_text_y = 10,
simplify_names = TRUE,
keep_prefix = TRUE,
group_order = NULL,
plot2_barwidth = 0.9,
add_significance = TRUE,
use_se = TRUE
)
method: default NULL; "rf" or "lefse"; if NULL, the method is detected automatically from the result.
only_abund_plot: default TRUE; if TRUE, return only the abundance plot; if FALSE, return both the indicator plot and the abundance plot.
use_number: default 1:10; vector; the taxa numbers used in the plot, i.e. 1:n.
color_values: default RColorBrewer::brewer.pal(8, "Dark2"); colors used in the plot.
plot1_bar_color: default "grey50"; the bar color in plot 1.
plot2_sig_color: default "red"; the color of the significance label in plot 2.
plot2_sig_size: default 1.2; the size of the significance label in plot 2.
axis_text_y: default 10; the size of the y axis text.
simplify_names: default TRUE; whether to use the simplified taxonomic name.
keep_prefix: default TRUE; whether to retain the taxonomic prefix.
group_order: default NULL; a vector used to order the legend in the plot.
plot2_barwidth: default 0.9; the bar width in plot 2.
add_significance: default TRUE; whether to add the significance asterisk; only available when only_abund_plot = FALSE.
use_se: default TRUE; whether to use SE in plot 2; if FALSE, use SD.
Returns: ggplot.
\donttest{
t1$plot_diff_abund(use_number = 1:10)
}
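The use_se parameter switches the error bars in plot 2 between the standard error (SE) and the standard deviation (SD). The sketch below shows the textbook relationship SE = SD / sqrt(n) per group in base R; the abundance values and group labels are invented for illustration, and the code is not taken from the package.

```r
# Hypothetical relative abundances of one taxon in two groups of four samples
abund <- c(2.1, 2.5, 1.9, 2.3, 4.0, 4.6, 3.8, 4.4)
group <- rep(c("CK", "Treat"), each = 4)

# Per-group mean, standard deviation and sample size
means <- tapply(abund, group, mean)
sds   <- tapply(abund, group, sd)
n     <- tapply(abund, group, length)

# Standard error of the mean: SD divided by the square root of n
ses <- sds / sqrt(n)

summary_table <- data.frame(
  group = names(means),
  mean  = as.numeric(means),
  SD    = as.numeric(sds),
  SE    = as.numeric(ses)
)
print(summary_table)
```

Because SE shrinks with the sample size while SD does not, SE error bars are always the narrower of the two for n > 1.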
plot_lefse_bar()
Bar plot for LDA score.
trans_diff$plot_lefse_bar(
use_number = 1:10,
color_values = RColorBrewer::brewer.pal(8, "Dark2"),
LDA_score = NULL,
simplify_names = TRUE,
keep_prefix = TRUE,
group_order = NULL,
axis_text_y = 12,
plot_vertical = TRUE,
...
)
use_number: default 1:10; vector; the taxa numbers used in the plot, i.e. 1:n.
color_values: default RColorBrewer::brewer.pal(8, "Dark2"); colors used in the plot.
LDA_score: default NULL; numeric value used as the LDA score threshold, such as 2; the selection is also limited by use_number.
simplify_names: default TRUE; whether to use the simplified taxonomic name.
keep_prefix: default TRUE; whether to retain the taxonomic prefix.
group_order: default NULL; a vector used to order the legend in the plot.
axis_text_y: default 12; the size of the y axis text.
plot_vertical: default TRUE; whether to use a vertical bar plot or a horizontal one.
...: parameters passed to geom_bar.
Returns: ggplot.
\donttest{
t1$plot_lefse_bar(LDA_score = 4)
}
plot_lefse_cladogram()
Plot the cladogram for the LEfSe result, similar to the Python version. The code is modified from microbiomeMarker.
trans_diff$plot_lefse_cladogram(
color = RColorBrewer::brewer.pal(8, "Dark2"),
use_taxa_num = 200,
filter_taxa = NULL,
use_feature_num = NULL,
group_order = NULL,
clade_label_level = 4,
select_show_labels = NULL,
only_select_show = FALSE,
sep = "|",
branch_size = 0.2,
alpha = 0.2,
clade_label_size = 2,
clade_label_size_add = 5,
clade_label_size_log = exp(1),
node_size_scale = 1,
node_size_offset = 1,
annotation_shape = 22,
annotation_shape_size = 5
)
color: default RColorBrewer::brewer.pal(8, "Dark2"); colors used in the plot.
use_taxa_num: default 200; integer; the number of taxa used in the background tree plot; taxa are selected according to the mean abundance.
filter_taxa: default NULL; the mean relative abundance threshold used to filter out taxa with low abundance.
use_feature_num: default NULL; integer; the number of features used in the plot; features are selected according to the LDA score.
group_order: default NULL; a vector used to order the legend in the plot.
clade_label_level: default 4; the taxonomic level for marking the label with letters; root is the largest.
select_show_labels: default NULL; character vector; the features to show in the plot with full label names instead of letters.
only_select_show: default FALSE; whether to use only the features selected in the parameter select_show_labels.
sep: default "|"; the separator character in the taxonomic information.
branch_size: default 0.2; numeric; the size of the branches.
alpha: default 0.2; shading of the color.
clade_label_size: default 2; basic size of the clade label; see also clade_label_size_add and clade_label_size_log.
clade_label_size_add: default 5; added basic size of the clade label; see the formula in the clade_label_size_log parameter.
clade_label_size_log: default exp(1); the base of the log function for the added size of the clade label. The size formula is clade_label_size + log(clade_label_level + clade_label_size_add, base = clade_label_size_log), so clade_label_size_log, clade_label_size_add and clade_label_size together fully control the label size for different taxonomic levels.
node_size_scale: default 1; scale of the node size.
node_size_offset: default 1; offset of the node size.
annotation_shape: default 22; shape used in the annotation legend.
annotation_shape_size: default 5; size used in the annotation legend.
Returns: ggplot.
\donttest{
t1$plot_lefse_cladogram(use_taxa_num = 100, use_feature_num = 30, select_show_labels = NULL)
}
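The clade label size formula given for clade_label_size_log can be evaluated directly to see how the three size parameters interact. The sketch below simply transcribes that formula into a standalone function; the function name label_size and the range of level values 1:7 are assumptions made for illustration, and how the package maps taxonomic ranks to level values is not shown here.

```r
# label_size is a hypothetical helper that transcribes the documented formula:
# clade_label_size + log(level + clade_label_size_add, base = clade_label_size_log)
label_size <- function(level,
                       clade_label_size = 2,
                       clade_label_size_add = 5,
                       clade_label_size_log = exp(1)) {
  clade_label_size + log(level + clade_label_size_add, base = clade_label_size_log)
}

# Assumed illustrative range of level values
level_values <- 1:7

# Sizes with the documented defaults
sizes_default <- label_size(level_values)

# A larger log base flattens the differences between levels
sizes_base10 <- label_size(level_values, clade_label_size_log = 10)

print(data.frame(level = level_values,
                 default = round(sizes_default, 3),
                 base10 = round(sizes_base10, 3)))
```

Raising clade_label_size shifts every label by the same amount, while clade_label_size_add and clade_label_size_log change how quickly the size grows with the level value.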
plot_metastat()
Bar plot for metastat.
trans_diff$plot_metastat(
use_number = 1:10,
color_values = RColorBrewer::brewer.pal(8, "Dark2"),
qvalue = 0.05,
choose_group = 1
)
use_number: default 1:10; vector; the taxa numbers used in the plot, i.e. 1:n.
color_values: default RColorBrewer::brewer.pal(8, "Dark2"); colors used in the plot.
qvalue: default 0.05; numeric value used as the q value threshold.
choose_group: default 1; which column in res_metastat_group_matrix will be used.
Returns: ggplot.
\donttest{
t1 <- trans_diff$new(dataset = dataset, method = "metastat", group = "Group")
t1$plot_metastat(use_number = 1:10, qvalue = 0.05, choose_group = 1)
}
print()
Print the trans_diff object.
trans_diff$print()
clone()
The objects of this class are cloneable with this method.
trans_diff$clone(deep = FALSE)
deep: whether to make a deep clone.
\donttest{
## ------------------------------------------------
## Method `trans_diff$new`
## ------------------------------------------------
data(dataset)
t1 <- trans_diff$new(dataset = dataset, method = "lefse", group = "Group")

## ------------------------------------------------
## Method `trans_diff$plot_diff_abund`
## ------------------------------------------------
t1$plot_diff_abund(use_number = 1:10)

## ------------------------------------------------
## Method `trans_diff$plot_lefse_bar`
## ------------------------------------------------
t1$plot_lefse_bar(LDA_score = 4)

## ------------------------------------------------
## Method `trans_diff$plot_lefse_cladogram`
## ------------------------------------------------
t1$plot_lefse_cladogram(use_taxa_num = 100, use_feature_num = 30, select_show_labels = NULL)

## ------------------------------------------------
## Method `trans_diff$plot_metastat`
## ------------------------------------------------
t1 <- trans_diff$new(dataset = dataset, method = "metastat", group = "Group")
t1$plot_metastat(use_number = 1:10, qvalue = 0.05, choose_group = 1)
}