metacoder v0.3.3
Tools for Parsing, Manipulating, and Graphing Taxonomic Abundance Data
A set of tools for parsing, manipulating, and graphing data
classified by a hierarchy (e.g. a taxonomy).
An R package for metabarcoding research planning and analysis
Metacoder is an R package for reading, plotting, and manipulating large taxonomic data sets, such as those generated by modern high-throughput sequencing methods like metabarcoding (i.e. amplification metagenomics, 16S metagenomics, etc). It provides a tree-based visualization called "heat trees" that depicts statistics for every taxon in a taxonomy using color and size. It also provides functions for common tasks in microbiome bioinformatics on data in the taxmap format defined by the taxa package, such as:
- Summing read counts/abundance per taxon
- Converting counts to proportions and rarefaction of counts using vegan
- Comparing the abundance (or other characteristics) of groups of samples (e.g., experimental treatments) per taxon
- Combining data for groups of samples
- Simulated PCR, via EMBOSS primersearch, for testing primer specificity and coverage of taxonomic groups
- Converting common microbiome formats for data and reference databases into the objects defined by the taxa package
- Converting to and from the phyloseq format and the taxa format
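To make the first task in the list above concrete, here is a minimal base-R sketch of what summing read counts per taxon means for hierarchical data. It does not use metacoder and is not the package's implementation: the toy `counts` matrix, the `lineage` strings, and the helper `sum_per_taxon` are all hypothetical and exist only for illustration. In the package itself, this job belongs to `calc_taxon_abund`, which works on taxmap objects.

```r
# Toy OTU table: rows are OTUs, columns are samples (made-up counts).
counts <- matrix(c(10,  0,
                    5,  5,
                    0, 20),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("otu_1", "otu_2", "otu_3"),
                                 c("sample_a", "sample_b")))

# Semicolon-delimited classification of each OTU (made-up lineages).
lineage <- c(otu_1 = "Bacteria;Firmicutes;Bacilli",
             otu_2 = "Bacteria;Firmicutes;Clostridia",
             otu_3 = "Bacteria;Proteobacteria")

# Hypothetical helper: every taxon receives the counts of all OTUs
# classified within it, including OTUs assigned to its subtaxa.
sum_per_taxon <- function(counts, lineage, sep = ";") {
  # Put the lineages in the same order as the rows of the count matrix.
  lineage <- lineage[rownames(counts)]
  # Expand each lineage into all of its prefixes: "A;B" gives "A" and "A;B".
  prefixes <- lapply(strsplit(lineage, sep, fixed = TRUE), function(ranks) {
    vapply(seq_along(ranks),
           function(i) paste(ranks[seq_len(i)], collapse = sep),
           character(1))
  })
  # Repeat each OTU's row once for every taxon that contains it.
  otu_index <- rep(seq_along(prefixes), lengths(prefixes))
  # rowsum() adds up the rows of a matrix that share a group label.
  rowsum(counts[otu_index, , drop = FALSE], group = unlist(prefixes))
}

taxon_abund <- sum_per_taxon(counts, lineage)
print(taxon_abund)
```

Each row of the result is one taxon, identified by its full lineage, so the "Bacteria" row holds the total of all three OTUs in each sample. Keying on the full lineage rather than the bare taxon name keeps identically named taxa in different branches separate.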
Installation
This project is available on CRAN and can be installed like so:
install.packages("metacoder")
You can also install the development version for the newest features, bugs, and bug fixes:
install.packages("devtools")
devtools::install_github("grunwaldlab/metacoder")
Documentation
All the documentation for metacoder can be found on our website:
https://grunwaldlab.github.io/metacoder_documentation/
Dependencies
The function that simulates PCR requires primersearch from the EMBOSS tool kit to be installed. This is not an R package, so it is not automatically installed. Type ?primersearch after installing and loading metacoder for installation instructions.
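Because primersearch is an external program, it can help to confirm that R can find it before running a simulated PCR. The following is a minimal base-R sketch that assumes the executable is named `primersearch` and is on the system PATH; the variable names are illustrative only. The function index further down also lists `primersearch_is_installed`, which may be the more convenient check once metacoder is loaded.

```r
# Look for the EMBOSS primersearch executable on the system PATH.
# Sys.which() typically returns "" when a program cannot be found.
primersearch_path <- unname(Sys.which("primersearch"))
has_primersearch <- !is.na(primersearch_path) && nzchar(primersearch_path)

if (has_primersearch) {
  message("primersearch found at: ", primersearch_path)
} else {
  message("primersearch not found; install EMBOSS before simulating PCR.")
}
```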
Relationship with other packages
Many of these operations can be done using other packages like phyloseq, which also provides tools for diversity analysis. The main strength of metacoder is that its functions use the flexible data types defined by taxa, which has powerful parsing and subsetting abilities that take into account the hierarchical relationship between taxa and user-defined data. In general, metacoder and taxa are more of an abstracted tool kit, whereas phyloseq has more specialized functions for community diversity data, but they both can do similar things. I encourage you to try both to see which fits your needs and style best. You can also combine the two in a single analysis by converting between the two data types when needed.
Citation
If you use metacoder in a publication, please cite our article in PLOS Computational Biology:
Foster ZSL, Sharpton TJ, Grünwald NJ (2017) Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. PLOS Computational Biology 13(2): e1005404. https://doi.org/10.1371/journal.pcbi.1005404
Future development
Metacoder is under active development and many new features are planned. Some improvements that are being explored include:
- Barcoding gap analysis and associated plotting functions
- A function to aid in retrieving appropriate sequence data from NCBI for in silico PCR from whole genome sequences
- Graphing of different node shapes in heat trees, possibly including pie graphs or PhyloPics.
- Adding the ability to plot specific edge lengths in the heat trees so they can be used for phylogenetic trees.
- Adding more data import and export functions to make parsing and writing common formats easier.
To see the details of what is being worked on, check out the issues tab of the Metacoder Github site.
License
This work is subject to the MIT License.
Acknowledgements
Metacoder's major dependencies are taxa, taxize, vegan, igraph, dplyr, and ggplot2.
This package includes code from the R package ggrepel to handle label overlap avoidance, with permission from the author of ggrepel, Kamil Slowikowski. We included the code instead of depending on ggrepel because we are using functions internal to ggrepel that might change in the future. We thank Kamil Slowikowski for letting us use his code and would like to acknowledge his implementation of the label overlap avoidance used in metacoder.
Feedback and contributions
We would like to hear about users' thoughts on the package and any errors they run into. Please report errors, questions or suggestions on the issues tab of the Metacoder Github site. We also welcome contributions via a Github pull request. You can also talk with us using our Google groups site.
Functions in metacoder
Name | Description | |
calc_group_stat | Apply a function to groups of columns | |
DNAbin_to_char | Converts DNAbin to a named character vector | |
add_alpha | add_alpha | |
calc_n_samples | Count the number of samples | |
ambiguous_patterns | Get patterns for ambiguous taxa | |
apply_color_scale | Convert numbers to colors | |
calc_taxon_abund | Sum observation values for each taxon | |
.onAttach | Run when package loads | |
can_be_num | Test if characters can be converted to numbers | |
ends_with | dplyr select_helpers | |
edge_list_depth | Get distance from root of edgelist observations | |
get_taxmap_data | Get a data set from a taxmap object | |
get_taxmap_table | Get a table from a taxmap object | |
get_taxmap_other_cols | Parse the other_cols option | |
map_unique | Run a function on unique values of an iterable | |
get_taxmap_cols | Get a column subset | |
calc_group_median | Calculate medians of groups of columns | |
calc_group_rsd | Relative standard deviations of groups of columns | |
everything | dplyr select_helpers | |
compare_groups | Compare groups of samples | |
check_option_groups | Check option: groups | |
get_class_from_el | Get classification for taxa in edge list | |
diverging_palette | The default diverging color palette | |
hmp_otus | A HMP subset | |
get_optimal_range | Find optimal range | |
line_coords | Makes coordinates for a line | |
get_numerics | Return numeric values in a character | |
hmp_samples | Sample information for HMP subset | |
do_calc_on_num_cols | Run some function to produce new columns. | |
rarefy_obs | Calculate rarefied observation counts | |
write_silva_fasta | Write an imitation of the SILVA FASTA database | |
parse_edge_list | Convert a table with an edge list to taxmap | |
quantative_palette | The default quantitative color palette | |
matches | dplyr select_helpers | |
parse_greengenes | Parse Greengenes release | |
check_for_pkg | check for packages | |
check_element_length | Check length of graph attributes | |
look_for_na | Look for NAs in parameters | |
get_edge_children | get_edge_children | |
write_unite_general | Write an imitation of the UNITE general FASTA database | |
heat_tree_matrix | Plot a matrix of heat trees | |
heat_tree | Plot a taxonomic tree | |
metacoder | Metacoder | |
layout_functions | Layout functions | |
limited_print | Print a subset of a character vector | |
complement | Find complement of sequences | |
molten_dist | Get all distances between points | |
get_edge_parents | get_edge_parents | |
inter_circle_gap | Finds the gap/overlap of circle coordinates | |
get_expected_data | Get a data set in as_phyloseq | |
inverse | Generate the inverse of a function | |
contains | dplyr select_helpers | |
parse_rdp | Parse RDP FASTA release | |
parse_dada2 | Convert the output of dada2 to a taxmap object | |
make_dada2_tax_table | Make an imitation of the dada2 taxonomy matrix | |
one_of | dplyr select_helpers | |
parse_summary_seqs | Parse summary.seqs output | |
make_dada2_asv_table | Make an imitation of the dada2 ASV abundance matrix | |
parse_qiime_biom | Parse a BIOM output from QIIME | |
fasta_headers | Get line numbers of FASTA headers | |
parse_silva_fasta | Parse SILVA FASTA release | |
parse_seq_input | Read sequences in an unknown format | |
parse_ubiome | Converts the uBiome file format to taxmap | |
run_primersearch | Execute EMBOSS Primersearch | |
parse_mothur_taxonomy | Parse mothur Classify.seqs *.taxonomy output | |
polygon_coords | Makes coordinates for a regular polygon | |
parse_mothur_tax_summary | Parse mothur *.tax.summary Classify.seqs output | |
repo_url | Return github url | |
parse_unite_general | Parse UNITE general release FASTA | |
filter_ambiguous_taxa | Filter ambiguous taxon names | |
as_phyloseq | Convert taxmap to phyloseq | |
calc_obs_props | Calculate proportions from observation counts | |
calc_group_mean | Calculate means of groups of columns | |
starts_with | dplyr select_helpers | |
convert_base | Converts decimal numbers to other bases | |
counts_to_presence | Apply a function to groups of columns | |
calc_prop_samples | Calculate the proportion of samples | |
scale_bar_coords | Make scale bar division | |
verify_label_count | Verify label count | |
select_labels | Pick labels to show | |
rescale | Rescale numeric vector to have specified minimum and maximum. | |
split_by_level | Splits a taxonomy at a specific level or rank | |
startup_msg | Return startup message | |
verify_trans | Verify transformation function parameters | |
zero_low_counts | Replace low counts with zero | |
write_greengenes | Write an imitation of the Greengenes database | |
get_node_children | get_node_children | |
get_numeric_cols | Get numeric columns from taxmap table | |
get_taxonomy_levels | Get taxonomy levels | |
verify_size | Verify size parameters | |
label_bounds | Bounding box coords for labels | |
is_ambiguous | Find ambiguous taxon names | |
parse_newick | Parse a Newick file | |
my_print | Print something | |
parse_phylo | Parse a phylo object | |
ncbi_sequence | Downloads sequences from ids | |
%>% | magrittr forward-pipe operator | |
primersearch_raw | Use EMBOSS primersearch for in silico PCR | |
make_fasta_with_u_replaced | Make a temporary file with U's replaced with T | |
make_plot_legend | Make color/size legend | |
rev_comp | Reverse complement sequences | |
qualitative_palette | The default qualitative color palette | |
reverse | Reverse sequences | |
text_grob_length | Estimate text grob length | |
transform_data | Transformation functions | |
verify_size_range | Verify size range parameters | |
verify_taxmap | Check that an object is a taxmap | |
ncbi_taxon_sample | Download representative sequences for a taxon | |
num_range | dplyr select_helpers | |
parse_primersearch | Parse EMBOSS primersearch output | |
parse_phyloseq | Convert a phyloseq to taxmap | |
primersearch_is_installed | Test if primersearch is installed | |
primersearch | Use EMBOSS primersearch for in silico PCR | |
read_fasta | Read a FASTA file | |
read_lines_apply | Apply a function to chunks of a file | |
unique_mapping | get indexes of a unique set of the input | |
write_mothur_taxonomy | Write an imitation of the Mothur taxonomy file | |
verify_color_range | Verify color range parameters | |
write_rdp | Write an imitation of the RDP FASTA database | |
Vignettes of metacoder
Name | ||
introduction.Rmd | ||
Details
License | GPL-2 | GPL-3 |
LazyData | true |
URL | https://grunwaldlab.github.io/metacoder_documentation/ |
BugReports | https://github.com/grunwaldlab/metacoder/issues |
VignetteBuilder | knitr |
RoxygenNote | 6.1.1 |
Date | 2019-07-17 |
Encoding | UTF-8 |
biocViews | |
LinkingTo | Rcpp |
NeedsCompilation | yes |
Packaged | 2019-07-17 23:56:43 UTC; zachary |
Repository | CRAN |
Date/Publication | 2019-07-18 06:35:33 UTC |
imports | ape , biomformat , cowplot , crayon , dplyr , GA , ggfittext , ggplot2 , ggrepel , grDevices , grid , igraph , lazyeval , magrittr , phylotate , RColorBrewer , Rcpp , RCurl , readr , reshape , reshape2 , rlang , scales , seqinr , stats , stringr , svglite , taxize , tibble , traits , utils , vegan , viridisLite , zoo |
suggests | BiocManager , knitr , phyloseq , rmarkdown , testthat , zlibbioc |
depends | R (>= 3.0.2) , taxa |
Contributors | Niklaus J Grunwald, Rob Gilmore |