metacoder v0.3.3

Tools for Parsing, Manipulating, and Graphing Taxonomic Abundance Data

A set of tools for parsing, manipulating, and graphing data classified by a hierarchy (e.g. a taxonomy).

Readme

An R package for metabarcoding research planning and analysis

Metacoder is an R package for reading, plotting, and manipulating large taxonomic data sets, such as those generated by modern high-throughput sequencing methods like metabarcoding (i.e. amplification metagenomics, 16S metagenomics, etc.). It provides a tree-based visualization called a "heat tree" that depicts statistics for every taxon in a taxonomy using color and size. It also provides functions for common tasks in microbiome bioinformatics on data in the taxmap format defined by the taxa package, such as:

  • Summing read counts/abundance per taxon
  • Converting counts to proportions and rarefying counts using vegan
  • Comparing the abundance (or other characteristics) of groups of samples (e.g., experimental treatments) per taxon
  • Combining data for groups of samples
  • Simulated PCR, via EMBOSS primersearch, for testing primer specificity and coverage of taxonomic groups
  • Converting common microbiome formats for data and reference databases into the objects defined by the taxa package.
  • Converting to and from the phyloseq format and the taxa format
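The first few tasks in this list can be sketched with the example data sets hmp_otus and hmp_samples that ship with the package. This is a minimal sketch, not a definitive workflow: it assumes that parse_tax_data is available once metacoder is loaded, that hmp_otus has a "lineage" column with entries of the form "r__Root;p__Proteobacteria;...", and that hmp_samples has "sample_id" and "body_site" columns. The table names otu_props, tax_abund, and diff_table are chosen here for illustration.

```r
library(metacoder)

# Parse the bundled HMP example data into a taxmap object.
# Each lineage is split on ";" and each piece on "__" into rank and name.
obj <- parse_tax_data(hmp_otus,
                      class_cols = "lineage",
                      class_sep = ";",
                      class_regex = "^(.+)__(.+)$",
                      class_key = c(tax_rank = "taxon_rank",
                                    tax_name = "taxon_name"))

# Convert raw read counts to per-sample proportions.
obj$data$otu_props <- calc_obs_props(obj, "tax_data",
                                     cols = hmp_samples$sample_id)

# Sum the proportions of all OTUs that belong to each taxon.
obj$data$tax_abund <- calc_taxon_abund(obj, "otu_props",
                                       cols = hmp_samples$sample_id)

# Compare per-taxon abundance between groups of samples (body sites here).
obj$data$diff_table <- compare_groups(obj, "tax_abund",
                                      cols = hmp_samples$sample_id,
                                      groups = hmp_samples$body_site)

# Plot a heat tree: node size and color show the number of OTUs per taxon.
heat_tree(obj,
          node_label = taxon_names,
          node_size = n_obs,
          node_color = n_obs)
```

Argument names have changed between metacoder releases, so check ?calc_taxon_abund and ?compare_groups for the installed version if a call fails.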

Installation

This project is available on CRAN and can be installed like so:

install.packages("metacoder")

You can also install the development version for the newest features, bugs, and bug fixes:

install.packages("devtools")
devtools::install_github("grunwaldlab/metacoder")

Documentation

All the documentation for metacoder can be found on our website here:

https://grunwaldlab.github.io/metacoder_documentation/

Dependencies

The function that simulates PCR requires primersearch from the EMBOSS tool kit to be installed. This is not an R package, so it is not automatically installed. Type ?primersearch after installing and loading metacoder for installation instructions.
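Before calling the PCR simulation functions, it can help to confirm that R can find the executable. The check below is a small sketch using only base R; it assumes the EMBOSS executable is named "primersearch" and is on the system PATH, and the variable names are illustrative.

```r
# Look for the primersearch executable on the system PATH.
# Sys.which() returns an empty string when the program is not found.
primersearch_path <- Sys.which("primersearch")
has_primersearch <- unname(nzchar(primersearch_path))

if (has_primersearch) {
  message("Found primersearch at: ", primersearch_path)
} else {
  message("primersearch not found; install EMBOSS before simulating PCR.")
}
```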

Relationship with other packages

Many of these operations can be done using other packages like phyloseq, which also provides tools for diversity analysis. The main strength of metacoder is that its functions use the flexible data types defined by taxa, which has powerful parsing and subsetting abilities that take into account the hierarchical relationship between taxa and user-defined data. In general, metacoder and taxa are more of an abstracted tool kit, whereas phyloseq has more specialized functions for community diversity data, but they both can do similar things. I encourage you to try both to see which fits your needs and style best. You can also combine the two in a single analysis by converting between the two data types when needed.
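Converting between the two data types might look like the sketch below. It assumes the phyloseq package (distributed through Bioconductor) is installed, that hmp_otus has "otu_id" and "lineage" columns, and that hmp_samples has a "sample_id" column; the object names obj, ps, and obj_again are illustrative.

```r
library(metacoder)

# Build a taxmap object from the bundled HMP example data.
obj <- parse_tax_data(hmp_otus,
                      class_cols = "lineage",
                      class_sep = ";",
                      class_regex = "^(.+)__(.+)$",
                      class_key = c(tax_rank = "taxon_rank",
                                    tax_name = "taxon_name"))

# taxmap -> phyloseq: name the table holding OTU counts and supply sample data.
ps <- as_phyloseq(obj,
                  otu_table = "tax_data",
                  otu_id_col = "otu_id",
                  sample_data = hmp_samples,
                  sample_id_col = "sample_id")

# phyloseq -> taxmap: convert back for use with metacoder and taxa functions.
obj_again <- parse_phyloseq(ps)
```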

Citation

If you use metacoder in a publication, please cite our article in PLOS Computational Biology:

Foster ZSL, Sharpton TJ, Grünwald NJ (2017) Metacoder: An R package for visualization and manipulation of community taxonomic diversity data. PLOS Computational Biology 13(2): e1005404. https://doi.org/10.1371/journal.pcbi.1005404

Future development

Metacoder is under active development and many new features are planned. Some improvements that are being explored include:

  • Barcoding gap analysis and associated plotting functions
  • A function to aid in retrieving appropriate sequence data from NCBI for in silico PCR from whole genome sequences
  • Graphing of different node shapes in heat trees, possibly including pie graphs or PhyloPics.
  • Adding the ability to plot specific edge lengths in the heat trees so they can be used for phylogenetic trees.
  • Adding more data import and export functions to make parsing and writing common formats easier.

To see the details of what is being worked on, check out the issues tab of the Metacoder Github site.

License

This work is subject to the MIT License.

Acknowledgements

Metacoder's major dependencies are taxa, taxize, vegan, igraph, dplyr, and ggplot2.

This package includes code from the R package ggrepel to handle label overlap avoidance, used with permission from the author of ggrepel, Kamil Slowikowski. We included the code instead of depending on ggrepel because we use functions internal to ggrepel that might change in the future. We thank Kamil Slowikowski for letting us use his code and would like to acknowledge his implementation of the label overlap avoidance used in metacoder.

Feedback and contributions

We would like to hear users' thoughts on the package and about any errors they run into. Please report errors, questions, or suggestions on the issues tab of the Metacoder GitHub site. We also welcome contributions via a GitHub pull request. You can also talk with us using our Google groups site.

Functions in metacoder

Name Description
calc_group_stat Apply a function to groups of columns
DNAbin_to_char Converts DNAbin to a named character vector
add_alpha add_alpha
calc_n_samples Count the number of samples
ambiguous_patterns Get patterns for ambiguous taxa
apply_color_scale Convert numbers to colors
calc_taxon_abund Sum observation values for each taxon
.onAttach Run when package loads
can_be_num Test if characters can be converted to numbers
ends_with dplyr select_helpers
edge_list_depth Get distance from root of edgelist observations
get_taxmap_data Get a data set from a taxmap object
get_taxmap_table Get a table from a taxmap object
get_taxmap_other_cols Parse the other_cols option
map_unique Run a function on unique values of an iterable
get_taxmap_cols Get a column subset
calc_group_median Calculate medians of groups of columns
calc_group_rsd Relative standard deviations of groups of columns
everything dplyr select_helpers
compare_groups Compare groups of samples
check_option_groups Check option: groups
get_class_from_el Get classification for taxa in edge list
diverging_palette The default diverging color palette
hmp_otus A HMP subset
get_optimal_range Find optimal range
line_coords Makes coordinates for a line
get_numerics Return numeric values in a character
hmp_samples Sample information for HMP subset
do_calc_on_num_cols Run some function to produce new columns.
rarefy_obs Calculate rarefied observation counts
write_silva_fasta Write an imitation of the SILVA FASTA database
parse_edge_list Convert a table with an edge list to taxmap
quantative_palette The default quantitative color palette
matches dplyr select_helpers
parse_greengenes Parse Greengenes release
check_for_pkg check for packages
check_element_length Check length of graph attributes
look_for_na Look for NAs in parameters
get_edge_children get_edge_children
write_unite_general Write an imitation of the UNITE general FASTA database
heat_tree_matrix Plot a matrix of heat trees
heat_tree Plot a taxonomic tree
metacoder Metacoder
layout_functions Layout functions
limited_print Print a subset of a character vector
complement Find complement of sequences
molten_dist Get all distances between points
get_edge_parents get_edge_parents
inter_circle_gap Finds the gap/overlap of circle coordinates
get_expected_data Get a data set in as_phyloseq
inverse Generate the inverse of a function
contains dplyr select_helpers
parse_rdp Parse RDP FASTA release
parse_dada2 Convert the output of dada2 to a taxmap object
make_dada2_tax_table Make an imitation of the dada2 taxonomy matrix
one_of dplyr select_helpers
parse_summary_seqs Parse summary.seqs output
make_dada2_asv_table Make an imitation of the dada2 ASV abundance matrix
parse_qiime_biom Parse a BIOM output from QIIME
fasta_headers Get line numbers of FASTA headers
parse_silva_fasta Parse SILVA FASTA release
parse_seq_input Read sequences in an unknown format
parse_ubiome Converts the uBiome file format to taxmap
run_primersearch Execute EMBOSS Primersearch
parse_mothur_taxonomy Parse mothur Classify.seqs *.taxonomy output
polygon_coords Makes coordinates for a regular polygon
parse_mothur_tax_summary Parse mothur *.tax.summary Classify.seqs output
repo_url Return github url
parse_unite_general Parse UNITE general release FASTA
filter_ambiguous_taxa Filter ambiguous taxon names
as_phyloseq Convert taxmap to phyloseq
calc_obs_props Calculate proportions from observation counts
calc_group_mean Calculate means of groups of columns
starts_with dplyr select_helpers
convert_base Converts decimal numbers to other bases
counts_to_presence Apply a function to groups of columns
calc_prop_samples Calculate the proportion of samples
scale_bar_coords Make scale bar division
verify_label_count Verify label count
select_labels Pick labels to show
rescale Rescale numeric vector to have specified minimum and maximum.
split_by_level Splits a taxonomy at a specific level or rank
startup_msg Return startup message
verify_trans Verify transformation function parameters
zero_low_counts Replace low counts with zero
write_greengenes Write an imitation of the Greengenes database
get_node_children get_node_children
get_numeric_cols Get numeric columns from taxmap table
get_taxonomy_levels Get taxonomy levels
verify_size Verify size parameters
label_bounds Bounding box coords for labels
is_ambiguous Find ambiguous taxon names
parse_newick Parse a Newick file
my_print Print something
parse_phylo Parse a phylo object
ncbi_sequence Downloads sequences from ids
%>% magrittr forward-pipe operator
primersearch_raw Use EMBOSS primersearch for in silico PCR
make_fasta_with_u_replaced Make a temporary file with U's replaced with T
make_plot_legend Make color/size legend
rev_comp Reverse complement sequences
qualitative_palette The default qualitative color palette
reverse Reverse sequences
text_grob_length Estimate text grob length
transform_data Transformation functions
verify_size_range Verify size range parameters
verify_taxmap Check that an object is a taxmap
ncbi_taxon_sample Download representative sequences for a taxon
num_range dplyr select_helpers
parse_primersearch Parse EMBOSS primersearch output
parse_phyloseq Convert a phyloseq to taxmap
primersearch_is_installed Test if primersearch is installed
primersearch Use EMBOSS primersearch for in silico PCR
read_fasta Read a FASTA file
read_lines_apply Apply a function to chunks of a file
unique_mapping get indexes of a unique set of the input
write_mothur_taxonomy Write an imitation of the Mothur taxonomy file
verify_color_range Verify color range parameters
write_rdp Write an imitation of the RDP FASTA database

Vignettes of metacoder

Name
introduction.Rmd

Details

License GPL-2 | GPL-3
LazyData true
URL https://grunwaldlab.github.io/metacoder_documentation/
BugReports https://github.com/grunwaldlab/metacoder/issues
VignetteBuilder knitr
RoxygenNote 6.1.1
Date 2019-07-17
Encoding UTF-8
biocViews
LinkingTo Rcpp
NeedsCompilation yes
Packaged 2019-07-17 23:56:43 UTC; zachary
Repository CRAN
Date/Publication 2019-07-18 06:35:33 UTC
