Top_taxa: Calculate top taxa and others

Description

Extracting the top taxa is a common step in data analysis. This function performs the calculation in a single call, which simplifies your R script.

Usage

Top_taxa(input, n, inputformat, outformat)

Value

Data frame with top n taxa

Arguments

input

Data frame of OTU/taxa/gene reads or relative abundance (recommended); see inputformat for the accepted layouts

n

Number of top taxa to retain, ranked by relative abundance

inputformat

1: data frame with OTUID in the first column and taxonomy in the last column

2: data frame with OTUID/taxonomy in the first column (recommended)

3: all-numeric data frame with OTUID/taxonomy as row names

outformat

1: return a data frame in the same format as the input

2: return an all-numeric data frame with OTU/gene/taxa IDs as row names (not available for inputformat 1)
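
The three layouts described under inputformat can be pictured with toy data. The sketch below is illustrative only: the sample names, OTU IDs, and taxonomy strings are made up, and it does not call Top_taxa itself.

```r
# Toy data only: three samples and three taxa; every name here is hypothetical.
counts <- data.frame(
  S1 = c(10, 5, 1),
  S2 = c(8, 6, 2),
  S3 = c(12, 4, 0)
)
ids <- c("OTU_1", "OTU_2", "OTU_3")
tax <- c("Bacteria;Proteobacteria", "Bacteria;Firmicutes", "Bacteria;Acidobacteria")

# Layout 1: ID column first, taxonomy column last
fmt1 <- data.frame(OTU.ID = ids, counts, taxonomy = tax)

# Layout 2: ID (or taxonomy) column first, numeric columns after it
fmt2 <- data.frame(OTU.ID = ids, counts)

# Layout 3: all numeric, with IDs stored as row names
fmt3 <- data.frame(counts, row.names = ids)

str(fmt1)
str(fmt2)
str(fmt3)
```

Layouts 2 and 3 carry the same information; they differ only in whether the identifier is a column or the row names.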

Author

Wang Ningqi <2434066068@qq.com>

Examples

### Data preparation ####
data(testotu)
library(tidyr); library(magrittr)  ## magrittr supplies the %>% pipe; loading "dplyr" also works

testotu.pct <- data.frame(
  OTU.ID = testotu[, 1],
  sweep(testotu[, -c(1, 22)], 2, colSums(testotu[, -c(1, 22)]), "/"),
  taxonomy = testotu[, 22]
)

sep_testotu <- Filter_function(
  input = testotu,
  threshold = 0.0001,
  format = 1
) %>%
  separate(
    ., col = taxonomy,
    into = c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species"),
    sep = ";"
  )

phylum <- aggregate(
  sep_testotu[, 2:21], by = list(sep_testotu$Phylum), FUN = sum
)

phylum1 <- data.frame(row.names = phylum[, 1], phylum[, -1])

##### Input format 1, top 100 OTU #####
top100otu <- Top_taxa(
  input = testotu.pct,
  n = 100,
  inputformat = 1,
  outformat = 1
)

##### Input format 2, top 15 phylum #####
head(phylum)
top15phylum <- Top_taxa(
  input = phylum,
  n = 15,
  inputformat = 2,
  outformat = 1
)

##### Input format 3, top 15 phylum #####
head(phylum1)
top15phylum <- Top_taxa(
  input = phylum1,
  n = 15,
  inputformat = 3,
  outformat = 1
)
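
To make the idea of "top taxa and others" concrete, here is a minimal base R sketch of one way such a summary could be computed. It is not the package's implementation: the helper name `top_n_with_others`, the ranking by row sums, and the "Others" row label are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of the package: keeps the n most abundant
# taxa and pools everything else into a single "Others" row.
top_n_with_others <- function(abund, n) {
  # abund: all-numeric data frame with taxa as row names (layout 3)
  # Assumption: taxa are ranked by their total abundance across samples
  ranked <- abund[order(rowSums(abund), decreasing = TRUE), , drop = FALSE]
  keep <- seq_len(nrow(ranked)) <= n
  top <- ranked[keep, , drop = FALSE]
  rest <- ranked[!keep, , drop = FALSE]
  if (nrow(rest) > 0) {
    # Sum the remaining taxa column by column into one row
    others <- as.data.frame(t(colSums(rest)))
    rownames(others) <- "Others"
    top <- rbind(top, others)
  }
  top
}

# Toy relative abundances; each sample (column) sums to 1
toy <- data.frame(
  S1 = c(0.50, 0.30, 0.15, 0.05),
  S2 = c(0.40, 0.40, 0.10, 0.10),
  row.names = c("A", "B", "C", "D")
)

result <- top_n_with_others(toy, 2)
print(result)
```

Because the pooled row absorbs every taxon outside the top n, the column totals of the result equal those of the input, which is convenient for stacked bar plots of relative abundance.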
