zero_low_counts: Replace low counts with zero

Description

For a given table in a taxmap object, convert all counts below a minimum number to zero. This is useful for effectively removing "singletons", "doubletons", or other low abundance counts.

Usage

zero_low_counts(
  obj,
  data,
  min_count = 2,
  use_total = FALSE,
  cols = NULL,
  other_cols = FALSE,
  out_names = NULL,
  dataset = NULL
)

Value

A tibble

Arguments

obj

A taxmap object

data

The name of a table in obj$data.

min_count

The minimum number of counts needed for a count to remain unchanged. Any could less than this will be converted to a zero. For example, min_count = 2 would remove singletons.

use_total

If TRUE, the min_count applies to the total count for each row (e.g. OTU counts for all samples), rather than each cell in the table. For example use_total = TRUE, min_count = 10 would convert all counts of any row to zero if the total for all counts in that row was less than 10.

cols

The columns in data to use. By default, all numeric columns are used. Takes one of the following inputs:

TRUE/FALSE:: All/No columns will used.

Character vector:

The names of columns to use

Numeric vector:

The indexes of columns to use

Vector of TRUE/FALSE of length equal to the number of columns:

Use the columns corresponding to TRUE values.

other_cols

Preserve in the output non-target columns present in the input data. New columns will always be on the end. The "taxon_id" column will be preserved in the front. Takes one of the following inputs:

NULL:: No columns will be added back, not even the taxon id column.

TRUE/FALSE:

All/None of the non-target columns will be preserved.

Character vector:

The names of columns to preserve

Numeric vector:

The indexes of columns to preserve

Vector of TRUE/FALSE of length equal to the number of columns:

Preserve the columns corresponding to TRUE values.

out_names

The names of count columns in the output. Must be the same length and order as cols (or unique(groups), if groups is used).

dataset

DEPRECIATED. use "data" instead.

Examples

Run this code

if (FALSE) {
# Parse data for examples
x = parse_tax_data(hmp_otus, class_cols = "lineage", class_sep = ";",
                   class_key = c(tax_rank = "taxon_rank", tax_name = "taxon_name"),
                   class_regex = "^(.+)__(.+)$")
                   
# Default use
zero_low_counts(x, "tax_data")

# Use only a subset of columns
zero_low_counts(x, "tax_data", cols = c("700035949", "700097855", "700100489"))
zero_low_counts(x, "tax_data", cols = 4:6)
zero_low_counts(x, "tax_data", cols = startsWith(colnames(x$data$tax_data), "70001"))

# Including all other columns in ouput
zero_low_counts(x, "tax_data", other_cols = TRUE)

# Inlcuding specific columns in output
zero_low_counts(x, "tax_data", cols = c("700035949", "700097855", "700100489"),
                other_cols = 2:3)
               
# Rename output columns
zero_low_counts(x, "tax_data", cols = c("700035949", "700097855", "700100489"),
                out_names = c("a", "b", "c"))

}

Run the code above in your browser using DataLab

Description

Usage

Value

Arguments

See Also

Examples