filter_taxa: Filter taxa with a list of conditions

Description

Filter taxa in a taxonomy() or taxmap() object with a series of conditions. Any variable name that appears in all_names() can be used as if it were a vector on its own. This function was inspired by dplyr::filter(); see its documentation for more information. Calling the function in the obj$filter_taxa(...) style edits "obj" in place, unlike most R functions. Calling it in the filter_taxa(obj, ...) style instead imitates R's traditional copy-on-modify semantics: "obj" is left unchanged and a modified copy is returned.
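The difference between the two calling styles can be sketched in base R. This is only an illustration under stated assumptions, not the package's code: a plain environment stands in for the object, and make_obj(), filter_in_place() and filter_copy() are hypothetical helpers invented for the example.

```r
# Hypothetical stand-in for a taxonomy object: an environment has reference
# semantics, so changing it inside a function changes the caller's object.
make_obj <- function(ids) {
  obj <- new.env()
  obj$ids <- ids
  obj
}

# In-place style: mutates the object it is given, like obj$filter_taxa(...).
filter_in_place <- function(obj, keep) {
  obj$ids <- obj$ids[obj$ids %in% keep]
  invisible(obj)
}

# Functional style: copy first, then mutate the copy, like filter_taxa(obj, ...).
filter_copy <- function(obj, keep) {
  copy <- make_obj(obj$ids)
  filter_in_place(copy, keep)
}

obj <- make_obj(c("a", "b", "c", "d"))

result <- filter_copy(obj, c("b", "c"))
obj$ids      # unchanged: all four IDs
result$ids   # only "b" and "c"

filter_in_place(obj, c("b", "c"))
obj$ids      # now changed: only "b" and "c"
```

The practical consequence is that the result of the filter_taxa(obj, ...) style must be assigned to a variable, while the obj$filter_taxa(...) style should be used only when the original object is no longer needed.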

Usage

filter_taxa(obj, ..., subtaxa = FALSE, supertaxa = FALSE,
  drop_obs = TRUE, reassign_obs = TRUE, reassign_taxa = TRUE,
  invert = FALSE, keep_order = TRUE)
obj$filter_taxa(..., subtaxa = FALSE, supertaxa = FALSE,
  drop_obs = TRUE, reassign_obs = TRUE, reassign_taxa = TRUE,
  invert = FALSE, keep_order = TRUE)

Arguments

obj

An object of class taxonomy() or taxmap()

...

One or more filtering conditions. Any variable name that appears in all_names() can be used as if it were a vector on its own. Each filtering condition must resolve to one of the following:

character: One or more taxon IDs contained in obj$edge_list$to
integer: One or more row indexes of obj$edge_list
logical: A TRUE/FALSE vector of length equal to the number of rows in obj$edge_list
NULL: ignored
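As a rough illustration of how the accepted condition types can describe the same selection, the base-R sketch below resolves each type to row indexes of a toy edge list. The data and the resolve_condition() helper are hypothetical and are not the package's implementation.

```r
# Toy edge list standing in for obj$edge_list (hypothetical data).
edge_list <- data.frame(
  from = c(NA, "a", "a", "b"),
  to   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: convert one condition into row indexes of edge_list.
resolve_condition <- function(cond, edge_list) {
  if (is.null(cond)) {
    return(NULL)                          # NULL conditions are skipped
  } else if (is.character(cond)) {
    return(match(cond, edge_list$to))     # taxon IDs -> row indexes
  } else if (is.logical(cond)) {
    stopifnot(length(cond) == nrow(edge_list))
    return(which(cond))                   # TRUE/FALSE vector -> row indexes
  } else if (is.numeric(cond)) {
    return(as.integer(cond))              # already row indexes
  }
  stop("Unsupported condition type: ", class(cond)[1])
}

# Three ways of selecting the taxa "b" and "c" (rows 2 and 3).
by_id      <- resolve_condition(c("b", "c"), edge_list)
by_index   <- resolve_condition(2:3, edge_list)
by_logical <- resolve_condition(edge_list$from %in% "a", edge_list)

identical(by_id, by_index) && identical(by_index, by_logical)
```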

subtaxa

(logical or numeric of length 1) If TRUE, include subtaxa of taxa passing the filter. Positive numbers indicate the number of ranks below the target taxa to return. 0 is equivalent to FALSE. Negative numbers are equivalent to TRUE.

supertaxa

(logical or numeric of length 1) If TRUE, include supertaxa of taxa passing the filter. Positive numbers indicate the number of ranks above the target taxa to return. 0 is equivalent to FALSE. Negative numbers are equivalent to TRUE.
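The logical and numeric forms of subtaxa and supertaxa can be read as a maximum number of ranks to traverse. The base-R sketch below shows that reading for supertaxa on a toy taxonomy; the data and the max_depth() and supertaxa_of() helpers are hypothetical and are not taken from the package.

```r
# Toy taxonomy (hypothetical data): a single lineage a > b > c > d,
# stored as a named vector mapping each taxon to its parent.
parent <- c(a = NA, b = "a", c = "b", d = "c")

# Hypothetical helper: normalise the option to a maximum number of ranks.
max_depth <- function(option) {
  if (is.logical(option)) {
    if (option) Inf else 0      # TRUE = no limit, FALSE = none
  } else if (option < 0) {
    Inf                         # negative numbers behave like TRUE
  } else {
    option                      # 0 behaves like FALSE
  }
}

# Hypothetical helper: supertaxa of one taxon, nearest first.
supertaxa_of <- function(id, parent, option) {
  depth <- max_depth(option)
  found <- character(0)
  p <- parent[[id]]
  while (!is.na(p) && length(found) < depth) {
    found <- c(found, p)
    p <- parent[[p]]
  }
  found
}

supertaxa_of("d", parent, TRUE)   # "c" "b" "a"
supertaxa_of("d", parent, 2)      # "c" "b"
supertaxa_of("d", parent, 0)      # character(0)
supertaxa_of("d", parent, -1)     # "c" "b" "a"
```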

drop_obs

(logical) This option only applies to taxmap() objects. If FALSE, include observations (i.e. user-defined data in obj$data) even if the taxon they are assigned to is filtered out. Observations assigned to removed taxa will be assigned to NA. This option can be a single TRUE/FALSE, meaning that all data sets are treated the same, or a logical vector with names corresponding to one or more data sets in obj$data. For example, c(abundance = FALSE, stats = TRUE) would include observations whose taxon was filtered out in obj$data$abundance, but not in obj$data$stats. See the reassign_obs option below for further complications.

reassign_obs

(logical) This option only applies to taxmap() objects. If TRUE, observations (i.e. user-defined data in obj$data) assigned to removed taxa will be reassigned to the closest supertaxon that passed the filter. If no supertaxon of such an observation passed the filter, the observation will be filtered out if drop_obs is TRUE. This option can be a single TRUE/FALSE, meaning that all data sets are treated the same, or a logical vector with names corresponding to one or more data sets in obj$data. For example, c(abundance = TRUE, stats = FALSE) would reassign observations in obj$data$abundance, but not in obj$data$stats.

reassign_taxa

(logical of length 1) If TRUE, subtaxa of removed taxa will be reassigned to the closest supertaxon that passed the filter. This is useful for removing intermediate levels of a taxonomy.
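The idea of reassigning subtaxa to the closest supertaxon that passed the filter can be sketched in base R. The toy data and the closest_kept_parent() helper are hypothetical and only illustrate the concept of removing intermediate levels; they are not the package's implementation.

```r
# Toy taxonomy (hypothetical data): a single lineage a > b > c > d,
# stored as a named vector mapping each taxon to its parent.
parent <- c(a = NA, b = "a", c = "b", d = "c")

# Taxa that passed the filter; the intermediate levels "b" and "c" are removed.
keep <- c("a", "d")

# Hypothetical helper: walk up the lineage until a kept taxon (or the root).
closest_kept_parent <- function(id, parent, keep) {
  p <- parent[[id]]
  while (!is.na(p) && !(p %in% keep)) {
    p <- parent[[p]]
  }
  p
}

# New parent of every kept taxon: "d" is attached directly to "a".
new_parent <- vapply(keep, closest_kept_parent, character(1),
                     parent = parent, keep = keep)
new_parent
```

Without such a reassignment, "d" would lose its connection to "a" once "b" and "c" are removed.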

invert

(logical of length 1) If TRUE, do NOT include the selection. This is different from just replacing a == with a != because this option negates the selection after taking into account the subtaxa and supertaxa options. This is useful for removing a taxon and all its subtaxa, for example.
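Why inverting differs from negating the condition can be seen on a toy taxonomy. The base-R sketch below uses hypothetical data and a hypothetical subtaxa_of() helper; it mimics the order of operations described above rather than the package's code.

```r
# Toy taxonomy (hypothetical data): "a" is the root, "b" and "c" are its
# children, and "d" is a child of "b".
parent <- c(a = NA, b = "a", c = "a", d = "b")
all_taxa <- names(parent)

# Hypothetical helper: all taxa at any depth below the given taxa.
subtaxa_of <- function(ids, parent) {
  found <- character(0)
  frontier <- ids
  while (length(frontier) > 0) {
    frontier <- names(parent)[parent %in% frontier]
    frontier <- setdiff(frontier, found)
    found <- c(found, frontier)
  }
  found
}

# Invert after expansion: select "b", add its subtaxa, then negate.
selected <- "b"
expanded <- union(selected, subtaxa_of(selected, parent))
inverted <- setdiff(all_taxa, expanded)
inverted   # "a" "c": "b" and its subtaxon "d" are both gone

# Negate the condition instead: everything except "b", then add subtaxa.
negated <- all_taxa[all_taxa != "b"]
negated_expanded <- union(negated, subtaxa_of(negated, parent))
setequal(negated_expanded, all_taxa)   # TRUE: "b" returns as a subtaxon of "a"
```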

keep_order

(logical of length 1) If TRUE, keep the relative order of taxa that are not filtered out. For example, the results of filter_taxa(ex_taxmap, 1:3) and filter_taxa(ex_taxmap, 3:1) would be the same. Does not affect dataset order, only taxon order. This is useful for maintaining order correspondence with a dataset that has one value per taxon.

Value

An object of type taxonomy() or taxmap()

Examples

# NOT RUN {
# Filter by index
filter_taxa(ex_taxmap, 1:3)

# Filter by taxon ID
filter_taxa(ex_taxmap, c("b", "c", "d"))

# Filter by TRUE/FALSE
filter_taxa(ex_taxmap, taxon_names == "Plantae", subtaxa = TRUE)
filter_taxa(ex_taxmap, n_obs > 3)
filter_taxa(ex_taxmap, ! taxon_ranks %in% c("species", "genus"))
filter_taxa(ex_taxmap, taxon_ranks == "genus", n_obs > 1)

# Filter by an observation characteristic
dangerous_taxa <- sapply(ex_taxmap$obs("info"),
                         function(i) any(ex_taxmap$data$info$dangerous[i]))
filter_taxa(ex_taxmap, dangerous_taxa)

# Include supertaxa
filter_taxa(ex_taxmap, 12, supertaxa = TRUE)
filter_taxa(ex_taxmap, 12, supertaxa = 2)

# Include subtaxa
filter_taxa(ex_taxmap, 1, subtaxa = TRUE)
filter_taxa(ex_taxmap, 1, subtaxa = 2)

# Don't remove rows in user-defined data corresponding to removed taxa
filter_taxa(ex_taxmap, 2, drop_obs = FALSE)
filter_taxa(ex_taxmap, 2, drop_obs = c(info = FALSE))

# Remove a taxon and its subtaxa
filter_taxa(ex_taxmap, taxon_names == "Mammalia",
            subtaxa = TRUE, invert = TRUE)

# }
