Filter taxa in a taxonomy() or taxmap() object with a series of
conditions. Any variable name that appears in all_names() can be used
as if it were a vector on its own. See dplyr::filter() for the inspiration
for this function and more information. Calling the function using the
obj$filter_taxa(...) style edits "obj" in place, unlike most R functions.
However, calling the function using the filter_taxa(obj, ...) style imitates
R's traditional copy-on-modify semantics, so "obj" is not changed; instead,
a changed version is returned, like most R functions.
filter_taxa(obj, ..., subtaxa = FALSE, supertaxa = FALSE, drop_obs = TRUE, reassign_obs = TRUE, reassign_taxa = TRUE, invert = FALSE, keep_order = TRUE)
obj$filter_taxa(..., subtaxa = FALSE, supertaxa = FALSE, drop_obs = TRUE, reassign_obs = TRUE, reassign_taxa = TRUE, invert = FALSE, keep_order = TRUE)
obj: An object of class taxonomy() or taxmap()
...: One or more filtering conditions. Any variable name that appears
in all_names() can be used as if it was a vector on its own. Each
filtering condition must resolve to one of four things:
character: One or more taxon IDs contained in obj$edge_list$to
integer: One or more row indexes of obj$edge_list
logical: A TRUE/FALSE vector of length equal to the number of rows in obj$edge_list
NULL: ignored
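The way each condition type maps onto rows of the edge list can be illustrated with a toy sketch in base R. The edge_list data frame and the resolve_condition() helper below are hypothetical and only mimic the documented behavior; they are not the package's implementation.

```r
# Toy edge list: each row links a taxon ID ("to") to its supertaxon ("from").
edge_list <- data.frame(
  from = c(NA, "a", "a", "b"),
  to   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn one filtering condition into row indexes.
resolve_condition <- function(cond, edge_list) {
  if (is.null(cond)) {
    return(NULL)                       # NULL conditions are ignored
  } else if (is.character(cond)) {
    return(match(cond, edge_list$to))  # taxon IDs -> row indexes
  } else if (is.logical(cond)) {
    stopifnot(length(cond) == nrow(edge_list))
    return(which(cond))                # TRUE/FALSE per row -> row indexes
  } else if (is.numeric(cond)) {
    return(as.integer(cond))           # already row indexes
  }
  stop("unsupported condition type")
}

# Three ways of selecting the same two taxa.
by_id    <- resolve_condition(c("b", "c"), edge_list)
by_index <- resolve_condition(2:3, edge_list)
by_flag  <- resolve_condition(edge_list$to %in% c("b", "c"), edge_list)
```

In this sketch all three calls select rows 2 and 3 of the toy edge list.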
subtaxa: (logical or numeric of length 1) If TRUE, include
subtaxa of taxa passing the filter. Positive numbers indicate the number of
ranks below the target taxa to return. 0 is equivalent to FALSE.
Negative numbers are equivalent to TRUE.
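The depth rules for including subtaxa can be illustrated with a toy sketch in base R. The edge list and the collect_subtaxa() helper are hypothetical and only mimic the documented rules (TRUE or a negative number means all levels, FALSE or 0 means none, a positive number n means n levels below the target); they are not the package's implementation.

```r
# Toy linear taxonomy: a > b > c > d
edge_list <- data.frame(
  from = c(NA, "a", "b", "c"),
  to   = c("a", "b", "c", "d"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: taxa passing the filter plus their subtaxa,
# down to the depth implied by the `subtaxa` value.
collect_subtaxa <- function(ids, edge_list, subtaxa = FALSE) {
  # TRUE or negative -> unlimited depth; FALSE -> 0; positive n -> n levels
  depth <- if (isTRUE(subtaxa) || subtaxa < 0) Inf else as.numeric(subtaxa)
  found <- ids
  frontier <- ids
  while (depth > 0 && length(frontier) > 0) {
    # Direct children of the current frontier
    frontier <- edge_list$to[edge_list$from %in% frontier]
    found <- union(found, frontier)
    depth <- depth - 1
  }
  found
}

all_levels <- collect_subtaxa("a", edge_list, subtaxa = TRUE)
one_level  <- collect_subtaxa("a", edge_list, subtaxa = 1)
no_levels  <- collect_subtaxa("a", edge_list, subtaxa = 0)
negative   <- collect_subtaxa("a", edge_list, subtaxa = -1)
```

Walking from child to parent instead of parent to child would give the matching sketch for the supertaxa option.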
supertaxa: (logical or numeric of length 1) If TRUE, include
supertaxa of taxa passing the filter. Positive numbers indicate the number
of ranks above the target taxa to return. 0 is equivalent to FALSE.
Negative numbers are equivalent to TRUE.
drop_obs: (logical) This option only applies to taxmap() objects.
If FALSE, include observations (i.e. user-defined data in obj$data)
even if the taxon they are assigned to is filtered out. Observations
assigned to removed taxa will be assigned to NA. This option can be
either simply TRUE/FALSE, meaning that all data sets will be treated
the same, or a logical vector can be supplied with names corresponding to one
or more data sets in obj$data. For example, c(abundance = FALSE, stats = TRUE)
would include observations whose taxon was filtered out in
obj$data$abundance, but not in obj$data$stats. See the reassign_obs
option below for further complications.
reassign_obs: (logical of length 1) This option only applies to
taxmap() objects. If TRUE, observations (i.e. user-defined data in
obj$data) assigned to removed taxa will be reassigned to the closest
supertaxon that passed the filter. If no supertaxon of such an
observation passed the filter, the observation will be filtered out if drop_obs
is TRUE. This option can be either simply TRUE/FALSE, meaning that
all data sets will be treated the same, or a logical vector can be supplied
with names corresponding to one or more data sets in obj$data. For example,
c(abundance = TRUE, stats = FALSE) would reassign observations in
obj$data$abundance, but not in obj$data$stats.
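How reassign_obs and drop_obs interact can be illustrated with a toy sketch in base R. The supertaxon lookup, the obs table, and the closest_kept() helper are hypothetical stand-ins for the real data structures, used only to show the documented logic.

```r
# Toy taxonomy: a > b > c, and a separate root d > e.
# Named by taxon ID; the value is the ID of the supertaxon (NA for roots).
supertaxon <- c(a = NA, b = "a", c = "b", d = NA, e = "d")

# Suppose only taxon "a" passed the filter.
kept <- "a"

# Toy observation table, one row per observation.
obs <- data.frame(
  taxon_id = c("c", "b", "e"),
  count    = c(5, 2, 7),
  stringsAsFactors = FALSE
)

# Hypothetical helper: climb the tree until a kept taxon is found,
# or return NA when no supertaxon passed the filter.
closest_kept <- function(id) {
  while (!is.na(id) && !(id %in% kept)) {
    id <- supertaxon[[id]]
  }
  id
}

# reassign_obs = TRUE: move each observation to its closest kept supertaxon.
obs$taxon_id <- vapply(obs$taxon_id, closest_kept, character(1),
                       USE.NAMES = FALSE)

# drop_obs = TRUE: observations with nowhere to go are removed.
obs_dropped <- obs[!is.na(obs$taxon_id), ]
```

Here the observations on "c" and "b" end up on "a", while the observation on "e" has no surviving supertaxon and is removed once drop_obs is applied.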
reassign_taxa: (logical of length 1) If TRUE, subtaxa of removed
taxa will be reassigned to the closest supertaxon that passed the filter.
This is useful for removing intermediate levels of a taxonomy.
invert: (logical of length 1) If TRUE, do NOT include the
selection. This is different from just replacing a == with a != because
this option negates the selection after taking into account the subtaxa
and supertaxa options. This is useful for removing a taxon and all its
subtaxa, for example.
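The difference between negating the condition and inverting the selection can be illustrated with a toy sketch in base R. The vectors and the expand_subtaxa() helper are hypothetical and only mimic the documented order of operations (expand to subtaxa first, then negate).

```r
# Toy taxonomy: Animalia > Mammalia > Felidae, plus a separate root Plantae.
taxon_names <- c("Animalia", "Mammalia", "Felidae", "Plantae")
supertaxon  <- c(NA, "Animalia", "Mammalia", NA)

# Hypothetical helper: grow a logical selection until it contains
# every subtaxon of every selected taxon.
expand_subtaxa <- function(selected) {
  repeat {
    grown <- selected | (supertaxon %in% taxon_names[selected])
    if (identical(grown, selected)) return(selected)
    selected <- grown
  }
}

# Negating the condition: "not Mammalia" still selects Animalia, and the
# subtaxa expansion then pulls Mammalia and Felidae back in.
negated_condition <- taxon_names[expand_subtaxa(taxon_names != "Mammalia")]

# Inverting the selection: expand Mammalia to its subtaxa, then negate.
inverted_selection <- taxon_names[!expand_subtaxa(taxon_names == "Mammalia")]
```

In this sketch negated_condition keeps all four taxa, while inverted_selection keeps only Animalia and Plantae.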
keep_order: (logical of length 1) If TRUE, keep relative order of
taxa not filtered out. For example, the result of filter_taxa(ex_taxmap, 1:3)
and filter_taxa(ex_taxmap, 3:1) would be the same. Does not affect
dataset order, only taxon order. This is useful for maintaining order
correspondence with a dataset that has one value per taxon.
Returns an object of type taxonomy() or taxmap()
Other taxmap manipulation functions: arrange_obs(), arrange_taxa(), filter_obs(), mutate_obs(), sample_frac_obs(), sample_frac_taxa(), sample_n_obs(), sample_n_taxa(), select_obs(), transmute_obs()
# NOT RUN {
# Filter by index
filter_taxa(ex_taxmap, 1:3)
# Filter by taxon ID
filter_taxa(ex_taxmap, c("b", "c", "d"))
# Filter by TRUE/FALSE
filter_taxa(ex_taxmap, taxon_names == "Plantae", subtaxa = TRUE)
filter_taxa(ex_taxmap, n_obs > 3)
filter_taxa(ex_taxmap, ! taxon_ranks %in% c("species", "genus"))
filter_taxa(ex_taxmap, taxon_ranks == "genus", n_obs > 1)
# Filter by an observation characteristic
dangerous_taxa <- sapply(ex_taxmap$obs("info"),
                         function(i) any(ex_taxmap$data$info$dangerous[i]))
filter_taxa(ex_taxmap, dangerous_taxa)
# Include supertaxa
filter_taxa(ex_taxmap, 12, supertaxa = TRUE)
filter_taxa(ex_taxmap, 12, supertaxa = 2)
# Include subtaxa
filter_taxa(ex_taxmap, 1, subtaxa = TRUE)
filter_taxa(ex_taxmap, 1, subtaxa = 2)
# Don't remove rows in user-defined data corresponding to removed taxa
filter_taxa(ex_taxmap, 2, drop_obs = FALSE)
filter_taxa(ex_taxmap, 2, drop_obs = c(info = FALSE))
# Remove a taxon and its subtaxa
filter_taxa(ex_taxmap, taxon_names == "Mammalia",
            subtaxa = TRUE, invert = TRUE)
# }