Filter out taxa with ambiguous names, such as "unknown" or "uncultured".
NOTE: some parameters of this function are passed to filter_taxa with the "invert" option set to TRUE. Works the same way as filter_taxa for the most part.
filter_ambiguous_taxa(obj, unknown = TRUE, uncultured = TRUE,
name_regex = ".", ignore_case = TRUE, subtaxa = FALSE,
drop_obs = TRUE, reassign_obs = TRUE, reassign_taxa = TRUE)
obj: A taxmap object.
unknown: If TRUE, remove taxa with names that suggest they are placeholders for unknown taxa (e.g. "unknown ...").
uncultured: If TRUE, remove taxa with names that suggest they are assigned to uncultured organisms (e.g. "uncultured ...").
name_regex: The regular expression that matches a valid character in a taxon name. For example, "[a-z]" would mean taxon names can only contain lower case letters.
ignore_case: If TRUE, do not consider the case of the text when determining a match.
subtaxa: (logical or numeric of length 1) If TRUE, include subtaxa of taxa passing the filter. Positive numbers indicate the number of ranks below the target taxa to return. 0 is equivalent to FALSE. Negative numbers are equivalent to TRUE.
drop_obs: (logical) This option only applies to taxmap() objects. If FALSE, include observations even if the taxon they are assigned to is filtered out. Observations assigned to removed taxa will be assigned to NA. This option can be either simply TRUE/FALSE, meaning that all data sets will be treated the same, or a logical vector can be supplied with names corresponding to one or more data sets in obj$data. For example, c(abundance = FALSE, stats = TRUE) would include observations whose taxon was filtered out in obj$data$abundance, but not in obj$data$stats. See the reassign_obs option below for further complications.
reassign_obs: (logical of length 1) This option only applies to taxmap() objects. If TRUE, observations assigned to removed taxa will be reassigned to the closest supertaxon that passed the filter. If no supertaxon of such an observation passed the filter, the observation will be filtered out if drop_obs is TRUE. This option can be either simply TRUE/FALSE, meaning that all data sets will be treated the same, or a logical vector can be supplied with names corresponding to one or more data sets in obj$data. For example, c(abundance = TRUE, stats = FALSE) would reassign observations in obj$data$abundance, but not in obj$data$stats.
reassign_taxa: (logical of length 1) If TRUE, subtaxa of removed taxa will be reassigned to the closest supertaxon that passed the filter. This is useful for removing intermediate levels of a taxonomy.
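The effect of the unknown, uncultured, and ignore_case options can be pictured as pattern matching on taxon names. The following is a minimal base R sketch of that idea, not the package's actual implementation: the is_ambiguous helper and its two-word pattern list are hypothetical, and name_regex is not modeled here.

```r
# Hypothetical helper: NOT part of the package; the patterns are illustrative only.
is_ambiguous <- function(taxon_names,
                         patterns = c("unknown", "uncultured"),
                         ignore_case = TRUE) {
  # Combine the placeholder words into one alternation, e.g. "unknown|uncultured"
  regex <- paste(patterns, collapse = "|")
  grepl(regex, taxon_names, ignore.case = ignore_case)
}

taxa <- c("Solanum", "unknown bacterium", "Uncultured fungus", "tuberosum")

# Case-insensitive matching flags both placeholder names
kept_default <- taxa[!is_ambiguous(taxa)]

# Case-sensitive matching no longer flags "Uncultured fungus"
kept_case_sensitive <- taxa[!is_ambiguous(taxa, ignore_case = FALSE)]
```

Matching case-insensitively by default is the safer choice for a filter like this, since databases capitalize placeholder names inconsistently.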
Returns a taxmap object.
If you encounter a taxon name that represents an ambiguous taxon that is not filtered out by this function, let us know and we will add it.
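As a usage sketch, the call below applies the filter to a small taxmap object. It assumes the function is provided by the metacoder package together with parse_tax_data() and taxon_names(); the lineages are made up for illustration.

```r
library(metacoder)  # assumed to provide parse_tax_data() and filter_ambiguous_taxa()

# Made-up classifications; each string is one lineage with ranks separated by ";"
lineages <- c("Plantae;Solanaceae;Solanum;lycopersicum",
              "Plantae;Solanaceae;Solanum;tuberosum",
              "Plantae;Solanaceae;Solanum;unknown",
              "Plantae;Solanaceae;Solanum;uncultured")
obj <- parse_tax_data(lineages, class_sep = ";")

# Remove taxa with placeholder names, using the defaults shown in the usage
filtered <- filter_ambiguous_taxa(obj)
print(taxon_names(filtered))

# Keep "uncultured" taxa but still remove "unknown" ones
unknown_only <- filter_ambiguous_taxa(obj, uncultured = FALSE)
print(taxon_names(unknown_only))
```

Because reassign_obs defaults to TRUE, observations that belonged to a removed taxon are moved to the closest remaining supertaxon instead of being dropped.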