Recursively sample a set of observations with a hierarchical classification. This function takes other functions as arguments and is intended as a building block for more user-friendly functions.
recursive_sample(root_id, get_obs, get_subtaxa, get_rank = NULL,
cat_obs = unlist, max_counts = c(), min_counts = c(),
max_children = c(), min_children = c(), obs_filters = list(),
subtaxa_filters = list(), stop_conditions = list(), ...)
root_id
(character of length 1) The taxon to sample. By default, the root of the taxonomy used.
get_obs
(function(character)) A function that returns the observations assigned to a given taxon. The function's first argument should be the taxon ID, and it should return a data structure that may represent multiple observations.
get_subtaxa
(function(character)) A function that returns the subtaxa of a given taxon. The function's first argument should be the taxon ID, and it should return a vector of taxon IDs.
get_rank
(function(character)) A function that returns the rank of a given taxon ID. The function's first argument should be the taxon ID, and it should return the rank of that taxon.
cat_obs
(function(list)) A function that takes a list of whatever is returned by get_obs and concatenates its elements into a single data structure of the type returned by get_obs.
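As a rough illustration of the callback shapes described above, the following sketch defines them over a toy taxonomy. The taxon IDs, observation labels, ranks, and the lookup lists `toy_subtaxa`, `toy_obs`, and `toy_ranks` are all hypothetical and exist only for this example. Each callback accepts `...` so that extra arguments can be forwarded without error.

```r
# Hypothetical toy taxonomy (names invented for illustration only).
toy_subtaxa <- list(root = c("A", "B"), A = c("A1", "A2"),
                    B = character(0), A1 = character(0), A2 = character(0))
# Hypothetical observations assigned directly to each taxon.
toy_obs <- list(root = character(0), A = "obs1", B = c("obs2", "obs3"),
                A1 = c("obs4", "obs5"), A2 = "obs6")
# Hypothetical rank of each taxon.
toy_ranks <- c(root = "family", A = "genus", B = "genus",
               A1 = "species", A2 = "species")

# Each callback takes the taxon ID as its first argument.
get_obs     <- function(id, ...) toy_obs[[id]]      # observations of a taxon
get_subtaxa <- function(id, ...) toy_subtaxa[[id]]  # vector of subtaxon IDs
get_rank    <- function(id, ...) toy_ranks[[id]]    # rank of a taxon

# Combine a list of get_obs() results into one structure of the same type.
cat_obs <- unlist

combined <- cat_obs(lapply(get_subtaxa("root"), get_obs))
```

Here `combined` is a single character vector holding the observations of taxa "A" and "B", which is the same type that `get_obs` returns for one taxon.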
max_counts
(numeric) A named vector that defines the maximum number of observations for each level specified. The names of the vector specify the level each number applies to. If more than the maximum number of observations exist for a given taxon, they are randomly subsampled to this number.
min_counts
(numeric) A named vector that defines the minimum number of observations for each level specified. The names of the vector specify the level each number applies to.
max_children
(numeric) A named vector that defines the maximum number of subtaxa per taxon for each level specified. The names of the vector specify the level each number applies to. If more than the maximum number of subtaxa exist for a given taxon, they are randomly subsampled to this number of subtaxa.
min_children
(numeric) A named vector that defines the minimum number of subtaxa for each level specified. The names of the vector specify the level each number applies to.
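The named-vector limits described above can be pictured with a small sketch. It assumes the vector names are rank names such as those returned by get_rank, which the text only calls "levels". The helper `apply_max` is hypothetical and is not part of the documented interface; it shows one way a maximum could be enforced by random subsampling.

```r
# Hypothetical helper: cap `items` at the limit named for `rank`, sampling at
# random without replacement. Ranks with no named limit are left unchanged.
apply_max <- function(items, rank, limits) {
  limit <- if (rank %in% names(limits)) limits[[rank]] else NA
  if (is.na(limit) || length(items) <= limit) {
    return(items)
  }
  # Sample positions rather than values to avoid the sample() scalar quirk.
  items[sample.int(length(items), limit)]
}

# At most 2 observations per genus and 1 per species (made-up numbers).
max_counts <- c(genus = 2, species = 1)
observations <- c("obs1", "obs2", "obs3")

set.seed(1)
capped    <- apply_max(observations, "genus", max_counts)   # two of the three
untouched <- apply_max(observations, "family", max_counts)  # no limit named
```

The same pattern would apply to max_children, with a vector of subtaxon IDs in place of the observations.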
obs_filters
(list of function(observations, id)) A list of functions that take a data structure containing the information of multiple observations and a taxon ID. Each function returns an object of the same type with some of the observations potentially removed.
subtaxa_filters
(list of function(observations, id)) A list of functions that take a data structure containing multiple subtaxa IDs and the current taxon ID. Each function returns an object of the same type with some of the subtaxa potentially removed. If a function returns NULL, then no observations for the current taxon are returned.
stop_conditions
(list of function(id)) A list of functions that take the current taxon ID. If any of the functions return TRUE, the observations for the current taxon are returned rather than looking for observations of subtaxa, stopping the recursion.
...
Additional parameters are passed to all of the function options.
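To make the interplay of filters and stop conditions more concrete, here is a simplified sketch of one way such a recursion could be written. It is not the actual implementation of recursive_sample: the function `sketch_sample`, the toy data, and the exact order in which filters are applied are assumptions made for illustration, and the count and children limits are left out.

```r
# Hypothetical toy taxonomy and observations (invented for illustration).
toy_subtaxa <- list(root = c("A", "B"), A = c("A1", "A2"),
                    B = character(0), A1 = character(0), A2 = character(0))
toy_obs <- list(root = character(0), A = "obs1", B = c("obs2", "obs3"),
                A1 = c("obs4", "obs5"), A2 = "obs6")
get_obs     <- function(id, ...) toy_obs[[id]]
get_subtaxa <- function(id, ...) toy_subtaxa[[id]]

# Simplified sketch, NOT the documented function's real code.
sketch_sample <- function(root_id, get_obs, get_subtaxa, cat_obs = unlist,
                          obs_filters = list(), subtaxa_filters = list(),
                          stop_conditions = list(), ...) {
  visit <- function(id) {
    # Any stop condition returning TRUE ends the descent at this taxon.
    stop_here <- any(vapply(stop_conditions,
                            function(f) isTRUE(f(id, ...)), logical(1)))
    children <- if (stop_here) character(0) else get_subtaxa(id, ...)
    for (f in subtaxa_filters) {
      children <- f(children, id, ...)
      # A NULL from a subtaxa filter means: return nothing for this taxon.
      if (is.null(children)) return(NULL)
    }
    if (length(children) > 0) {
      observations <- cat_obs(lapply(children, visit))  # recurse into subtaxa
    } else {
      observations <- get_obs(id, ...)                  # use this taxon's own
    }
    for (f in obs_filters) observations <- f(observations, id, ...)
    observations
  }
  visit(root_id)
}

# Stop at taxon "A" and drop the observation labelled "obs2".
result <- sketch_sample(
  "root", get_obs, get_subtaxa,
  obs_filters = list(function(observations, id, ...) setdiff(observations, "obs2")),
  stop_conditions = list(function(id, ...) id == "A")
)
```

In this sketch, stopping at "A" means the observations of "A1" and "A2" are never visited, and the observation filter runs on the result for every taxon on the way back up.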