sample_n_obs: Sample n observations from `taxmap()`

Description

Randomly sample a given number of observations from a taxmap() object. Weights can be specified for observations or for the taxa they are classified by. Any variable name that appears in all_names() can be used as if it were a vector on its own. See dplyr::sample_n() for the inspiration for this function.

Calling the function using the obj$sample_n_obs(...) style edits "obj" in place, unlike most R functions. Calling it using the sample_n_obs(obj, ...) style imitates R's traditional copy-on-modify semantics: "obj" is not changed, and a modified copy is returned instead.
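The difference between the two calling styles can be pictured with a minimal base-R sketch. It does not use taxmap() at all: an environment stands in for an object with reference semantics, and drop_first_in_place() and drop_first_copy() are hypothetical helpers invented for this illustration.

```r
# An environment is modified by reference, loosely like obj$sample_n_obs(...)
# modifying "obj" (a stand-in only; not the actual taxmap implementation).
obj <- new.env()
obj$values <- c(10, 20, 30)

# Hypothetical helper: edits the object in place.
drop_first_in_place <- function(e) {
  e$values <- e$values[-1]
  invisible(e)
}

# Hypothetical helper: works on a copy and returns it, leaving the input alone.
drop_first_copy <- function(e) {
  copy <- list2env(as.list(e, all.names = TRUE))
  copy$values <- copy$values[-1]
  copy
}

changed <- drop_first_copy(obj)
n_original_after_copy <- length(obj$values)   # original is untouched
n_changed <- length(changed$values)           # the copy is shorter

drop_first_in_place(obj)
n_original_after_in_place <- length(obj$values)  # original is now changed
```

The copy-returning helper mirrors the sample_n_obs(obj, ...) style, while the in-place helper mirrors obj$sample_n_obs(...).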

Usage

obj$sample_n_obs(data, size, replace = FALSE,
  taxon_weight = NULL, obs_weight = NULL,
  use_supertaxa = TRUE, collapse_func = mean, ...)
sample_n_obs(obj, data, size, replace = FALSE,
  taxon_weight = NULL, obs_weight = NULL,
  use_supertaxa = TRUE, collapse_func = mean, ...)

Arguments

obj

(taxmap()) The object to sample from.

data

Dataset names, indexes, or a logical vector that indicates which datasets in obj$data to sample. If multiple datasets are sampled at once, then they must be the same length.

size

(numeric of length 1) The number of observations to sample.

replace

(logical of length 1) If TRUE, sample with replacement.
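As a rough illustration of what sampling with and without replacement means, here is a base-R sketch using sample(). It assumes that sample_n_obs() follows the same conventions as base R's sample(), which this page does not state; the six row indexes are made up.

```r
set.seed(1)    # only to make the sketch reproducible
rows <- 1:6    # stand-in for six observation indexes

# Without replacement, each index can be drawn at most once.
no_repl <- sample(rows, size = 3, replace = FALSE)

# With replacement, size may exceed the number of rows.
with_repl <- sample(rows, size = 10, replace = TRUE)

# In base R, asking for more rows than exist without replacement is an error.
too_many <- tryCatch(sample(rows, size = 10, replace = FALSE),
                     error = function(e) "error")
```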

taxon_weight

(numeric) Non-negative sampling weights of each taxon. If use_supertaxa is TRUE, the weights for each taxon in an observation's classification are supplied to collapse_func to get the observation weight. If obs_weight is also specified, the two weights are multiplied (after taxon_weight for each observation is calculated).

obs_weight

(numeric) Sampling weights of each observation. If taxon_weight is also specified, the two weights are multiplied (after taxon_weight for each observation is calculated).

use_supertaxa

(logical or numeric of length 1) Affects how taxon_weight is used. If TRUE, the weights of all supertaxa in an observation's classification are combined using collapse_func to get the observation weight. If FALSE, only the taxon the observation is assigned to is considered. Positive numbers indicate the number of ranks above each taxon to use. 0 is equivalent to FALSE. Negative numbers are equivalent to TRUE.
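The logical and numeric forms of use_supertaxa can be sketched in base R as follows. taxa_used() is a hypothetical helper written for this illustration, not part of the package, and the lineage is made up; it assumes a classification is ordered from the root to the taxon the observation is assigned to.

```r
# Hypothetical helper: pick which taxa in a classification contribute weights.
# `classification` is ordered from the root to the assigned taxon.
taxa_used <- function(classification, use_supertaxa = TRUE) {
  n <- length(classification)
  if (is.logical(use_supertaxa)) {
    n_above <- if (use_supertaxa) n - 1 else 0
  } else if (use_supertaxa < 0) {
    n_above <- n - 1                      # negative: same as TRUE
  } else {
    n_above <- min(use_supertaxa, n - 1)  # 0: same as FALSE
  }
  classification[(n - n_above):n]
}

lineage <- c("Mammalia", "Carnivora", "Felidae", "Panthera")
taxa_used(lineage, TRUE)   # all four taxa
taxa_used(lineage, FALSE)  # only "Panthera"
taxa_used(lineage, 1)      # "Felidae" and "Panthera"
```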

collapse_func

(function of length 1) If taxon_weight is used and use_supertaxa is TRUE, the weights for each taxon in an observation's classification are supplied to collapse_func to get the observation weight. This function should take a numeric vector and return a single number.
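How taxon_weight, collapse_func, and obs_weight interact can be sketched in base R as follows. This is an illustration of the rule described in the arguments, not the package's actual code; the taxa, classifications, and weights are all made up.

```r
# Made-up taxon weights and classifications (not taken from ex_taxmap).
taxon_weight <- c(Mammalia = 1, Carnivora = 2, Felidae = 3, Canidae = 6)
classifications <- list(
  tiger = c("Mammalia", "Carnivora", "Felidae"),
  wolf  = c("Mammalia", "Carnivora", "Canidae"),
  mole  = c("Mammalia")
)
obs_weight <- c(tiger = 1, wolf = 0.5, mole = 2)
collapse_func <- mean

# Collapse the taxon weights along each observation's classification.
from_taxa <- vapply(classifications,
                    function(taxa) collapse_func(taxon_weight[taxa]),
                    numeric(1))

# Multiply by the per-observation weights: tiger 2.0, wolf 1.5, mole 2.0.
final_weight <- from_taxa * obs_weight

# Use the combined weights as sampling probabilities.
set.seed(42)
picked <- sample(names(final_weight), size = 2, prob = final_weight)
```

Swapping collapse_func for another summary such as max or prod changes only the first step; the multiplication by obs_weight still happens afterwards.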

...

Additional options are passed to filter_obs().

target

DEPRECATED. Use "data" instead.

Value

An object of type taxmap()

Examples

# Sample 2 rows without replacement
sample_n_obs(ex_taxmap, "info", 2)
sample_n_obs(ex_taxmap, "foods", 2)

# Sample with replacement
sample_n_obs(ex_taxmap, "info", 10, replace = TRUE)

# Sample some rows more often than others
sample_n_obs(ex_taxmap, "info", 3, obs_weight = n_legs)

# Sample multiple datasets at once
sample_n_obs(ex_taxmap, c("info", "phylopic_ids", "foods"), 3)
