
Declare sampling procedure
declare_sampling(..., handler = sampling_handler, label = NULL)
sampling_handler(data, ..., sampling_variable = "S")
...: arguments to be captured, and later passed to the handler
handler: a tidy-in, tidy-out function
label: a string describing the step
data: A data.frame.
sampling_variable: The prefix for the sampling inclusion probability variable.
A function that takes a data.frame as an argument and returns a data.frame subsetted to sampled observations and (optionally) augmented with inclusion probabilities and other quantities.
declare_sampling can work with any sampling function that takes data and returns data. The default handler is draw_rs from the randomizr package. This allows quick declaration of many sampling schemes that involve strata and clusters.
The arguments to draw_rs can include N, strata_var, clust_var, n, prob, strata_n, and strata_prob. The arguments you need to specify are different for different designs.
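As a rough illustration of the data-in, data-out contract, the sketch below writes a custom handler in base R. It is not the package's own implementation: the name srs_handler, its default n, and the inclusion-probability column name built from the sampling_variable prefix are all assumptions made for this example.

```r
# Hypothetical handler (not part of any package): draws a simple random
# sample of n rows without replacement and returns only those rows.
srs_handler <- function(data, n = 5, sampling_variable = "S") {
  N <- nrow(data)
  stopifnot(n <= N)
  keep <- sort(sample.int(N, size = n))
  out <- data[keep, , drop = FALSE]
  # Assumed column name: the prefix followed by "_inclusion_prob".
  out[[paste0(sampling_variable, "_inclusion_prob")]] <- n / N
  out
}

set.seed(42)
sampled <- srs_handler(sleep, n = 5)
nrow(sampled)  # 5 of the 20 rows in the built-in sleep data
```

Following the pattern used elsewhere on this page, such a function would presumably be supplied as declare_sampling(handler = srs_handler, n = 5), with n captured and passed on to the handler.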
Note that declare_sampling works similarly to declare_assignment, a key difference being that declare_sampling functions subset data to sampled units rather than simply appending an indicator for membership of a sample (assignment). If you need to sample but keep the full dataset, use declare_assignment and define further steps (such as estimation) with respect to subsets defined by the assignment.
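The difference between subsetting and appending an indicator can be sketched in base R alone. This is only an illustration of the two result shapes, not of how either declare_ function is implemented; the column name in_sample is a hypothetical choice for this example.

```r
set.seed(1)
N <- nrow(sleep)
chosen <- sample.int(N, size = 8)

# Sampling-style result: only the chosen rows remain.
sampled <- sleep[sort(chosen), , drop = FALSE]

# Assignment-style result: every row is kept and a (hypothetically named)
# indicator column marks membership of the sample.
flagged <- sleep
flagged$in_sample <- as.integer(seq_len(N) %in% chosen)

nrow(sampled)           # 8
nrow(flagged)           # 20
sum(flagged$in_sample)  # 8

# A later step can still be restricted to the subset the indicator defines.
mean(flagged$extra[flagged$in_sample == 1])
```

With the indicator approach the unsampled rows stay available to later steps, which is the reason to prefer it when the full dataset is still needed.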
For details see the help files for complete_rs, strata_rs, cluster_rs, or strata_and_cluster_rs.
# NOT RUN {
# Default handler is `draw_rs` from `randomizr` package
# Simple random sampling
my_sampling <- declare_sampling(n = 50)
# Stratified random sampling
my_stratified_sampling <- declare_sampling(strata = female)
# Custom random sampling functions
# Draws n row indices with replacement from all rows of the data
my_sampling_function <- function(data, n = nrow(data)) {
  data[sample(nrow(data), n, replace = TRUE), , drop = FALSE]
}
my_sampling_custom <- declare_sampling(handler = my_sampling_function)
my_sampling_custom(sleep)
# }