srvyr (version 0.1.1)

as_survey_twophase: Create a tbl_svy survey object using two phase design

Description

Create a survey object by specifying the survey's two-phase design. It is a wrapper around twophase from the survey package. All survey variables must be included in the data.frame itself. Variables are selected using bare column names or the convenience functions described in select. as_survey_twophase_ is the standard evaluation counterpart to as_survey_twophase.
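
To make the "wrapper around twophase" relationship concrete, the sketch below builds the same two-phase design once with survey::twophase (formula interface) and once with as_survey_twophase (bare column names), then estimates the same mean from each. It assumes the survey, survival, dplyr and srvyr packages are installed; the object names pbc2, d_survey, d_srvyr, est_survey and est_srvyr are illustrative and not part of either package. Behaviour may differ slightly between srvyr versions.

```r
library(survey)
library(dplyr)
library(srvyr)

data(pbc, package = "survival")

# Flag the phase-two (randomized) subsample and add a unit identifier
pbc2 <- pbc %>%
  mutate(randomized = !is.na(trt) & trt > 0,
         id = row_number())

# survey interface: variables are given as one-sided formulas
d_survey <- twophase(id = list(~id, ~id), data = pbc2, subset = ~randomized)

# srvyr interface: the same design, specified with bare column names
d_srvyr <- pbc2 %>%
  as_survey_twophase(id = list(id, id), subset = randomized)

# Estimate the mean of bili from each design object
est_survey <- svymean(~bili, d_survey)
est_srvyr <- d_srvyr %>% summarize(mean = survey_mean(bili))

est_survey
est_srvyr
```

Both calls describe one design, so the two point estimates should agree; only the way variables are named differs.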

Usage

as_survey_twophase(.data, ...)

## S3 method for class 'data.frame'
as_survey_twophase(.data, id, strata = NULL, probs = NULL, weights = NULL, fpc = NULL, subset, method = c("full", "approx", "simple"), ...)

## S3 method for class 'twophase2'
as_survey_twophase(.data, ...)

as_survey_twophase_(.data, id, strata = NULL, probs = NULL, weights = NULL, fpc = NULL, subset, method = c("full", "approx", "simple"))

Arguments

.data
A data frame (which contains the variables specified below)
...
ignored
id
list of two sets of variable names for sampling unit identifiers
strata
list of two sets of variable names (or NULLs) for stratum identifiers
probs
list of two sets of variable names (or NULLs) for sampling probabilities
weights
Only for method = "approx", list of two sets of variable names (or NULLs) for sampling weights
fpc
list of two sets of variables (or NULLs) for finite population corrections
subset
bare name of a variable which specifies which observations are selected in phase 2
method
"full" requires (much) more memory, but gives unbiased variance estimates for general multistage designs at both phases. "simple" and "approx" use less memory, and are correct for designs with simple random sampling at phase one and stratified random sampling at phase two.

Value

  • An object of class tbl_svy
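
As a hedged illustration of the return value, the sketch below passes an existing survey::twophase design (built with the default method = "full", which produces a twophase2 object) to as_survey_twophase and inspects the class of the result. It assumes the survey, survival, dplyr and srvyr packages are installed; pbc2, d_survey and d_tbl are illustrative names only.

```r
library(survey)
library(dplyr)
library(srvyr)

data(pbc, package = "survival")

# Flag the phase-two (randomized) subsample and add a unit identifier
pbc2 <- pbc %>%
  mutate(randomized = !is.na(trt) & trt > 0,
         id = row_number())

# A design created directly with survey::twophase
d_survey <- twophase(id = list(~id, ~id), data = pbc2, subset = ~randomized)

# Convert the existing design object; no design arguments are needed
d_tbl <- as_survey_twophase(d_survey)

# Inspect the classes of the converted object
class(d_tbl)
inherits(d_tbl, "tbl_svy")
```

The converted object can then be used with summarize and the survey_* helpers in the same way as a design created from a data frame.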

Examples

# Examples from ?survey::twophase
library(dplyr)  # provides %>%, mutate() and row_number()
library(srvyr)

# two-phase simple random sampling.
data(pbc, package="survival")

pbc <- pbc %>%
  mutate(randomized = !is.na(trt) & trt > 0,
         id = row_number())
d2pbc <- pbc %>%
  as_survey_twophase(id = list(id, id), subset = randomized)

d2pbc %>% summarize(mean = survey_mean(bili))

# two-stage sampling as two-phase
library(survey)
data(mu284)

mu284_1 <- mu284 %>%
  dplyr::slice(c(1:15, rep(1:5, n2[1:5] - 3))) %>%
  mutate(id = row_number(),
         sub = rep(c(TRUE, FALSE), c(15, 34-15)))

dmu284 <- mu284 %>%
  as_survey_design(ids = c(id1, id2), fpc = c(n1, n2))
# first phase cluster sample, second phase stratified within cluster
d2mu284 <- mu284_1 %>%
  as_survey_twophase(id = list(id1, id), strata = list(NULL, id1),
                  fpc = list(n1, NULL), subset = sub)
dmu284 %>%
  summarize(total = survey_total(y1),
            mean = survey_mean(y1))
d2mu284 %>%
  summarize(total = survey_total(y1),
            mean = survey_mean(y1))

## as_survey_twophase_ uses standard evaluation
id1 <- "id"
id2 <- "id"
d2pbc <- pbc %>%
  as_survey_twophase_(id = list(id1, id2), subset = "randomized")
