capm (version 0.3)

DesignSurvey: Two-stage cluster sampling and systematic sampling analysis

Description

A wrapper for the svydesign function from the survey package that specifies either a two-stage cluster sampling design or a systematic sampling design. In the first case, weights are calculated assuming probability proportional to size (PPS) sampling with replacement in the first stage and simple random sampling in the second stage. The finite population correction is specified as the population size at each level of sampling.
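The weight calculation described above can be illustrated with a small base R sketch. It is not taken from the package source: the toy data, the object names (psu_sizes, sampled, n_psu) and the exact formula are assumptions that follow the usual textbook treatment of PPS sampling with replacement followed by simple random sampling.

```r
# Toy population of PSU: identifier and size (number of SSU in each PSU).
psu_sizes <- data.frame(psu = c("A", "B", "C", "D"),
                        size = c(100, 200, 300, 400))

# Hypothetical sample: PSU selected in the first stage and the number of
# SSU observed in each of them.
n_psu <- 2  # number of PSU selections in the first stage
sampled <- data.frame(psu = c("B", "D"),
                      ssu_sampled = c(10, 10))

# Attach the PSU size to each sampled PSU.
sampled <- merge(sampled, psu_sizes, by = "psu")

# First stage (assumed): expected number of selections under PPS
# sampling with replacement.
f1 <- n_psu * sampled$size / sum(psu_sizes$size)

# Second stage (assumed): simple random sampling of SSU within the PSU.
f2 <- sampled$ssu_sampled / sampled$size

# Sampling weight of each observed SSU: inverse of the overall fraction.
sampled$weight <- 1 / (f1 * f2)
print(sampled)
```

With the same number of SSU taken from every selected PSU, the weights are equal across PSU, and the weighted count of observed SSU recovers the population size of the toy example.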

Usage

DesignSurvey(psu.ssu = NULL, sample = NULL, psu.col = NULL,
  ssu.col = NULL, design = "2clusterPPS", psu.2cd = NULL, total = NULL,
  ...)

Arguments

psu.ssu
data.frame with all Primary Sampling Units (PSU). First column contains PSU unique identifiers. Second column contains numeric PSU sizes.
sample
data.frame with sample observations. One of the columns must contain unique identifiers for PSU. Another column must contain unique identifiers for Secondary Sampling Units (SSU). The rest of the
psu.col
the column of sample containing the PSU identifiers (for two-stage cluster designs).
ssu.col
the column of sample containing the SSU identifiers (for two-stage cluster designs).
design
string to define the type of design. "2clusterPPS" defines a two-stage cluster sampling design with selection of PSU with probability proportional to size. "simple" defines a simple (systematic) random sampling design.
psu.2cd
value indicating the number of PSU included in a design of type "2clusterPPS" (a PSU selected more than once must be counted once per selection).
total
numeric value representing the total number of sampling units in the population. If design is equal to "2clusterPPS", it is not necessary to define total.
...
further arguments passed to svydesign function.
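For the "simple" design, the role of total can be illustrated with a short base R sketch. It assumes the usual equal-probability convention in which each observation represents total / n units; this is not confirmed as the package's internal calculation, and pop_total and sample_ids are hypothetical names.

```r
# Hypothetical population size, as it would be passed in `total`.
pop_total <- 1000

# Hypothetical sample of 50 units.
sample_ids <- 1:50
n <- length(sample_ids)

# Assumed equal-probability weight: each unit represents pop_total / n units.
w <- rep(pop_total / n, n)

# The weights add up to the population size.
sum(w)
```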

Details

A PSU appearing in both psu.ssu and in sample must have the same identifier. SSU identifiers must be unique but can appear more than once if there is more than one observation per SSU.
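The identifier requirements above can be checked before building the design. The following base R sketch uses toy data and hypothetical object names (psu_table, toy_sample); reading "unique" as "each SSU identifier belongs to a single PSU" is an assumption, not something the documentation states.

```r
# Toy PSU table: identifiers in the first column, sizes in the second.
psu_table <- data.frame(psu = c(101, 102, 103),
                        size = c(120, 80, 150))

# Toy sample: one row per observation; SSU 2 has two observations.
toy_sample <- data.frame(ssu = c(1, 2, 2, 3),
                         psu = c(101, 101, 101, 103),
                         answer = c("yes", "no", "yes", "no"))

# Every PSU identifier in the sample should exist in the PSU table.
unknown_psu <- setdiff(toy_sample$psu, psu_table[[1]])

# Assumed reading of "unique": a repeated SSU identifier always
# refers to the same PSU.
psu_per_ssu <- tapply(toy_sample$psu, toy_sample$ssu,
                      function(x) length(unique(x)))
ambiguous_ssu <- names(psu_per_ssu)[psu_per_ssu > 1]

# Stop early if either check fails.
stopifnot(length(unknown_psu) == 0, length(ambiguous_ssu) == 0)
```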

References

Lumley, T. (2011). Complex surveys: A guide to analysis using R (Vol. 565). Wiley.

http://oswaldosantos.github.io/capm

Examples

# Load the capm package, which provides DesignSurvey and the example data.
library(capm)

# Load data with PSU identifiers and sizes.
data(psu.ssu)

# Load data with sample data.
data(survey.data)

# Specify the two-stage cluster design that included 20 PSU.
(design <- DesignSurvey(psu.ssu, survey.data, psu.col = 2, ssu.col = 1, psu.2cd = 20))
