DescTools (version 0.99.14)

Strata: Stratified sampling

Description

Stratified sampling with equal/unequal probabilities.

Usage

Strata(x, stratanames = NULL, size, 
       method = c("srswor", "srswr", "poisson", "systematic"), 
       pik, description = FALSE)

Arguments

x
a data frame or a matrix; its number of rows is n, the population size.
stratanames
vector of stratification variables.
size
vector of stratum sample sizes (in the order in which the strata are given in the input data set).
method
method to select units; implemented are: a) simple random sampling without replacement ("srswor"), b) simple random sampling with replacement ("srswr"), c) Poisson sampling ("poisson"), d) systematic sampling (
pik
vector of inclusion probabilities or auxiliary information used to compute them; this argument is only used for unequal probability sampling (Poisson and systematic). If an auxiliary information is provided, the function uses the inclusionprobabilities
description
a message is printed if its value is TRUE; the message gives the number of selected units and the number of the units in the population. By default, the value is FALSE.

Value

  • The function produces an object, which contains the following information:
  • idthe identifier of the selected units.
  • stratumthe unit stratum.
  • probthe final unit inclusion probability.

See Also

sample

Examples

Run this code
# Example from An and Watts (New SAS procedures for Analysis of Sample Survey Data)
# generates artificial data (a 235X3 matrix with 3 columns: state, region, income).
# the variable "state" has 2 categories ('nc' and 'sc'). 
# the variable "region" has 3 categories (1, 2 and 3).
# the sampling frame is stratified by region within state.
# the income variable is randomly generated

m <- rbind(matrix(rep("nc",165), 165, 1, byrow=TRUE), 
           matrix(rep("sc", 70), 70, 1, byrow=TRUE))
m <- cbind.data.frame(m, c(rep(1, 100), rep(2,50), rep(3,15), 
                      rep(1, 30), rep(2, 40)), 1000 * runif(235))
names(m) <- c("state", "region", "income")

# computes the population stratum sizes
table(m$region, m$state)

# not run
#     nc  sc
#  1 100  30
#  2  50  40
#  3  15   0
# there are 5 cells with non-zero values
# one draws 5 samples (1 sample in each stratum)
# the sample stratum sizes are 10,5,10,4,6, respectively
# the method is 'srswor' (equal probability, without replacement)

s <- Strata(m, c("region", "state"), size=c(10, 5, 10, 4, 6), method="srswor")

# extracts the observed data
data.frame(income=m[s$id, "income"], s)

# see the result using a contigency table
table(s$region, s$state)


# The same data as in Example 1
# the method is 'systematic' (unequal probability, without replacement)
# the selection probabilities are computed using the variable 'income'
s <- Strata(m,c("region", "state"), size=c(10, 5, 10, 4, 6), 
            method="systematic", pik=m$income)

# extracts the observed data
data.frame(income=m[s$id, "income"], s)

# see the result using a contigency table
table(s$region, s$state)

Run the code above in your browser using DataCamp Workspace