stratallo (version 3.0.1)

opt: Optimum Sample Allocation in Stratified Sampling

Description

[Stable]

Computes the solution to the following optimum allocation problem, formulated in mathematical optimization terms:

Minimize $$f(x_1,\ldots,x_H) = \sum_{h=1}^H \frac{A^2_h}{x_h}$$ over \(\mathbb R_+^H\), subject to $$\sum_{h=1}^H x_h = n,$$ $$m_h \leq x_h \leq M_h, \qquad h = 1,\ldots,H,$$ where \(n > 0\) and \(A_h > 0,\, m_h > 0,\, M_h > 0\) with \(m_h < M_h\) for \(h = 1,\ldots,H\) are given numbers satisfying \(\sum_{h=1}^H m_h \leq n \leq \sum_{h=1}^H M_h\).

Inequality constraints are optional, and the user can choose whether and how they are applied to the optimization problem. This is controlled using the m and M arguments as follows:

  • No inequality constraints: both m and M must be NULL (default).

  • Lower bounds only (\(m_1,\, \ldots,\, m_H\)): specify m, and set M = NULL.

  • Upper bounds only (\(M_1,\, \ldots,\, M_H\)): specify M, and set m = NULL.

  • Box constraints (\(m_h, M_h,\, h = 1,\ldots,H\)): specify both m and M.
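When no inequality constraints are imposed, the minimizer has a well-known closed form, the Neyman-type allocation \(x_h = n A_h / \sum_{k=1}^H A_k\). The following base R sketch illustrates the objective function and this closed form; the helper names `objective` and `neyman` are hypothetical and are not part of stratallo.

```r
# Hypothetical helpers for illustration only; not part of stratallo.

# Objective f(x) = sum_h A_h^2 / x_h from the problem statement.
objective <- function(x, A) sum(A^2 / x)

# Closed-form minimizer when only sum(x) == n is imposed (no m, no M).
neyman <- function(n, A) n * A / sum(A)

A <- c(3000, 4000, 5000, 2000)
x <- neyman(700, A)
x                # 150 200 250 100
objective(x, A)  # 280000, which equals sum(A)^2 / n
```

Once bounds are added, this closed form may violate them, which is why the constrained cases need the dedicated algorithms described under Details.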

Usage

opt(n, A, m = NULL, M = NULL, M_algorithm = "rna")

Value

A numeric vector of the optimal sample allocations for each stratum.

Arguments

n

(integerish(1))
total sample size. Must satisfy n > 0. Additionally:

  • If m is not NULL, then n >= sum(m).

  • If M is not NULL, then n <= sum(M).
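The problem statement requires \(\sum_h m_h \leq n \leq \sum_h M_h\) whenever the corresponding bounds are given. A minimal base R sketch of such a pre-check is shown here; the helper `is_feasible_n` is hypothetical, is not part of stratallo, and does not reproduce the package's own argument validation.

```r
# Hypothetical helper for illustration only; not part of stratallo.
# Returns TRUE when n is compatible with whichever bounds are supplied.
is_feasible_n <- function(n, m = NULL, M = NULL) {
  ok <- n > 0
  if (!is.null(m)) ok <- ok && n >= sum(m)  # lower bounds must fit into n
  if (!is.null(M)) ok <- ok && n <= sum(M)  # n must fit under upper bounds
  ok
}

m <- c(100, 90, 70, 50)    # sum(m) is 310
M <- c(300, 400, 200, 90)  # sum(M) is 990

is_feasible_n(340, m = m, M = M)  # TRUE:  310 <= 340 <= 990
is_feasible_n(300, m = m)         # FALSE: 300 < sum(m)
is_feasible_n(1000, M = M)        # FALSE: 1000 > sum(M)
```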

A

(numeric)
population constants \(A_1,\ldots,A_H\). All values must be strictly positive.

m

(numeric or NULL)
optional lower bounds \(m_1,\ldots,m_H\) for the stratum sample sizes. If no lower bounds are desired, set m = NULL. If M is not NULL, it is required that \(m_h < M_h\) for all strata.

M

(numeric or NULL)
optional upper bounds \(M_1,\ldots,M_H\) for the stratum sample sizes. If no upper bounds are desired, set M = NULL. If m is not NULL, it is required that \(m_h < M_h\) for all strata.

M_algorithm

(string)
name of the algorithm to use for computing the sample allocation when only upper-bound constraints are imposed. Must be one of "rna" (default), "sga", "sgaplus", or "coma". This argument takes effect only when upper bounds alone are imposed (m = NULL and M specified), \(H > 1\), and n < sum(M).

Details

The opt() function uses different allocation algorithms depending on which inequality constraints are applied. Each algorithm is implemented in a separate R function, which is generally not intended to be called directly by the end user. The algorithms are:

  • Lower bounds only (\(m_1,\, \ldots,\, m_H\)):

    • LRNA - rna()

  • Upper bounds only (\(M_1,\, \ldots,\, M_H\)):

    • RNA - rna()

    • SGA - sga()

    • SGAPLUS - sgaplus()

    • COMA - coma()

  • Box constraints (\(m_h, M_h,\, h = 1,\ldots,H\)):

    • RNABOX - rnabox()

See the documentation of each specific function for more details about the corresponding algorithm.
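To give a feel for the upper-bounds-only case, here is a simplified base R sketch of a recursive Neyman-type procedure: allocate proportionally to \(A_h\), fix any stratum that exceeds its bound at \(M_h\), and repeat on the remaining strata with the remaining sample size. This is an illustration under the assumption 0 < n <= sum(M), not the code of rna(); the function name `rna_upper_sketch` is hypothetical.

```r
# Hypothetical helper for illustration only; not part of stratallo and not
# the package's rna() implementation. Assumes 0 < n <= sum(M).
rna_upper_sketch <- function(n, A, M) {
  x <- numeric(length(A))
  active <- seq_along(A)  # strata not yet fixed at their upper bound
  repeat {
    # Neyman-type allocation of the remaining sample size over active strata.
    x[active] <- n * A[active] / sum(A[active])
    over <- active[x[active] > M[active]]
    if (length(over) == 0L) break   # no violations: allocation is final
    x[over] <- M[over]              # fix violating strata at their bounds
    n <- n - sum(M[over])           # sample size left for the other strata
    active <- setdiff(active, over)
    if (length(active) == 0L) break # every stratum sits at its bound
  }
  x
}

A <- c(3000, 4000, 5000, 2000)
M <- c(300, 400, 200, 90)
x <- rna_upper_sketch(700, A, M)
x  # approximately 175.71, 234.29, 200, 90
```

In this run the first pass gives 150, 200, 250, 100, so strata 3 and 4 exceed their bounds and are fixed at 200 and 90; the remaining 410 units are then split between strata 1 and 2 in proportion 3:4.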

See Also

optcost(), rna(), sga(), sgaplus(), coma(), rnabox().

Examples

A <- c(3000, 4000, 5000, 2000)
m <- c(100, 90, 70, 50)
M <- c(300, 400, 200, 90)

# One-sided lower bounds.
opt(n = 340, A = A, m = m)
opt(n = 400, A = A, m = m)
opt(n = 700, A = A, m = m)

# One-sided upper bounds.
opt(n = 190, A = A, M = M)
opt(n = 700, A = A, M = M)

# Box constraints.
opt(n = 340, A = A, m = m, M = M)
opt(n = 500, A = A, m = m, M = M)
x <- opt(n = 800, A = A, m = m, M = M)
x

# Variance corresponding to the allocation x.
var_st(x = x, A = A, A0 = 45000)

# Execution-time comparison of different algorithms using the microbenchmark package.
if (FALSE) {
  N <- pop969s_ucost[, "N"]
  S <- pop969s_ucost[, "S"]
  A <- N * S
  nfrac <- c(0.005, seq(0.05, 0.95, 0.05))
  n <- setNames(as.integer(nfrac * sum(N)), nfrac)
  lapply(
    n,
    function(ni) {
      microbenchmark::microbenchmark(
        RNA = opt(ni, A, M = N, M_algorithm = "rna"),
        SGA = opt(ni, A, M = N, M_algorithm = "sga"),
        SGAPLUS = opt(ni, A, M = N, M_algorithm = "sgaplus"),
        COMA = opt(ni, A, M = N, M_algorithm = "coma"),
        times = 200,
        unit = "us"
      )
    }
  )
}
