dopt: Multi-Domain Optimum Sample Allocation with Controlled-Precision under Upper-Bound Constraints

Description

Computes the optimum sample allocation for the following multi-domain allocation problem, stated as a mathematical optimization problem:

Minimize $$f(T,\, \boldsymbol x) = T$$ over $\mathbb R \times \mathbb R_+^{\lvert \mathcal H \rvert}$, subject to $$\sum_{(d,h) \in \mathcal H} x_{d,h} = n,$$ $$\sum_{h \in \mathcal H_d} \left(\frac{1}{x_{d,h}} - \frac{1}{N_{d,h}}\right) \frac{N_{d,h}^2 S_{d,h}^2}{\rho_d^2} = T, \qquad d \in \mathcal D,$$ $$x_{d,h} \leq N_{d,h}, \qquad (d,h) \in \mathcal H,$$ where:

$(T,\, \boldsymbol x) = (T,\, (x_{d,h},\, (d,h) \in \mathcal H))$: the optimization variable,
$\mathcal H \subset \mathbb N^2$: the set of domain-stratum indices,
$\mathcal D := \{d \in \mathbb N \colon\; \exists h,\, (d,h) \in \mathcal H\}$: the set of domain indices,
$\mathcal H_d := \{h \in \mathbb N \colon\; (d,h) \in \mathcal H\}$: the set of stratum indices in domain $d$,
$N_{d,h} > 0$: size of stratum $(d,h)$,
$S_{d,h} > 0$: standard deviation of the study variable in stratum $(d,h)$,
$\rho_d := t_d\, \sqrt{\kappa_d}$, where $t_d$ denotes the total in domain $d$, i.e., the sum of the values of the study variable over the population elements in domain $d$, and $\kappa_d$ is a priority weight for domain $d$,
$n \in (0,\, \sum_{(d,h) \in \mathcal H} N_{d,h}]$: total sample size.
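The precision constraint can be checked numerically for any candidate allocation. The following is a minimal base R sketch, not part of the package: `domain_T()` is a hypothetical helper, and it assumes the strata vectors are ordered domain by domain, as in the Examples section. It evaluates the left-hand side of the constraint for each domain; at a feasible point all returned values are equal to the common value $T$.

```r
# Hypothetical helper (not part of the package): evaluates the left-hand side
# of the precision constraint for every domain, given an allocation x.
# Assumes N, S and x are ordered domain by domain.
domain_T <- function(x, H_counts, N, S, total, kappa) {
  d <- rep(seq_along(H_counts), times = H_counts) # domain index of each stratum
  rho <- total * sqrt(kappa)                      # rho_d = t_d * sqrt(kappa_d)
  terms <- (1 / x - 1 / N) * N^2 * S^2 / rho[d]^2 # one term per stratum
  as.numeric(tapply(terms, d, sum))               # sum the terms within each domain
}

H_counts <- c(2, 2, 3)
N <- c(140, 110, 135, 190, 200, 40, 70)
S <- c(180, 20, 5, 4, 35, 9, 40)
total <- c(2, 3, 5)
kappa <- c(0.5, 0.2, 0.3)

# Taking every unit (x = N) makes each term vanish, so the value is 0
# in all three domains.
domain_T(N, H_counts, N, S, total, kappa)
```

The same helper can be applied to an allocation returned by dopt() to see how close the per-domain values are to one another.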

Usage

dopt(n, H_counts, N, S, total, kappa, return_T = FALSE)

Value

If return_T = FALSE (default), a numeric vector containing the optimal sample allocations $x_{d,h}$ for each stratum $(d,h) \in \mathcal H$.

If return_T = TRUE, a list with components:

xopt: numeric vector of optimal sample allocations.
Topt: optimal value of the objective function $T$.

Arguments

n: (integerish(1))
total sample size $n$. Must satisfy 0 < n <= sum(N).
H_counts: (integerish)
number of strata in each domain, $\lvert \mathcal H_d \rvert,\, d \in \mathcal D$.
N: (integerish)
strata sizes $(N_{d,h},\, (d,h) \in \mathcal H)$.
S: (numeric)
standard deviations $(S_{d,h},\, (d,h) \in \mathcal H)$ of the study variable in the strata.
total: (numeric)
vector of domain totals, $t_d,\, d \in \mathcal D$, i.e., the sum of the study variable over all population elements in each domain.
kappa: (numeric)
vector of priority weights for the domains, $\kappa_d,\, d \in \mathcal D$.
return_T: (logical(1))
If TRUE, the function returns a list containing the optimal allocation and the optimal value of the objective function $T$. If FALSE (default), only the optimal allocation vector is returned.
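The relationships between the arguments can be checked before calling dopt(). The sketch below is an illustration only: `check_dopt_inputs()` is a hypothetical helper, and the length conditions are assumptions inferred from the definitions above (one entry of N and S per stratum, one entry of total and kappa per domain), not checks that dopt() is documented to perform.

```r
# Hypothetical pre-flight check (not part of the package).
# The length conditions are assumptions inferred from the problem definition.
check_dopt_inputs <- function(n, H_counts, N, S, total, kappa) {
  stopifnot(
    length(n) == 1, n > 0, n <= sum(N), # 0 < n <= sum(N)
    length(N) == sum(H_counts),         # one stratum size per stratum
    length(S) == length(N),             # one standard deviation per stratum
    length(total) == length(H_counts),  # one total per domain
    length(kappa) == length(H_counts),  # one priority weight per domain
    all(N > 0), all(S > 0)              # N_{d,h} > 0 and S_{d,h} > 0
  )
  invisible(TRUE)
}

# Passes silently for three domains with 2, 2, and 3 strata.
check_dopt_inputs(
  n = 828,
  H_counts = c(2, 2, 3),
  N = c(140, 110, 135, 190, 200, 40, 70),
  S = c(180, 20, 5, 4, 35, 9, 40),
  total = c(2, 3, 5),
  kappa = c(0.5, 0.2, 0.3)
)
```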

Details

The dopt() function uses the RDCA algorithm implemented in rdca().

References

WojciakPhDstratallo

Examples

# Three domains with 2, 2, and 3 strata, respectively,
# that is, H = {(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), (3,3)}.
H_counts <- c(2, 2, 3)
# (N_{1,1}, N_{1,2}, N_{2,1}, N_{2,2}, N_{3,1}, N_{3,2}, N_{3,3})
N <- c(140, 110, 135, 190, 200, 40, 70)
# (S_{1,1}, S_{1,2}, S_{2,1}, S_{2,2}, S_{3,1}, S_{3,2}, S_{3,3})
S <- c(180, 20, 5, 4, 35, 9, 40)
total <- c(2, 3, 5)
kappa <- c(0.5, 0.2, 0.3)
n <- 828

# Optimum allocation.
dopt(n, H_counts, N, S, total, kappa)

# Example population with 9 domains and 278 strata
p <- pop9d278s
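sum(p$N) # population size, the upper limit for n
n <- 5000
x <- dopt(n, p$H_counts, p$N, p$S, p$total, p$kappa, return_T = TRUE)
x
all(x$xopt <= p$N) # upper-bound constraints x_{d,h} <= N_{d,h}
sum(x$xopt) # total allocated sample size, to compare with n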