stratallo (version 2.2.1)

opt_1sided: Algorithms for Optimum Sample Allocation Under One-Sided Bounds

Description

[Stable]

Functions implementing selected algorithms that solve the optimum allocation problem, stated in the language of mathematical optimization as follows.

Minimize $$f(x_1,\ldots,x_H) = \sum_{h=1}^H \frac{A^2_h}{x_h}$$ subject to $$\sum_{h=1}^H c_h x_h = c$$ and either $$x_h \leq M_h, \quad h = 1,\ldots,H$$ or $$x_h \geq m_h, \quad h = 1,\ldots,H,$$ where \(c > 0,\, c_h > 0,\, A_h > 0,\, m_h > 0,\, M_h > 0,\, h = 1,\ldots,H\), are given numbers. The minimization is on \(\mathbb R_+^H\).

The inequality constraints are optional, and the user can choose whether and how they are added to the optimization problem. If one-sided lower bounds \(m_h,\, h = 1,\ldots,H\), are imposed, it is required that \(c \geq \sum_{h=1}^H c_h m_h\). If one-sided upper bounds \(M_h,\, h = 1,\ldots,H\), are imposed, it is required that \(0 < c \leq \sum_{h=1}^H c_h M_h\). Lower bounds can be specified instead of upper bounds only for the LRNA algorithm. All other algorithms accept only upper bounds. For the sake of clarity, we emphasize that in the optimization problem considered here, the lower and upper bounds cannot be imposed jointly.

Costs \(c_h,\, h = 1,\ldots,H\), of surveying one element in stratum \(h\), can be specified by the user only for the RNA and LRNA algorithms. For the remaining algorithms, these costs are fixed at 1, i.e. \(c_h = 1,\, h = 1,\ldots,H\).
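
As a rough illustration of the problem statement above, the following base R sketch evaluates the objective and the two feasibility conditions for the data used in the Examples section, with unit costs fixed at 1. The names objective, lower_ok, upper_ok and x_neyman are hypothetical and not part of stratallo. The sketch assumes that, with no bounds and unit costs, the minimizer is the classical Neyman allocation, which is proportional to \(A_h\).

```r
# Data from the Examples section of this page
A <- c(3000, 4000, 5000, 2000)
m <- c(50, 40, 10, 30)           # lower bounds
M <- c(100, 90, 70, 80)          # upper bounds
unit_costs <- rep(1, length(A))  # assumption: c_h = 1 for all strata
total_cost <- 190

# Objective f(x) = sum(A_h^2 / x_h); hypothetical helper, not a stratallo function
objective <- function(x, A) sum(A^2 / x)

# Feasibility conditions required when bounds are imposed
lower_ok <- total_cost >= sum(unit_costs * m)  # needed for lower bounds
upper_ok <- total_cost <= sum(unit_costs * M)  # needed for upper bounds

# Allocation without inequality constraints (unit costs): proportional to A_h
x_neyman <- total_cost * A / sum(A)

# The cost constraint holds with equality, and f can be evaluated at x_neyman
sum(unit_costs * x_neyman)
objective(x_neyman, A)
```

For these data, every component of x_neyman lies strictly below the corresponding upper bound, so imposing M does not change the solution, whereas some components fall below the lower bounds m.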

The available algorithms are listed below, each with the name of the function that implements it. See the description of a specific function to find out more about the corresponding algorithm.

  • RNA - rna()

  • LRNA - rna()

  • SGA - sga()

  • SGAPLUS - sgaplus()

  • COMA - coma()

Functions in this family should not be called directly by the user. Use opt() or optcost() instead.

Usage

rna(
  total_cost,
  A,
  bounds = NULL,
  unit_costs = 1,
  check_violations = .Primitive(">="),
  details = FALSE
)

sga(total_cost, A, M)

sgaplus(total_cost, A, M)

coma(total_cost, A, M)

Value

Numeric vector with optimal sample allocations in strata. For rna() only, it can also be a list with optimal sample allocations and strata assignments (either to take-Neyman or take-bound).

Arguments

total_cost

(number)
total cost \(c\) of the survey. A strictly positive scalar.

A

(numeric)
population constants \(A_1,\ldots,A_H\). Strictly positive numbers.

bounds

(numeric or NULL)
optional lower bounds \(m_1,\ldots,m_H\), or upper bounds \(M_1,\ldots,M_H\), or NULL to indicate that there are no inequality constraints in the optimization problem considered. If not NULL, bounds is treated as either:

  • lower bounds, if check_violations = .Primitive("<="). In this case, it is required that total_cost >= sum(unit_costs * bounds),
    or

  • upper bounds, if check_violations = .Primitive(">="). In this case, it is required that total_cost <= sum(unit_costs * bounds).

unit_costs

(numeric)
costs \(c_1,\ldots,c_H\), of surveying one element in each stratum. Strictly positive numbers. Can also be of length 1 if the unit cost is the same for all strata. In this case, the value is recycled to the length of bounds.

check_violations

(function)
Two-argument binary operator function that compares values in atomic vectors. It must be set to either .Primitive("<=") or .Primitive(">="). The first choice causes bounds to be treated as lower bounds, and rna() then performs the LRNA algorithm. The second choice causes bounds to be treated as upper bounds, and rna() then performs the RNA algorithm. This argument is ignored when bounds is set to NULL.

details

(flag)
should detailed information about strata assignments (either to take-Neyman or take-bound), values of set function \(s\) and number of iterations be added to the output?

M

(numeric or NULL)
upper bounds \(M_1,\ldots,M_H\), optionally imposed on sample sizes in strata. Strictly positive numbers. If no upper bounds should be imposed, then M must be set to NULL. Otherwise, it is required that total_cost <= sum(M), since unit costs are fixed at 1 for the functions that take this argument.
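
To make the check_violations argument more concrete, the sketch below shows that the two admissible values are ordinary vectorized comparison functions. The vectors x and bounds and the names upper_check and lower_check are made up for illustration. How rna() applies the operator internally is not shown on this page, so comparing a candidate allocation against the bounds, as done here, is an assumption.

```r
# Hypothetical candidate allocation and bounds (illustration only)
x <- c(40, 95, 70)
bounds <- c(50, 90, 70)

upper_check <- .Primitive(">=")  # bounds treated as upper bounds (RNA)
lower_check <- .Primitive("<=")  # bounds treated as lower bounds (LRNA)

# Element-wise comparison; equality counts as a violation for both operators
upper_check(x, bounds)  # FALSE  TRUE  TRUE
lower_check(x, bounds)  #  TRUE FALSE  TRUE
```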

Functions

  • rna(): Recursive Neyman Algorithm (RNA) and its twin version, Lower Recursive Neyman Algorithm (LRNA) dedicated to the allocation problem with one-sided lower-bounds constraints. The RNA is described in Wesołowski et al. (2021), while LRNA is introduced in Wójciak (2023).

  • sga(): Stenger-Gabler type algorithm SGA, described in Wesołowski et al. (2021) and in Stenger and Gabler (2005). This algorithm solves the problem with one-sided upper-bounds constraints. It also assumes unit costs are constant and equal to 1, i.e. \(c_h = 1,\, h = 1,\ldots,H\).

  • sgaplus(): modified Stenger-Gabler type algorithm, described in Wójciak (2019) as Sequential Allocation (version 1) algorithm. This algorithm solves the problem with one-sided upper-bounds constraints. It also assumes unit costs are constant and equal to 1, i.e. \(c_h = 1,\, h = 1,\ldots,H\).

  • coma(): Change of Monotonicity Algorithm (COMA), described in Wesołowski et al. (2021). This algorithm solves the problem with one-sided upper-bounds constraints. It also assumes unit costs are constant and equal to 1, i.e. \(c_h = 1,\, h = 1,\ldots,H\).
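
The recursive idea behind RNA and LRNA can be sketched in a few lines of base R. This is a minimal illustration under stated assumptions, not the package's implementation: rna_sketch is a hypothetical name, unit costs are fixed at 1, inputs are not validated, and the recursion is written as a loop. At each step the remaining budget is allocated proportionally to \(A_h\) among the strata still active; strata that violate their bounds are fixed at the bound and removed, and the step repeats on the reduced problem.

```r
# Hypothetical helper, not part of stratallo.
# Assumes unit costs c_h = 1 and a feasible total_cost.
rna_sketch <- function(total_cost, A, bounds, check_violations = `>=`) {
  x <- numeric(length(A))  # allocation being built
  active <- seq_along(A)   # strata not yet fixed at a bound
  repeat {
    # Neyman-type allocation of the remaining budget among active strata
    x[active] <- total_cost * A[active] / sum(A[active])
    violated <- active[check_violations(x[active], bounds[active])]
    if (length(violated) == 0L) break
    # Fix violating strata at their bounds and shrink the problem
    x[violated] <- bounds[violated]
    total_cost <- total_cost - sum(bounds[violated])
    active <- setdiff(active, violated)
    if (length(active) == 0L) break
  }
  x
}

A <- c(3000, 4000, 5000, 2000)
m <- c(50, 40, 10, 30)   # lower bounds
M <- c(100, 90, 70, 80)  # upper bounds

rna_sketch(220, A, M)        # upper bounds; stratum 3 is fixed at 70
rna_sketch(190, A, m, `<=`)  # lower bounds; strata 1 and 4 are fixed at 50 and 30
```

Passing the comparison operator as an argument mirrors the check_violations argument of rna(), so a single loop covers both the upper-bound and the lower-bound case.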

References

Wójciak, W. (2023). Another Solution of Some Optimum Allocation Problem. Statistics in Transition new series, 24(5) (in press). https://arxiv.org/abs/2204.04035

Wesołowski, J., Wieczorkowski, R., Wójciak, W. (2021). Optimality of the Recursive Neyman Allocation. Journal of Survey Statistics and Methodology, 10(5), pp. 1263–1275. doi:10.1093/jssam/smab018, doi:10.48550/arXiv.2105.14486

Wójciak, W. (2019). Optimal Allocation in Stratified Sampling Schemes. MSc Thesis, Warsaw University of Technology, Warsaw, Poland. http://home.elka.pw.edu.pl/~wwojciak/msc_optimal_allocation.pdf

Stenger, H., Gabler, S. (2005). Combining random sampling and census strategies - Justification of inclusion probabilities equal to 1. Metrika, 61(2), pp. 137–156. doi:10.1007/s001840400328

Särndal, C.-E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling, Springer, New York.

See Also

opt(), optcost(), rnabox().

Examples

A <- c(3000, 4000, 5000, 2000)
m <- c(50, 40, 10, 30) # lower bounds
M <- c(100, 90, 70, 80) # upper bounds

rna(total_cost = 190, A = A, bounds = M)
rna(total_cost = 190, A = A, bounds = m, check_violations = .Primitive("<="))
sga(total_cost = 190, A = A, M = M)
sgaplus(total_cost = 190, A = A, M = M)
coma(total_cost = 190, A = A, M = M)
