jipApprox (version 0.1.5)

jip_approx: Approximate Joint-Inclusion Probabilities

Description

Approximations of joint-inclusion probabilities by means of first-order inclusion probabilities.

Usage

jip_approx(pik, method)

Value

A symmetric matrix of joint-inclusion probabilities, whose diagonal is the vector of first-order inclusion probabilities.

Arguments

pik

numeric vector of first-order inclusion probabilities for all population units.

method

string representing one of the available approximation methods.

Details

Available methods are "Hajek", "HartleyRao", "Tille", "Brewer1", "Brewer2", "Brewer3", and "Brewer4". Note that these methods were derived for high-entropy sampling designs, so they may perform poorly under other designs.

The Hájek (1964) approximation [method="Hajek"] is derived under the maximum entropy sampling design and is given by

$$\tilde{\pi}_{ij} = \pi_i\pi_j \biggl( 1 - \frac{(1-\pi_i)(1-\pi_j)}{d} \biggr) $$ where \(d = \sum_{k\in U} \pi_k(1-\pi_k) \)
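As an illustration, the Hájek approximation can be sketched in a few lines of base R. The helper `hajek_sketch` is hypothetical and is not part of jipApprox; it reads the approximation as \(\pi_i\pi_j [1 - (1-\pi_i)(1-\pi_j)/d]\) and, following the Value section, places the first-order probabilities on the diagonal. It may differ from the package's internal implementation.

```r
# Hypothetical helper (not part of jipApprox): direct transcription of
# pi_i * pi_j * (1 - (1 - pi_i) * (1 - pi_j) / d), with d = sum(pi * (1 - pi)).
hajek_sketch <- function(pik) {
  d <- sum(pik * (1 - pik))
  jip <- outer(pik, pik) * (1 - outer(1 - pik, 1 - pik) / d)
  diag(jip) <- pik   # diagonal holds the first-order inclusion probabilities
  jip
}

pik <- c(0.2, 0.4, 0.6, 0.8)   # toy probabilities, summing to n = 2
round(hajek_sketch(pik), 4)
```

Using `outer()` builds all pairwise products at once, so no explicit loop over pairs of units is needed.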

Hartley and Rao (1962) proposed the following approximation under randomised systematic sampling [method="HartleyRao"]:

$$\tilde{\pi}_{ij} = \frac{n-1}{n} \pi_i\pi_j + \frac{n-1}{n^2} (\pi_i^2 \pi_j + \pi_i \pi_j^2) - \frac{n-1}{n^3}\pi_i\pi_j \sum_{k\in U} \pi_k^2$$

$$ + \frac{2(n-1)}{n^3} (\pi_i^3 \pi_j + \pi_i\pi_j^3 + \pi_i^2 \pi_j^2) - \frac{3(n-1)}{n^4} (\pi_i^2 \pi_j + \pi_i\pi_j^2) \sum_{k \in U}\pi_k^2$$

$$+ \frac{3(n-1)}{n^5} \pi_i\pi_j \biggl( \sum_{k\in U} \pi_k^2 \biggr)^2 - \frac{2(n-1)}{n^4} \pi_i\pi_j \sum_{k \in U} \pi_k^3 $$
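The seven terms of the Hartley and Rao expression can be transcribed one by one, which makes the structure easier to check. The sketch below is a hypothetical helper (`hartley_rao_sketch` is not a jipApprox function). It assumes a fixed-size design, so that \(n\) can be recovered as the sum of the first-order probabilities, and it treats the sums in the formula as population totals of \(\pi_k^2\) and \(\pi_k^3\).

```r
# Hypothetical helper (not part of jipApprox): term-by-term transcription
# of the Hartley-Rao approximation. Assumes sum(pik) equals the sample size n.
hartley_rao_sketch <- function(pik) {
  n  <- sum(pik)
  s2 <- sum(pik^2)                                  # sum over U of pi_k^2
  s3 <- sum(pik^3)                                  # sum over U of pi_k^3
  pp  <- outer(pik, pik)                            # pi_i * pi_j
  p21 <- outer(pik^2, pik) + outer(pik, pik^2)      # pi_i^2 pi_j + pi_i pi_j^2
  p31 <- outer(pik^3, pik) + outer(pik, pik^3) +
    outer(pik^2, pik^2)                             # cubic cross terms
  jip <- (n - 1) / n * pp +
    (n - 1) / n^2 * p21 -
    (n - 1) / n^3 * pp * s2 +
    2 * (n - 1) / n^3 * p31 -
    3 * (n - 1) / n^4 * p21 * s2 +
    3 * (n - 1) / n^5 * pp * s2^2 -
    2 * (n - 1) / n^4 * pp * s3
  diag(jip) <- pik   # diagonal holds the first-order inclusion probabilities
  jip
}

# Equal probabilities (N = 4, n = 2): the result can be compared with the
# simple random sampling value n(n-1) / (N(N-1)) = 1/6.
hartley_rao_sketch(rep(0.5, 4))
```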

Tillé (1996) proposed the approximation \(\tilde{\pi}_{ij} = \beta_i\beta_j\), where the coefficients \(\beta_i\) are computed iteratively through the following procedure [method="Tille"]:

  1. \(\beta_i^{(0)} = \pi_i, \,\, \forall i\in U\)

  2. \( \beta_i^{(2k-1)} = \frac{(n-1)\pi_i}{\beta^{(2k-2)} - \beta_i^{(2k-2)}} \)

  3. \(\beta_i^{(2k)} = \beta_i^{(2k-1)} \Biggl( \frac{n(n-1)}{(\beta^{(2k-1)})^2 - \sum_{j\in U} (\beta_j^{(2k-1)})^2 } \Biggr)^{1/2} \)

with \(\beta^{(k)} = \sum_{i\in U} \beta_i^{(k)}, \,\, k=0,1,2, \dots \)
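The iteration can be sketched as a simple loop that alternates steps 2 and 3. Everything below is illustrative: `tille_sketch`, the tolerance `tol`, the cap `max_iter`, and the stopping rule itself are assumptions made for this sketch, since the text above does not specify when to stop. As in the other formulas, \(n\) is taken to be the sum of the first-order probabilities.

```r
# Hypothetical helper (not part of jipApprox): alternate steps 2 and 3
# until the coefficients stop changing. tol and max_iter are assumptions.
tille_sketch <- function(pik, tol = 1e-8, max_iter = 1000) {
  n <- sum(pik)
  beta <- pik                                             # step 1
  for (k in seq_len(max_iter)) {
    beta_odd  <- (n - 1) * pik / (sum(beta) - beta)       # step 2
    beta_even <- beta_odd * sqrt(
      n * (n - 1) / (sum(beta_odd)^2 - sum(beta_odd^2))
    )                                                     # step 3
    converged <- max(abs(beta_even - beta)) < tol
    beta <- beta_even
    if (converged) break
  }
  jip <- outer(beta, beta)   # beta_i * beta_j for every pair
  diag(jip) <- pik           # diagonal holds the first-order probabilities
  jip
}

jip <- tille_sketch(c(0.2, 0.4, 0.6, 0.8))
sum(jip) - sum(diag(jip))   # off-diagonal total, normalised by step 3 to n(n-1)
```

Step 3 rescales the coefficients so that the off-diagonal entries sum to \(n(n-1)\), which is the total that exact joint-inclusion probabilities have under a fixed-size design.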

Finally, Brewer (2002) and Brewer and Donadio (2003) proposed four approximations, which are defined by the general form

$$\tilde{\pi}_{ij} = \pi_i\pi_j (c_i + c_j)/2 $$

where the \(c_i\) determine the approximation used:

  • Equation (9) [method="Brewer1"]: $$c_i = (n-1) / (n-\pi_i)$$

  • Equation (10) [method="Brewer2"]: $$c_i = (n-1) / \Bigl(n- n^{-1}\sum_{k\in U}\pi_k^2 \Bigr)$$

  • Equation (11) [method="Brewer3"]: $$c_i = (n-1) / \Bigl(n - 2\pi_i + n^{-1}\sum_{k\in U}\pi_k^2 \Bigr)$$

  • Equation (18) [method="Brewer4"]: $$c_i = (n-1) / \Bigl(n - (2n-1)(n-1)^{-1}\pi_i + (n-1)^{-1}\sum_{k\in U}\pi_k^2 \Bigr)$$
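Because the four Brewer variants share the same general form and differ only in \(c_i\), they can be sketched with a single function. The helper `brewer_sketch` and its `version` argument are hypothetical and not part of jipApprox; the sums are read as totals of \(\pi_k^2\) over the population, and \(n\) is assumed to equal the sum of the first-order probabilities.

```r
# Hypothetical helper (not part of jipApprox): general Brewer form
# pi_i * pi_j * (c_i + c_j) / 2, with c_i chosen by `version` (1 to 4).
brewer_sketch <- function(pik, version = 1) {
  n  <- sum(pik)
  s2 <- sum(pik^2)
  ci <- switch(as.character(version),
    "1" = (n - 1) / (n - pik),
    "2" = rep((n - 1) / (n - s2 / n), length(pik)),   # same c for all units
    "3" = (n - 1) / (n - 2 * pik + s2 / n),
    "4" = (n - 1) / (n - (2 * n - 1) / (n - 1) * pik + s2 / (n - 1)),
    stop("version must be 1, 2, 3 or 4")
  )
  jip <- outer(pik, pik) * outer(ci, ci, "+") / 2
  diag(jip) <- pik   # diagonal holds the first-order inclusion probabilities
  jip
}

# With equal probabilities (N = 4, n = 2) every variant gives c_i = 2/3,
# so all off-diagonal entries equal 0.25 * 2/3 = 1/6.
brewer_sketch(rep(0.5, 4), version = 4)
```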

References

Hartley, H.O.; Rao, J.N.K., 1962. Sampling With Unequal Probability and Without Replacement. The Annals of Mathematical Statistics 33 (2), 350-374.

Hájek, J., 1964. Asymptotic Theory of Rejective Sampling with Varying Probabilities from a Finite Population. The Annals of Mathematical Statistics 35 (4), 1491-1523.

Tillé, Y., 1996. Some Remarks on Unequal Probability Sampling Designs Without Replacement. Annals of Economics and Statistics 44, 177-189.

Brewer, K.R.W.; Donadio, M.E., 2003. The High Entropy Variance of the Horvitz-Thompson Estimator. Survey Methodology 29 (2), 189-196.

Examples

library(jipApprox)

### Generate population data ---
N <- 20   # population size
n <- 5    # sample size

set.seed(0)
x <- rgamma(N, scale = 10, shape = 5)         # auxiliary size variable
y <- abs(2 * x + 3.7 * sqrt(x) * rnorm(N))    # study variable (not used below)

pik <- n * x / sum(x)   # first-order inclusion probabilities, summing to n

### Approximate joint-inclusion probabilities ---
### (pikl is overwritten by each call)
pikl <- jip_approx(pik, method = 'Hajek')
pikl <- jip_approx(pik, method = 'HartleyRao')
pikl <- jip_approx(pik, method = 'Tille')
pikl <- jip_approx(pik, method = 'Brewer1')
pikl <- jip_approx(pik, method = 'Brewer2')
pikl <- jip_approx(pik, method = 'Brewer3')
pikl <- jip_approx(pik, method = 'Brewer4')


