TeachingSampling (version 4.1.1)

PikHol: Optimal Inclusion Probabilities Under Multi-purpose Sampling

Description

Computes the population vector of optimal inclusion probabilities under Holmberg's approach.

Usage

PikHol(n, sigma, e, Pi)

Arguments

n

Vector of optimal sample sizes for each of the characteristics of interest.

sigma

A matrix containing the size measures for each characteristic of interest.

e

Maximum allowed error under the ANOREL approach.

Pi

Matrix of first-order inclusion probabilities. By default, these probabilities are proportional to each sigma.
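As an illustration of what "proportional to each sigma" can mean, the sketch below builds such a matrix in base R. It is one plausible reading, not the package's confirmed default: the toy values of sigma and n are made up, and any special handling the package applies (for example, to probabilities above 1) is not reproduced here.

```r
# Toy size measures: N = 6 units, Q = 2 characteristics (made-up values)
sigma <- cbind(c(1, 2, 3, 4, 5, 6), c(2, 2, 4, 4, 8, 8))
n <- c(2, 3)  # target sample size for each characteristic

# Column q becomes n[q] * sigma[, q] / sum(sigma[, q])
Pi <- sweep(sigma, 2, n / colSums(sigma), "*")

colSums(Pi)  # each column sums to its own sample size
```

A matrix of this shape (one row per population unit, one column per characteristic) is what the Pi argument expects.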

Value

The function returns a vector of inclusion probabilities.

Details

Assuming that all of the characteristics of interest are equally important, Holmberg's sampling design yields the following inclusion probabilities
$$\pi_{(opt)k}=\frac{n^*\sqrt{a_{qk}}}{\sum_{k\in U}\sqrt{a_{qk}}}$$
where
$$n^*\geq \frac{(\sum_{k\in U}\sqrt{a_{qk}})^2}{(1+c)Q+\sum_{k\in U}a_{qk}}$$
and
$$a_{qk}= \sum_{q=1}^Q \frac{\sigma^2_{qk}}{\sum_{k\in U}\left( \frac{1}{\pi_{qk}}-1\right)\sigma^2_{qk}}$$
Note that \(\sigma^2_{qk}\) is a size measure associated with the k-th element in the q-th characteristic of interest.
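The formulas above can be transcribed into base R as follows. This is a minimal sketch, not the source of PikHol: the helper name holmberg_sketch is hypothetical, the constant c is assumed to be the maximum allowed error e, and n* is taken as the smallest integer satisfying the bound. Because the definition of a sums over q, the code stores one value per unit k in the vector a.

```r
# Hypothetical helper (not part of TeachingSampling): direct transcription
# of the formulas in Details.
#   sigma : N x Q matrix of size measures sigma_qk
#   Pi    : N x Q matrix of first-order inclusion probabilities pi_qk
#   c_err : the constant c, ASSUMED here to be the maximum allowed error e
holmberg_sketch <- function(sigma, Pi, c_err) {
  sigma2 <- sigma^2
  Q <- ncol(sigma2)
  # One denominator per characteristic q: sum_k (1/pi_qk - 1) * sigma^2_qk
  denom <- colSums((1 / Pi - 1) * sigma2)
  # a for unit k: sum over q of sigma^2_qk / denom_q
  a <- as.vector(sigma2 %*% (1 / denom))
  # Smallest integer n* satisfying the lower bound
  n_star <- ceiling(sum(sqrt(a))^2 / ((1 + c_err) * Q + sum(a)))
  # Optimal inclusion probabilities, proportional to sqrt(a)
  pik <- n_star * sqrt(a) / sum(sqrt(a))
  list(a = a, n_star = n_star, pik = pik)
}

# Toy population (simulated values, for illustration only)
set.seed(1)
N <- 50
x <- runif(N, 1, 10)
sigma <- cbind(sqrt(x), x)
n <- c(10, 15)
# Starting probabilities proportional to each column of sigma
Pi <- sweep(sigma, 2, n / colSums(sigma), "*")

out <- holmberg_sketch(sigma, Pi, c_err = 0.03)
out$n_star
sum(out$pik)  # equals n_star by construction
```

By construction the resulting probabilities sum to n*, which is why the examples recover the sample size from the sum of the returned vector.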

References

Holmberg, A. (2002), On the Choice of Sampling Design under GREG Estimation in Multiparameter Surveys. RD Department, Statistics Sweden.

Sarndal, C-E. and Swensson, B. and Wretman, J. (1992), Model Assisted Survey Sampling. Springer.

Gutierrez, H. A. (2009), Estrategias de muestreo: Diseno de encuestas y estimacion de parametros. Editorial Universidad Santo Tomas.

Examples

# NOT RUN {
#######################
#### First example ####
#######################

# Uses the Lucy data to draw an optimal sample
# in a multipurpose survey context
library(TeachingSampling)
data(Lucy)
attach(Lucy)
# Different sample sizes for two characteristics of interest: Employees and Taxes
N <- dim(Lucy)[1]
n <- c(350,400)
# Both size measures are built from Income,
# but each uses a different exponent
sigy1 <- sqrt(Income^(1))
sigy2 <- sqrt(Income^(2))
# The matrix containing the size measures for each characteristic of interest
sigma<-cbind(sigy1,sigy2)
# The vector of optimal inclusion probabilities under Holmberg's approach
Piks<-PikHol(n,sigma,0.03)
# The optimal sample size is given by the sum of Piks
n <- round(sum(Piks))
# Performing the S.piPS function in order to select the optimal sample of size n
res<-S.piPS(n,Piks)
sam <- res[,1]
# The information about the units in the sample is stored in an object called data
data <- Lucy[sam,]
attach(data)
names(data)
# Pik.s is the vector of inclusion probabilities of the units
# in the selected sample
Pik.s <- res[,2]
# The variables of interest are: Income, Employees and Taxes
# This information is stored in a data frame called estima
estima <- data.frame(Income, Employees, Taxes)
E.piPS(estima,Pik.s)

########################
#### Second example ####
########################

# We can define our own first-order inclusion probabilities
data(Lucy)
attach(Lucy)

N <- dim(Lucy)[1]
n <- c(350,400)

sigy1 <- sqrt(Income^(1))
sigy2 <- sqrt(Income^(2))
sigma<-cbind(sigy1,sigy2)
pikas <- cbind(rep(400/N, N), rep(400/N, N))

Piks<-PikHol(n,sigma,0.03, pikas)

n <- round(sum(Piks))
n

res<-S.piPS(n,Piks)
sam <- res[,1]

data <- Lucy[sam,]
attach(data)
names(data)

Pik.s <- res[,2]
estima <- data.frame(Income, Employees, Taxes)
E.piPS(estima,Pik.s)
# }
