HorvitzUB: Horvitz-UB model

Description

Computes the randomized response estimate, its estimated variance and its confidence interval through the Horvitz model (Horvitz et al., 1967, and Greenberg et al., 1969) when the proportion of people bearing the innocuous attribute is unknown. The function can also return the transformed variable. The Horvitz-UB model is described in Chaudhuri (2011, page 42).

Usage

HorvitzUB(I,J,p1,p2,pi,type=c("total","mean"),cl,N=NULL,pij=NULL)

Arguments

I

first vector of the observed variable; its length is equal to $n$ (the sample size)

J

second vector of the observed variable; its length is equal to $n$ (the sample size)

p1

proportion of marked cards with the sensitive attribute in the first box

p2

proportion of marked cards with the sensitive attribute in the second box

pi

vector of the first-order inclusion probabilities

type

the estimator type: total or mean

cl

confidence level

N

size of the population. By default it is NULL

pij

matrix of the second-order inclusion probabilities. By default it is NULL

Value

Point and confidence estimates of the sensitive characteristics using the Horvitz-UB model. The transformed variable is also reported, if required.

Details

In the Horvitz model, when the population proportion $\alpha$ of people bearing the innocuous attribute is not known, two independent samples are taken. Two boxes are filled with a large number of cards that are identical except for their marks: in the first box a proportion $p_1$ $(0<p_1<1)$ of the cards is marked $A$ and the complementary proportion $(1-p_1)$ is marked $B$, while in the second box these proportions are $p_2$ and $1-p_2$, with $p_2$ different from $p_1$. A sample is chosen and every person sampled is requested to draw one card at random from the first box and to repeat this independently with the second box. For the first box, the randomized response is $$I_i=\left\{\begin{array}{ll} 1 & \textrm{if the card type drawn "matches" the sensitive trait } A \textrm{ or the innocuous trait } B \\ 0 & \textrm{if there is "no match" for the first box} \end{array} \right.$$ and for the second box the randomized response is $$J_i=\left\{\begin{array}{ll} 1 & \textrm{if there is a "match" for the second box} \\ 0 & \textrm{if there is "no match" for the second box} \end{array} \right.$$ The transformed variable is $r_i=\frac{(1-p_2)I_i-(1-p_1)J_i}{p_1-p_2}$ and the estimated variance is $\widehat{V}_R(r_i)=r_i(r_i-1)$.
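A short derivation may help to see why this transformation is used. The notation $y_i$ (indicator that person $i$ bears the sensitive trait $A$), $x_i$ (indicator that person $i$ bears the innocuous trait $B$) and $E_R$ (expectation over the card draws) is introduced here for illustration only and does not appear in the function itself. Assuming a "match" means reporting $y_i$ when a card marked $A$ is drawn and $x_i$ when a card marked $B$ is drawn: $$E_R(I_i)=p_1y_i+(1-p_1)x_i, \qquad E_R(J_i)=p_2y_i+(1-p_2)x_i,$$ so that the terms in $x_i$ cancel: $$(1-p_2)E_R(I_i)-(1-p_1)E_R(J_i)=\left[p_1(1-p_2)-p_2(1-p_1)\right]y_i=(p_1-p_2)\,y_i.$$ Dividing by $p_1-p_2$ gives $E_R(r_i)=y_i$, which is why $p_2$ must differ from $p_1$. Because $y_i$ is binary, $y_i^2=y_i$, and therefore $$V_R(r_i)=E_R(r_i^2)-y_i^2=E_R(r_i^2)-E_R(r_i)=E_R\left[r_i(r_i-1)\right],$$ which is consistent with using $r_i(r_i-1)$ as the estimated variance.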

References

Chaudhuri, A. (2011). Randomized response and indirect questioning techniques in surveys. Boca Raton: Chapman and Hall, CRC Press.

Greenberg, B.G., Abul-Ela, A.L., Simmons, W.R., Horvitz, D.G. (1969). The unrelated question RR model: Theoretical framework. Journal of the American Statistical Association, 64, 520-539.

Horvitz, D.G., Shah, B.V., Simmons, W.R. (1967). The unrelated question RR model. Proceedings of the Social Statistics Section of the American Statistical Association. 65-72. Alexandria, VA: ASA.

Examples


# NOT RUN {
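library(RRTCS)   # package providing HorvitzUB() and HorvitzUBData
N <- 802         # population size
data(HorvitzUBData)
dat <- with(HorvitzUBData, data.frame(I, J, Pi))
p1 <- 0.6        # proportion of cards marked with the sensitive attribute, first box
p2 <- 0.7        # proportion of cards marked with the sensitive attribute, second box
cl <- 0.95       # confidence level
HorvitzUB(dat$I, dat$J, p1, p2, dat$Pi, "mean", cl, N)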
# }
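
The transformed variable from the Details section can also be computed by hand. The following is a minimal base R sketch on simulated data, not the package's implementation: the population size, sample size, trait prevalences and equal inclusion probabilities are all assumptions made for illustration, and the mean estimate shown is a plain Horvitz-Thompson-type sum that is only assumed to resemble what `type="mean"` reports.

```r
# Hypothetical illustration with simulated data; base R only.
set.seed(1)
N  <- 1000                   # assumed population size
n  <- 200                    # assumed sample size
p1 <- 0.6                    # proportion of cards marked A in the first box
p2 <- 0.7                    # proportion of cards marked A in the second box
pi_i <- rep(n / N, n)        # assumed equal first-order inclusion probabilities

y <- rbinom(n, 1, 0.3)       # sensitive trait A (never observed in a real survey)
x <- rbinom(n, 1, 0.5)       # innocuous trait B (never observed in a real survey)

# Each respondent draws one card from each box, independently.
card1 <- rbinom(n, 1, p1)    # 1 = card marked A drawn from the first box
card2 <- rbinom(n, 1, p2)    # 1 = card marked A drawn from the second box

# A "match" means the respondent bears the trait shown on the card drawn.
I <- ifelse(card1 == 1, y, x)
J <- ifelse(card2 == 1, y, x)

# Transformed variable and its estimated variance, as given in Details.
r <- ((1 - p2) * I - (1 - p1) * J) / (p1 - p2)
v <- r * (r - 1)

# Horvitz-Thompson-type estimate of the mean of the sensitive trait (assumed form).
mean_hat <- sum(r / pi_i) / N
```

When the two answers of a respondent coincide, $r_i$ equals the common answer and $r_i(r_i-1)$ is zero; when they differ, $r_i$ falls outside $[0,1]$ and the estimated variance is positive.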
