optimStrat (version 2.1)

expmse: Anticipated Mean Squared Error

Description

Compute the anticipated Mean Squared Error of five sampling strategies.

Usage

expmse(b, d, x, n, H, Rxy, estrato1 = NULL, estrato2 = NULL, st = 1:5,
   short = FALSE)

Arguments

b

a numeric vector of length two giving the true shapes of the trend and spread terms.

d

a numeric vector of length two giving the assumed shapes of the trend and spread terms.

x

a positive numeric vector giving the values of the auxiliary variable.

n

a positive integer indicating the desired sample size.

H

a positive integer less than or equal to length(x) giving the desired number of strata/poststrata. Ignored if estrato1 and estrato2 are given.

Rxy

a number giving the correlation between the auxiliary variable and the study variable.

estrato1

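a list giving the stratum of each element and the sample size per stratum. It defines the stratification for STSI--HT and the poststrata for \(\pi\)ps--pos and STSI--pos (see ‘Details’).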

estrato2

a list giving the stratum of each element and the sample size per stratum. It defines the stratification for STSI--reg and STSI--pos (see ‘Details’).

st

a numeric vector indicating the strategies for which the anticipated MSE is to be calculated (see ‘Details’).

short

logical. If FALSE (the default), a vector of length five is returned. If TRUE, only the strategies given by st are returned.

Value

If short=FALSE, a vector of length five is returned giving the anticipated MSE of the strategies given in st, with NA for the strategies not given in st. If short=TRUE, the NAs are omitted.

Details

The Anticipated Mean Squared Error of a sample of size n is computed for five sampling strategies (\(\pi\)ps--reg, STSI--reg, STSI--HT, \(\pi\)ps--pos and STSI--pos).

The strategies are defined under the assumption that the underlying superpopulation model is of the form $$Y_{k}=\delta_{0}+\delta_{1}x_{k}^{\delta_{2}}+\epsilon_{k}$$ with \(E\epsilon_{k}=0\), \(V\epsilon_{k}=\delta_{3}^{2}x_{k}^{2\delta_{4}}\) and \(Cov(\epsilon_{k},\epsilon_{l})=0\) for \(k \neq l\).

The true generating model, however, is of the form $$Y_{k}=\beta_{0}+\beta_{1}x_{k}^{\beta_{2}}+\epsilon_{k}$$ with \(E\epsilon_{k}=0\), \(V\epsilon_{k}=\beta_{3}^{2}x_{k}^{2\beta_{4}}\) and \(Cov(\epsilon_{k},\epsilon_{l})=0\) for \(k \neq l\).

The parameters \(\beta_2\) and \(\beta_4\) are given by b. The parameters \(\delta_2\) and \(\delta_4\) are given by d.
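
To make the role of b concrete, the following base R sketch simulates one population from the true generating model. It does not call expmse. The values of beta0, beta1 and beta3, the seed, and the use of normal errors are assumptions made only for illustration; the model itself fixes just the mean and variance of the errors.

```r
# Sketch only: simulate one finite population from the "true" model
#   Y_k = beta0 + beta1 * x_k^beta2 + eps_k,
#   Var(eps_k) = beta3^2 * x_k^(2 * beta4).
set.seed(1)                                  # arbitrary seed, for reproducibility
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

b <- c(1, 1)                                 # b = (beta2, beta4): trend and spread shapes
beta0 <- 0; beta1 <- 1; beta3 <- 2           # hypothetical values, chosen for illustration

trend  <- beta0 + beta1 * x^b[1]             # E(Y_k)
spread <- beta3 * x^b[2]                     # standard deviation of eps_k

# Normal errors are an assumption; only the first two moments are specified.
y <- trend + rnorm(length(x), mean = 0, sd = spread)

# Sample correlation between x and y, the kind of quantity Rxy describes.
cor(x, y)
```

Replacing b by d in the sketch gives a population from the assumed model instead.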

estrato1 and estrato2 are lists with two components (each with length length(x)): stratum indicates the stratum to which each element belongs and nh indicates the sample sizes to be selected in each stratum. They can be created via optiallo. estrato1 gives the stratification for STSI--HT and the poststrata for \(\pi\)ps--pos and STSI--pos; whereas estrato2 gives the stratification for STSI--reg and STSI--pos. If NULL, optiallo is used for defining H strata/poststrata.
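
The shape of such a list can be sketched in base R without calling optiallo. The stratification rule (equal-count strata on the ranks of x) and the proportional allocation below are hypothetical choices, not what optiallo does, and H = 5 is picked only so that the sizes divide evenly. Whether expmse accepts a hand-built list without further checks is an assumption; optiallo remains the documented way to create one.

```r
# Sketch only: a list with the two components described above,
# 'stratum' and 'nh', each of length length(x).
set.seed(1)                                  # arbitrary seed
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))
N <- length(x); n <- 500; H <- 5

# Hypothetical rule: H equal-count strata based on the ranks of x.
stratum <- ceiling(rank(x, ties.method = "first") * H / N)

# Proportional allocation: stratum sample size n * N_h / N.
Nh <- as.vector(table(stratum))              # stratum sizes N_h
nh_per_stratum <- n * Nh / N

# Expand to element level: each element carries its stratum's sample size.
nh <- nh_per_stratum[stratum]

estrato <- list(stratum = stratum, nh = nh)
str(estrato)
```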

st indicates which MSEs are to be calculated. If st contains 1, the anticipated MSE of \(\pi\)ps--reg is calculated; if it contains 2, the anticipated MSE of STSI--reg is calculated; and so on, following the order in which the strategies are listed above.
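
The positions selected by st, and the effect of short, can be pictured with a small base R sketch. The character labels are illustrative stand-ins for the numeric MSEs; whether the vector returned by expmse carries names is not stated here, so none are assumed.

```r
# Sketch only: which positions of the length-five result are filled for a given st.
strategies <- c("pips-reg", "STSI-reg", "STSI-HT", "pips-pos", "STSI-pos")
st <- 1:3

# short = FALSE: length five, NA where a strategy was not requested.
full <- rep(NA_character_, 5)
full[st] <- strategies[st]
full

# short = TRUE: the NA entries are dropped.
short_vec <- full[!is.na(full)]
short_vec
```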

References

Bueno, E. (2018). A Comparison of Stratified Simple Random Sampling and Probability Proportional-to-size Sampling. Research Report, Department of Statistics, Stockholm University 2018:6. http://gauss.stat.su.se/rr/RR2018_6.pdf.

See Also

optiallo for how to stratify an auxiliary variable and allocate the sample size; desmse for calculating the MSE of the five strategies.

Examples

# NOT RUN {
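library(optimStrat)

# Auxiliary variable: shifted, sorted gamma draws (strictly positive)
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

# All five strategies, with H = 6 strata/poststrata defined via optiallo
expmse(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9)
# Only strategies 1 to 3; the other two entries are NA
expmse(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9, st = 1:3)
# Same strategies, with the NAs omitted
expmse(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9, st = 1:3, short = TRUE)

# User-supplied strata and poststrata; H is ignored in this case
stratum <- optiallo(n = 500, x, H = 6)
poststratum <- optiallo(n = 500, x^1.5, H = 10)
expmse(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9,
   estrato1 = poststratum, estrato2 = stratum)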
# }