The function strata.bh stratifies a population given a set of boundaries. It calculates the stratum sample sizes and the anticipated coefficient of variation or relative root mean squared error.
strata.bh(x, bh, n = NULL, CV = NULL, Ls = 3, certain = NULL,
alloc = list(q1 = 0.5, q2 = 0, q3 = 0.5), takenone = 0,
bias.penalty = 1, takeall = 0, takeall.adjust = TRUE,
rh = rep(1, Ls), model = c("none", "loglinear", "linear",
"random"), model.control = list())
A vector containing the values of the stratification variable \(X\) for every unit in the population.
A vector of the \(L-1\) stratum boundaries \((b_1, b_2, \ldots, b_{L-1})\), where \(L\) is the total number of strata (excluding the certainty stratum, if any). Therefore, if takenone=0 then \(L\) = Ls, and if takenone=1 then \(L\) = Ls+1.
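As a rough illustration of how a boundary vector partitions the population, the base R sketch below assigns each unit to a stratum and counts the stratum sizes. It assumes left-closed intervals, i.e. stratum \(h\) holds the units with \(b_{h-1} \leq x < b_h\); this convention, the helper name assign_strata and the toy data are assumptions made for the example, not taken from the package.

```r
# Hypothetical helper, not part of the stratification package.
# Assumes stratum h contains the units with b_{h-1} <= x < b_h.
assign_strata <- function(x, bh) {
  # findInterval() returns 0 for x < bh[1], 1 for bh[1] <= x < bh[2], ...
  findInterval(x, bh) + 1L
}

x  <- c(2, 5, 9, 14, 20, 35, 60)   # toy stratification variable
bh <- c(10, 30)                    # L - 1 = 2 boundaries, so L = 3 strata

h  <- assign_strata(x, bh)                    # stratum index of each unit
Nh <- tabulate(h, nbins = length(bh) + 1L)    # population size per stratum
```

Using tabulate() with an explicit nbins keeps empty strata in the count vector, which table() would silently drop.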
A numeric: the target sample size. It has no default value. Either the argument n or the argument CV must be input.
A numeric: the target coefficient of variation, or the target relative root mean squared error if takenone=1. It has no default value. Either the argument CV or the argument n must be input.
A numeric: the number of sampled strata (take-none and certainty strata are not counted in Ls). The default is 3.
A vector giving the position, in the vector x, of the units that must be included in the sample (see stratification-package). By default certain is NULL, which means that no units are chosen a priori to be in the sample.
A list specifying the allocation scheme. The list must contain 3 numerics for the 3 exponents q1, q2 and q3 in the general allocation scheme (see stratification-package). The default is Neyman allocation (q1=q3=0.5 and q2=0).
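The role of the three exponents can be sketched in base R. The snippet assumes the general allocation scheme assigns stratum \(h\) a share proportional to \(N_h^{2 q_1} \bar{Y}_h^{2 q_2} S_h^{2 q_3}\); this form is an assumption made for illustration, and stratification-package remains the authoritative definition. The function name general_alloc is hypothetical.

```r
# Hypothetical helper, not part of the stratification package.
# Assumed form: share_h proportional to Nh^(2*q1) * meanh^(2*q2) * Sh^(2*q3),
# where Sh^2 = varh is the stratum variance of Y.
general_alloc <- function(Nh, meanh, varh, q1 = 0.5, q2 = 0, q3 = 0.5) {
  gamma <- Nh^(2 * q1) * meanh^(2 * q2) * varh^q3   # Sh^(2*q3) == varh^q3
  gamma / sum(gamma)                                # shares sum to 1
}

# With the default exponents the shares are proportional to Nh * Sh,
# which is Neyman allocation.
shares <- general_alloc(Nh = c(100, 50), meanh = c(10, 20), varh = c(4, 16))
```

Under the same assumed form, q1 = 0.5 with q2 = q3 = 0 gives shares proportional to \(N_h\), i.e. proportional allocation.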
A numeric: the number of take-none strata (0 or 1). The default is 0, i.e. no take-none stratum is included.
A numeric between 0 and 1 giving the penalty for the bias in the anticipated mean squared error (MSE) of the survey estimator (see stratification-package). This argument is relevant only if takenone=1. The default is 1.
A numeric: the number of take-all strata (one of {0, 1, ..., Ls-1}). The default is 0, i.e. no take-all stratum is included.
A logical. If TRUE (the default), when \(n_h > N_h\) for a take-some stratum, the takeall argument is increased by one and the allocation is carried out again. This is repeated until \(n_h \leq N_h\) for every take-some stratum. If FALSE, no adjustment is made. Note: in other functions of the package stratification, this adjustment is not optional; it is made automatically (see stratification-package).
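The adjustment loop described above can be sketched as follows for the case of a target sample size n. This is a simplified illustration, not the package's implementation: it assumes that the take-all strata are the last (highest) strata, that each take-all stratum contributes all of its \(N_h\) units, and that the remaining sample is split among the take-some strata in proportion to given allocation weights. The function name and its arguments are hypothetical.

```r
# Hypothetical sketch, not the package's code.
# Nh:    population size per stratum
# gamma: unnormalised allocation weights per stratum
# n:     target total sample size
allocate_with_takeall <- function(Nh, gamma, n, takeall = 0) {
  L <- length(Nh)
  repeat {
    some <- seq_len(L - takeall)          # take-some strata (the first ones)
    all_ <- setdiff(seq_len(L), some)     # take-all strata (the last ones)
    nh <- numeric(L)
    nh[all_] <- Nh[all_]                  # every unit is sampled
    # split what is left of n among the take-some strata
    nh[some] <- (n - sum(Nh[all_])) * gamma[some] / sum(gamma[some])
    # stop when no take-some stratum is over-allocated,
    # or when only one take-some stratum would remain
    if (all(nh[some] <= Nh[some]) || takeall >= L - 1) break
    takeall <- takeall + 1
  }
  list(nh = nh, takeall = takeall)
}

res <- allocate_with_takeall(Nh = c(100, 50, 5), gamma = c(1, 1, 2), n = 20)
```

In this toy case the first pass allocates 10 units to a stratum of only 5, so the top stratum becomes take-all and the remaining 15 units are reallocated among the other two strata.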
A vector giving the anticipated response rates in each of the Ls sampled strata. A single number can be given if the rates do not vary among strata. The default is 1 in each stratum.
A character string identifying the model used to describe the discrepancy between the stratification variable \(X\) and the survey variable \(Y\). It can be "none" if one assumes \(Y=X\), "loglinear" for the loglinear model with mortality, "linear" for the heteroscedastic linear model or "random" for the random replacement model (see stratification-package for a description of these models). The default is "none".
A list of model parameters (see stratification-package). The default values of the parameters correspond to the model \(Y=X\).
A vector of length \(L\) containing the population sizes \(N_h\), i.e. the number of units in each stratum.
A vector of length \(L\) containing the sample sizes \(n_h\), i.e. the number of units to sample in each stratum. See stratification-package for information about the rounding used to get these integer values.
The total sample size (sum(nh)).
A vector of length \(L\) containing the non-integer values of the sample sizes, obtained directly from applying the allocation rule (see stratification-package).
A vector giving statistics for the certainty stratum (see stratification-package). It contains Nc, the number of units chosen a priori to be in the sample, and meanc, the anticipated mean of \(Y\) for these units.
The final value of the criterion to optimize (either the total sample size \(n\) if a target CV was given or the RRMSE if a target n was given) calculated with the integer stratum sample sizes nh.
The final value of the criterion to optimize (either the total sample size \(n\) if a target CV was given or the RRMSE if a target n was given) calculated with the non-integer stratum sample sizes nhnonint.
A vector of length \(L\) containing the anticipated means of \(Y\) in each stratum.
A vector of length \(L\) containing the anticipated variances of \(Y\) in each stratum.
A numeric: the anticipated global mean value of \(Y\).
A numeric: the root mean squared error (or standard error if takenone=0) of the anticipated global mean of \(Y\). This is defined as the square root of: (bias.penalty × bias of the mean)^2 + variance of the mean.
A numeric: the anticipated relative root mean squared error (or coefficient of variation if takenone=0) for the mean of \(Y\), i.e. RMSE divided by mean.
A numeric: the anticipated relative bias of the estimator, i.e. (bias.penalty × bias of the mean) divided by mean. If takenone=0, this numeric is zero.
A numeric: the proportion of the MSE attributable to the bias of the estimator, i.e. (bias.penalty × bias of the mean)^2 divided by the MSE of the mean. If takenone=0, this numeric is zero.
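The four precision measures defined above are simple functions of the bias, the variance and the mean. The base R sketch below computes them from given inputs; the function name, its argument names and the names of the list elements are hypothetical, chosen for the example rather than taken from the package.

```r
# Hypothetical helper, not part of the stratification package.
# bias:      anticipated bias of the estimated mean (0 without a take-none stratum)
# var_mean:  anticipated variance of the estimated mean
# mean_y:    anticipated global mean of Y
precision_summary <- function(bias, var_mean, mean_y, bias.penalty = 1) {
  pen_bias <- bias.penalty * bias        # penalised bias
  mse      <- pen_bias^2 + var_mean      # anticipated MSE of the mean
  rmse     <- sqrt(mse)
  list(
    rmse          = rmse,                # root mean squared error
    rrmse         = rmse / mean_y,       # relative RMSE
    relative_bias = pen_bias / mean_y,   # penalised bias relative to the mean
    prop_bias_mse = pen_bias^2 / mse     # share of the MSE due to bias
  )
}

res <- precision_summary(bias = 3, var_mean = 16, mean_y = 100)
```

With bias = 0, as when takenone=0, the RMSE reduces to the standard error and the last two quantities are zero, which matches the descriptions above.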
A factor, having the same length as the input x, whose values are either 1, 2, ..., \(L\) or "certain". The value "certain" is given to units chosen a priori to be in the sample. This factor identifies, for each observation, the stratum to which it has been assigned.
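A factor of this shape can be built in base R as sketched below. The sketch assumes left-closed strata (\(b_{h-1} \leq x < b_h\)) and always includes a "certain" level, even when no unit is in the certainty stratum; both choices, like the function name stratum_id, are assumptions made for the example.

```r
# Hypothetical helper, not part of the stratification package.
# certain: positions, in x, of the units chosen a priori for the sample
stratum_id <- function(x, bh, certain = NULL) {
  L   <- length(bh) + 1
  lab <- as.character(findInterval(x, bh) + 1)    # stratum 1, ..., L
  if (length(certain) > 0) lab[certain] <- "certain"
  factor(lab, levels = c(seq_len(L), "certain"))
}

x  <- c(2, 5, 9, 14, 20, 35, 60)
id <- stratum_id(x, bh = c(10, 30), certain = 7)  # unit 7 is a certainty unit
table(id)                                         # units per stratum
```

Keeping the result as a factor with fixed levels means table() reports every stratum, including empty ones.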
The number of take-all strata in the final solution. Note: It is possible that \(n_h=N_h\) for non take-all strata because the condition for an automatic addition of a take-all stratum is \(n_h>N_h\).
The function call (object of class "call").
A character string that contains the system date and time when the function ended.
A list of all the argument values input to the function or set by default.
Baillargeon, S. and Rivest, L.-P. (2011). The construction of stratified designs in R with the package stratification. Survey Methodology, 37(1), 53-65.
print.strata, plot.strata, strata.cumrootf, strata.geo, strata.LH
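library(stratification)  # provides strata.geo, strata.bh and the USbanks data

# Geometric stratification, with the automatic take-all adjustment
adjust <- strata.geo(x = USbanks, CV = 0.01, Ls = 4, alloc = c(0.35, 0.35, 0))
adjust
adjust$nhnonint

# Same boundaries, but with the take-all adjustment switched off
noadjust <- strata.bh(x = USbanks, bh = adjust$bh, CV = 0.01, Ls = 4,
                      alloc = c(0.35, 0.35, 0), takeall = 0, takeall.adjust = FALSE)
noadjust
noadjust$nhnonint
# Without the adjustment for a take-all stratum, n is smaller than
# with the adjustment, but the target CV is not reached.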