Inactive replicates are also sometimes referred to as "dead replicates",
for example in Ash (2014). The purpose of adding inactive replicates
is to increase the number of columns of replicate weights without impacting
variance estimates. This can be useful, for example, when combining data
from a survey across multiple years, where different years use different numbers
of replicates, but a consistent number of replicates is desired in the combined
data file.
Suppose the initial replicate design has \(L\) replicates, with
respective constants \(c_k\) for \(k=1,\dots,L\) used to estimate variance
with the formula
$$v_{R} = \sum_{k=1}^L c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2$$
where \(\hat{T}_y\) is the estimate produced using the full-sample weights
and \(\hat{T}_y^{(k)}\) is the estimate from replicate \(k\).
Inactive replicates are simply replicates that are exactly equal to the full sample:
that is, the replicate \(k\) is called "inactive" if its vector of replicate
weights exactly equals the full-sample weights. In this case, when using the formula
above to estimate variances, these replicates contribute nothing to the variance estimate.
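To see why, suppose (using notation introduced here only for illustration) that replicates \(k=L+1,\dots,L'\) are inactive replicates appended to the original \(L\) replicates, so that \(\hat{T}_y^{(k)}=\hat{T}_y\) for each of them. Then
$$\sum_{k=1}^{L'} c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2 = \sum_{k=1}^{L} c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2 + \sum_{k=L+1}^{L'} c_k\left(\hat{T}_y-\hat{T}_y\right)^2 = v_{R}$$
regardless of which constants \(c_k\) are assigned to the inactive replicates.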
If the analyst uses the variant of the formula above where the full-sample estimate
\(\hat{T}_y\) is replaced by the average replicate estimate (i.e., \(L^{-1}\sum_{k=1}^{L}\hat{T}_y^{(k)}\)),
then variance estimates will generally differ before vs. after adding the inactive replicates,
because the added replicates change the average replicate estimate.
For this reason, it is strongly recommended to explicitly specify mse=TRUE
when creating a replicate design object in R with functions such as svrepdesign(),
as_bootstrap_design(), etc. If working with an already existing replicate design,
you can update the mse option to TRUE simply by using code such as
my_design$mse <- TRUE.
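The difference between the two variants of the formula can be checked numerically. The following is a minimal base R sketch, not code from any package: the replicate estimates, the constants, and the helper functions var_full_centered() and var_mean_centered() are all hypothetical and chosen only for illustration.

```r
# Hypothetical values; none of these numbers come from a real survey.
full_estimate <- 100                    # full-sample estimate
rep_estimates <- c(96, 103, 101, 98)    # estimates from L = 4 active replicates
rep_constants <- rep(0.25, 4)           # constants c_k (assumed equal here)

# Append two inactive replicates, whose estimates equal the full-sample estimate.
# The constants given to them are an arbitrary assumption.
n_inactive <- 2
rep_estimates_padded <- c(rep_estimates, rep(full_estimate, n_inactive))
rep_constants_padded <- c(rep_constants, rep(0.25, n_inactive))

# Variance centered on the full-sample estimate (the form used when mse = TRUE).
var_full_centered <- function(est, consts, full) {
  sum(consts * (est - full)^2)
}

# Variant centered on the average of the replicate estimates.
var_mean_centered <- function(est, consts) {
  sum(consts * (est - mean(est))^2)
}

v_full_before <- var_full_centered(rep_estimates, rep_constants, full_estimate)
v_full_after  <- var_full_centered(rep_estimates_padded, rep_constants_padded,
                                   full_estimate)
v_mean_before <- var_mean_centered(rep_estimates, rep_constants)
v_mean_after  <- var_mean_centered(rep_estimates_padded, rep_constants_padded)

print(c(before = v_full_before, after = v_full_after))
print(c(before = v_mean_before, after = v_mean_after))
```

In this sketch the full-sample-centered values are identical before and after padding, while the mean-centered values differ, because the two added replicates pull the average replicate estimate toward the full-sample estimate.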