variance_est(Y, H, PSU, w_final, N_h = NULL, fh_zero = FALSE, PSU_level = TRUE, PSU_sort = NULL, period = NULL, dataset = NULL, msg = "")
data.table or variable names as character, column numbers.
data.table or variable name as character, column number.
data.table or variable name as character, column number.
data.table or variable name as character, column number.
period is not NULL). If N_h = NULL and fh_zero = FALSE (default), N_h is estimated from the sample data as the sum of weights (w_final) in each stratum (and period, if period is not NULL). N_h is optional for a single-stage sampling design, since it will be estimated from the sample data. It is recommended for a multi-stage sampling design, since N_h cannot be correctly estimated from the sample data in that case. If N_h is not supplied for a multi-stage sampling design (for example, because this information is not available), it is advisable to set fh_zero = TRUE.
If period is NULL: a two-column matrix with one row per stratum. The first column should contain the stratum code; the second column, the number of primary sampling units in the population of each stratum.
If period is not NULL: a three-column matrix with one row per combination of stratum and period. The first column should contain the period; the second column, the stratum code; the third column, the number of primary sampling units in the population of each stratum and period.
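The default estimate of N_h described above can be sketched in base R. This is only an illustration under stated assumptions: the data values, the object names (N_h_est, n_h, f_h, N_h_table) and the column names are hypothetical, and the sampling fraction is taken to be the usual n_h / N_h. It is not the package's internal code.

```r
# Illustrative single-stage sample: 6 PSUs in two strata (made-up values)
H <- c("S1", "S1", "S1", "S2", "S2", "S2")
w_final <- c(10, 10, 10, 25, 25, 25)

# N_h estimated as the sum of final weights within each stratum
N_h_est <- tapply(w_final, H, sum)

# Number of sampled PSUs per stratum and the implied sampling fraction
# (assumption: f_h = n_h / N_h, the usual definition)
n_h <- tapply(w_final, H, length)
f_h <- n_h / N_h_est

# Two-column layout: stratum code, then population count of PSUs
# (column names here are illustrative only)
N_h_table <- data.frame(H = names(N_h_est), N_h = as.vector(N_h_est))
print(N_h_table)
```

In a multi-stage design the weights refer to final units rather than PSUs, which is why this estimate is not appropriate there and a known N_h (or fh_zero = TRUE) is advised.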
data.table or variable names as character, column numbers.
data.table.
data.table containing the values of the variance estimation by totals.
$$\hat{V} \left(\hat{\theta} \right)=\sum\limits_{h=1}^{H} \left(1-f_h \right) \frac{n_h}{n_{h}-1} \sum\limits_{i=1}^{n_h} \left( z_{hi\bullet}-\bar{z}_{h\bullet\bullet}\right)^2, $$
where
$z_{hi\bullet}=\sum\limits_{j=1}^{m_{hi}} \omega_{hij} z_{hij}$,
$\bar{z}_{h\bullet\bullet}=\frac{1}{n_h}\sum\limits_{i=1}^{n_h} z_{hi\bullet}$,
$f_h$ is the sampling fraction of PSUs within stratum $h$,
$h$ is the stratum number, with a total of $H$ strata,
$i$ is the primary sampling unit (PSU) number within stratum $h$, with a total of $n_h$ PSUs,
$j$ is the household number within PSU $i$ of stratum $h$, with a total of $m_{hi}$ households,
$\omega_{hij}$ is the sampling weight for household $j$ in PSU $i$ of stratum $h$,
$z_{hij}$ denotes the observed value of the analysis variable $z$ for household $j$ in PSU $i$ of stratum $h$.
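As a rough check on the formula above, it can be transcribed directly into base R. The helper below is a hypothetical sketch (the name var_total_sketch and its arguments are not part of the package). It assumes at least two PSUs per stratum and takes the sampling fractions as a named vector, defaulting to zero.

```r
# Hypothetical helper, not the package implementation.
# z: analysis variable per household, w: sampling weights,
# psu: PSU identifiers, h: stratum identifiers,
# f_h: named numeric vector of sampling fractions per stratum
#      (NULL means f_h = 0 in every stratum, i.e. no correction)
var_total_sketch <- function(z, w, psu, h, f_h = NULL) {
  # Weighted PSU totals z_{hi.} = sum_j w_hij * z_hij
  d <- aggregate(list(z_hi = w * z), by = list(h = h, psu = psu), FUN = sum)
  strata <- as.character(unique(d$h))
  if (is.null(f_h)) f_h <- setNames(rep(0, length(strata)), strata)
  v <- 0
  for (s in strata) {
    z_hi <- d$z_hi[d$h == s]
    n_h <- length(z_hi)  # assumed >= 2, otherwise the term is undefined
    # (1 - f_h) * n_h / (n_h - 1) * sum_i (z_hi. - mean_h)^2
    v <- v + (1 - f_h[[s]]) * n_h / (n_h - 1) * sum((z_hi - mean(z_hi))^2)
  }
  v
}

# One stratum, four single-household PSUs, all weights equal to 2
v0 <- var_total_sketch(z = 1:4, w = rep(2, 4), psu = 1:4, h = rep("A", 4))
v1 <- var_total_sketch(z = 1:4, w = rep(2, 4), psu = 1:4, h = rep("A", 4),
                       f_h = c(A = 0.5))
```

Here the PSU totals are 2, 4, 6, 8, so the sum of squared deviations is 20 and v0 equals 4/3 * 20 = 80/3. With f_h = 0.5 the result is halved.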
Guillaume Osier and Emilio Di Meglio. The linearisation approach implemented by Eurostat for the first wave of EU-SILC: what could be done from the second onwards? 2012
Eurostat Methodologies and Working papers, Standard error estimation for the EU-SILC indicators of poverty and social exclusion, 2013, URL http://ec.europa.eu/eurostat/documents/3859598/5927001/KS-RA-13-029-EN.PDF.
Yves G. Berger, Tim Goedeme, Guillaume Osier (2013). Handbook on standard error estimation and other related sampling issues in EU-SILC, URL https://ec.europa.eu/eurostat/cros/content/handbook-standard-error-estimation-and-other-related-sampling-issues-ver-29072013_en
Eurostat Methodologies and Working papers, Handbook on precision requirements and variance estimation for ESS household surveys, 2013, URL http://ec.europa.eu/eurostat/documents/3859598/5927001/KS-RA-13-029-EN.PDF.
domain, lin.ratio, linarpr, linarpt, lingini, lingini2, lingpg, linpoormed, linqsr, linrmpg, residual_est, vardom, vardomh, varpoord, variance_othstr
Ys <- rchisq(10, 3)
w <- rep(2, 10)
PSU <- seq_along(Ys)
H <- rep("Strata_1", 10)
# default (fh_zero = FALSE): the finite population correction (1 - f_h) is applied
variance_est(Y = Ys, H = H, PSU = PSU, w_final = w)
## Not run:
# # fh_zero = FALSE set explicitly: finite population correction (1 - f_h) applied
# variance_est(Y = Ys, H = H, PSU = PSU, w_final = w, fh_zero = FALSE)
#
# # fh_zero = TRUE: f_h is set to zero, so no finite population correction
# variance_est(Y = Ys, H = H, PSU = PSU, w_final = w, fh_zero = TRUE)
# ## End(Not run)