vardpoor (version 0.3.6)

variance_est: Variance estimation for sample surveys by the ultimate cluster method

Description

Computes variance estimates for sample surveys using the ultimate cluster method.

Usage

variance_est(Y, H, PSU, w_final, N_h=NULL, fh_zero=FALSE,
                    PSU_level=TRUE, period=NULL, dataset=NULL)

Arguments

Y
either a numeric data.frame, matrix, data.table with column names giving the variables of interest, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column
H
either 1 column data.frame, matrix, data.table with column name giving elements indicating the unit stratum, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset
PSU
either 1 column data.frame, matrix, data.table giving primary sampling unit, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column count) specifying the co
w_final
either a numeric vector, 1 column data.frame, matrix, data.table giving the final weights, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column count) sp
N_h
by default NULL; a matrix whose first column gives the stratum and whose second column gives the population total in each stratum.
fh_zero
by default FALSE, in which case fh is calculated as n_h divided by N_h in each stratum; if TRUE, fh is set to zero in every stratum.
PSU_level
by default TRUE; if PSU_level is TRUE, fh is calculated in each stratum as the count of PSUs in the sample (n_h) divided by the count of PSUs in the frame (N_h). If PSU_level is FALSE, fh is calculated in each stratum as division of count of units in sample (n_
period
optional; either a data.frame, matrix, data.table with column names giving different periods, or (if dataset is not NULL) character strings, integers or logical vectors (length is the same as 'dataset' column coun
dataset
an optional name of the individual dataset data.frame.

Value

  • a data.table containing the variance estimates of the totals.

Details

If we assume that $n_h \geq 2$ for all $h$, that is, two or more PSUs are selected from each stratum, then the variance of $\hat{\theta}$ can be estimated from the variation among the estimated PSU totals of the variable $Z$:

$$\hat{V} \left(\hat{\theta} \right)=\sum\limits_{h=1}^{H} \left(1-f_h \right) \frac{n_h}{n_{h}-1} \sum\limits_{i=1}^{n_h} \left( z_{hi\bullet}-\bar{z}_{h\bullet\bullet}\right)^2,$$

where

  • $z_{hi\bullet}=\sum\limits_{j=1}^{m_{hi}} w_{hij} z_{hij}$ is the weighted total of PSU $i$ in stratum $h$
  • $\bar{z}_{h\bullet\bullet}=\frac{1}{n_h} \sum\limits_{i=1}^{n_h} z_{hi\bullet}$ is the mean of the PSU totals in stratum $h$
  • $f_h$ is the sampling fraction of PSUs within stratum $h$
  • $h$ is the stratum number, with a total of $H$ strata
  • $i$ is the primary sampling unit (PSU) number within stratum $h$, with a total of $n_h$ PSUs
  • $j$ is the household number within cluster $i$ of stratum $h$, with a total of $m_{hi}$ households
  • $w_{hij}$ is the sampling weight for household $j$ in PSU $i$ of stratum $h$
  • $z_{hij}$ denotes the observed value of the analysis variable $z$ for household $j$ in PSU $i$ of stratum $h$
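The formula can be sketched in base R as follows. This is only an illustration of the estimator as written in this section, not the package's own implementation: the helper name `uc_variance` and its arguments are hypothetical, and `f_h` is assumed to be supplied as a named vector of sampling fractions (zero everywhere if omitted).

```r
# Hypothetical helper (not part of vardpoor): ultimate cluster variance of a total.
# z, w         : value and final weight for each household
# psu, stratum : PSU and stratum identifier for each household
# f_h          : named numeric vector of sampling fractions, one per stratum
#                (assumption: defaults to 0, i.e. no finite population correction)
uc_variance <- function(z, w, psu, stratum, f_h = NULL) {
  # Weighted PSU totals z_hi. = sum over households j of w_hij * z_hij
  psu_tot <- aggregate(data.frame(z_hi = w * z),
                       by = list(stratum = as.character(stratum),
                                 psu = as.character(psu)),
                       FUN = sum)
  strata <- unique(as.character(psu_tot$stratum))
  if (is.null(f_h)) f_h <- setNames(rep(0, length(strata)), strata)

  v <- 0
  for (h in strata) {
    z_hi <- psu_tot$z_hi[as.character(psu_tot$stratum) == h]
    n_h <- length(z_hi)
    stopifnot(n_h >= 2)  # the formula needs at least two PSUs per stratum
    # (1 - f_h) * n_h / (n_h - 1) * sum of squared deviations of PSU totals
    v <- v + (1 - f_h[[h]]) * n_h / (n_h - 1) * sum((z_hi - mean(z_hi))^2)
  }
  v
}

# One stratum, ten single-household PSUs, every weight equal to 2
v <- uc_variance(z = 1:10, w = rep(2, 10), psu = 1:10,
                 stratum = rep("Strata_1", 10))
print(v)
```

With a single stratum and $f_h = 0$, the result reduces to $n_h$ times the sample variance of the weighted PSU totals, which gives a quick way to check the sketch by hand.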

References

Eurostat Methodologies and Working papers, Standard error estimation for the EU-SILC indicators of poverty and social exclusion, 2013, URL http://ec.europa.eu/eurostat/documents/3859598/5927001/KS-RA-13-029-EN.PDF.

Yves G. Berger, Tim Goedeme, Guillaume Osier (2013). Handbook on standard error estimation and other related sampling issues in EU-SILC, URL http://www.cros-portal.eu/content/handbook-standard-error-estimation-and-other-related-sampling-issues-ver-29072013

Guillaume Osier and Emilio Di Meglio (2012). The linearisation approach implemented by Eurostat for the first wave of EU-SILC: what could be done from the second onwards?

See Also

domain, lin.ratio, linarpr, linarpt, lingini, lingini2, lingpg, linpoormed, linqsr, linrmpg, residual_est, vardom, vardomh, varpoord, variance_othstr

Examples

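library(vardpoor)

Ys <- rchisq(10, 3)        # variable of interest: 10 simulated values
w <- rep(2, 10)            # final weight of 2 for every unit
PSU <- seq_along(Ys)       # each unit is its own PSU
H <- rep("Strata_1", 10)   # all units belong to a single stratum
variance_est(Y=Ys, H=H, PSU=PSU, w_final=w)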