vardomh: Variance estimation for sample surveys in domain for household surveys by the ultimate cluster method

Description

Computes variance estimates in domains for household surveys.

Usage

vardomh(Y, H, PSU, w_final, ID_household,
               id = NULL, Dom = NULL, period = NULL,
               N_h = NULL, fh_zero = FALSE, PSU_level = TRUE,
               Z = NULL, dataset = NULL,
               X = NULL, periodX = NULL, X_ID_household = NULL,
               ind_gr = NULL, g = NULL, datasetX = NULL,
               q = rep(1, if (is.null(datasetX)) 
                           nrow(data.frame(X)) else nrow(datasetX)),
              confidence = .95,  outp_lin = FALSE,
              outp_res = FALSE)

Arguments

Y

Variables of interest. Object convertible to data.frame or variable names as character, column numbers or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset).

H

The unit stratum variable. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset).

PSU

Primary sampling unit variable. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset).

w_final

Weight variable. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset).

ID_household

Variable for household ID codes. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset).

id

Optional variable for unit ID codes. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset).

period

Optional variable for the survey periods. If supplied, the values for each period are computed independently. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of the vector has to be the same as the column count of dataset).

Dom

Optional variables used to define population domains. If supplied, values are calculated for each domain. An object convertible to data.frame or variable names as character vector, column numbers or logical vector (length of the vector has to be the same as the column count of dataset).

N_h

Optional. A matrix whose first column gives the stratum and whose second column gives the population total in each stratum.

fh_zero

FALSE by default, in which case fh is calculated as n_h divided by N_h in each stratum. If TRUE, fh is set to zero in each stratum.

PSU_level

TRUE by default. If PSU_level is TRUE, fh is calculated in each stratum as the count of PSUs in the sample (n_h) divided by the count of PSUs in the frame (N_h). If PSU_level is FALSE, fh is calculated in each stratum as the count of units in the sample (n_h) divided by the count of units in the frame (N_h).
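
As a rough illustration of the sampling fraction described for fh_zero and PSU_level, the sketch below computes fh = n_h / N_h per stratum at PSU level in base R. The toy data and the names sample_psu, N_h_frame and fh are hypothetical, and this is not the function's internal code.

```r
# Toy sample: one row per sampled PSU with its stratum (hypothetical data).
sample_psu <- data.frame(H = c("A", "A", "B", "B", "B"),
                         PSU = c(1, 2, 3, 4, 5))

# Frame totals in the two-column layout described for N_h:
# first column stratum, second column total (here: PSUs in the frame).
N_h_frame <- data.frame(H = c("A", "B"), N_h = c(10, 30))

# n_h: number of distinct PSUs sampled in each stratum.
n_h <- aggregate(PSU ~ H, data = sample_psu,
                 FUN = function(x) length(unique(x)))
names(n_h)[2] <- "n_h"

# Sampling fraction per stratum; a TRUE flag sets it to zero instead.
fh <- merge(n_h, N_h_frame, by = "H")
fh$fh <- fh$n_h / fh$N_h
fh_zero <- FALSE
if (fh_zero) fh$fh <- 0
print(fh)
```

Setting the hypothetical fh_zero flag to TRUE in this sketch drops the finite population correction, which mirrors what the argument description says.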

Z

Optional variables of denominator for ratio estimation. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of the vector has to be the same as the column count of dataset).

dataset

Optional survey data object convertible to data.frame.

X

Optional matrix of the auxiliary variables for the calibration estimator. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of the vector has to be the same as the column count of

periodX

Optional variable of the survey periods. If supplied, residual estimation of calibration is done independently for each time period. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of t

X_ID_household

ind_gr

Optional grouping variable by which the auxiliary variables are divided independently. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of

g

Optional variable of the g weights. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as t

datasetX

Optional survey data object in household level convertible to data.frame.

q

Variable of the positive values accounting for heteroscedasticity. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the

confidence

Optional positive value giving the confidence level of the confidence interval. The default is 0.95.

outp_lin

Logical value. If TRUE, the linearized values of the ratio estimator are output (see lin_out under Value).
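
To make "linearized values of the ratio estimator" more concrete, here is a minimal base R sketch of the textbook linearization of a ratio R = Y / Z. The toy vectors y, z and w are hypothetical, and whether the function uses exactly this form internally is an assumption.

```r
# Hypothetical toy data: y is the numerator variable, z the denominator
# variable, w the final weight.
y <- c(10, 20, 30, 40)
z <- c(2, 4, 5, 9)
w <- c(1.5, 2.0, 1.0, 2.5)

# Weighted totals and the ratio estimate R_hat = Y_hat / Z_hat.
Y_hat <- sum(w * y)
Z_hat <- sum(w * z)
R_hat <- Y_hat / Z_hat

# Linearized variable of the ratio: u_k = (y_k - R_hat * z_k) / Z_hat.
u <- (y - R_hat * z) / Z_hat

# By construction the weighted total of u is zero (up to rounding error).
print(sum(w * u))
```

The variance of the ratio is then approximated by the variance of the weighted total of u, which is why linearized values are the natural input to a total-based variance formula.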

outp_res

Logical value. If TRUE, the estimated residuals of calibration are output (see res_out under Value).
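
The "estimated residuals of calibration" can be pictured as residuals from a weighted regression of the study variable on the auxiliary variables. The base R sketch below shows that idea under stated assumptions: the toy data are hypothetical, the regression weights are taken as w * q, and the function's own residual computation may differ in detail.

```r
# Hypothetical toy data: y is the (linearized) study variable, X the
# auxiliary variables (with an intercept column), w the design weights
# and q the positive heteroscedasticity factors.
y <- c(12, 15, 9, 20, 17)
X <- cbind(1, c(1, 2, 1, 3, 2))
w <- c(2, 2, 3, 1, 2)
q <- rep(1, length(y))

# Weighted least-squares coefficients:
# B = (sum_k w_k q_k x_k x_k')^{-1} sum_k w_k q_k x_k y_k
wq <- w * q
B <- solve(t(X) %*% (wq * X), t(X) %*% (wq * y))

# Residuals of the regression of y on the auxiliary variables.
e <- as.vector(y - X %*% B)

# The wq-weighted residuals are orthogonal to every column of X.
print(t(X) %*% (wq * e))
```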

Value

The function returns a list with the following objects:

lin_out

A data.table containing the linearized values of the ratio estimator with id and PSU.

res_out

A data.table containing the estimated residuals of calibration with id and PSU.

all_result

A data.table containing the following variables:
respondent_count - the count of respondents,
pop_size - the estimated size of population,
n_nonzero - the count of respondents whose answers are larger than zero,
estim - the estimated value,
var - the estimated variance,
se - the estimated standard error,
rse - the estimated relative standard error (coefficient of variation),
cv - the estimated relative standard error (coefficient of variation) in percentage,
absolute_margin_of_error - the estimated absolute margin of error,
relative_margin_of_error - the estimated relative margin of error,
CI_lower - the estimated confidence interval lower bound,
CI_upper - the estimated confidence interval upper bound,
var_srs_HT - the estimated variance of the HT estimator under SRS,
var_cur_HT - the estimated variance of the HT estimator under current design,
var_srs_ca - the estimated variance of the calibrated estimator under SRS,
deff_sam - the estimated design effect of sample design,
deff_est - the estimated design effect of estimator,
deff - the overall estimated design effect of sample design and estimator,
n_eff - the effective sample size.
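
The precision measures listed for all_result are related by simple arithmetic. The sketch below shows the usual relationships in base R for one hypothetical estimate; the numbers are invented, and the use of a normal quantile and of a percentage scale for the relative margin of error are assumptions rather than documented behaviour.

```r
# Hypothetical point estimate and variance for one domain.
estim <- 2500
var_est <- 400
confidence <- 0.95

se  <- sqrt(var_est)   # standard error
rse <- se / estim      # relative standard error
cv  <- 100 * rse       # relative standard error in percent

# Normal quantile for a two-sided interval (assumed convention).
z <- qnorm(0.5 * (1 + confidence))
absolute_margin_of_error <- z * se
relative_margin_of_error <- z * cv   # in percent (assumed scale)
CI_lower <- estim - absolute_margin_of_error
CI_upper <- estim + absolute_margin_of_error

print(c(se = se, cv = cv, CI_lower = CI_lower, CI_upper = CI_upper))
```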

Details

Calculates variance estimates in domains for household surveys by the ultimate cluster method, following Hansen, Hurwitz and Madow (1953).
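
For orientation, the ultimate cluster idea can be sketched in a few lines of base R: weighted values are summed to PSU totals, and the variance is built from the spread of those totals within each stratum. The toy data frame d and all object names are hypothetical, the sampling fraction is fixed at zero for simplicity, and this is a textbook-style sketch, not the function's implementation.

```r
# Hypothetical toy sample: stratum H, PSU, weight w and study variable y.
d <- data.frame(H   = c("A", "A", "A", "A", "B", "B", "B"),
                PSU = c(1, 1, 2, 3, 4, 5, 5),
                w   = c(2, 2, 3, 3, 4, 5, 5),
                y   = c(10, 12, 8, 15, 20, 18, 22))
d$wy <- d$w * d$y

# Step 1: weighted totals per PSU within stratum (the "ultimate clusters").
psu_tot <- aggregate(wy ~ H + PSU, data = d, FUN = sum)

# Step 2: per stratum, (1 - f_h) * n_h / (n_h - 1) times the sum of squared
# deviations of the PSU totals from their stratum mean; f_h = 0 here.
f_h <- 0
var_h <- sapply(split(psu_tot$wy, psu_tot$H), function(tot) {
  n_h <- length(tot)
  (1 - f_h) * n_h / (n_h - 1) * sum((tot - mean(tot))^2)
})

# Step 3: sum the stratum contributions.
var_est <- sum(var_h)
print(var_est)
```

Each stratum needs at least two sampled PSUs in this sketch, since n_h - 1 appears in the denominator.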

References

Morris H. Hansen, William N. Hurwitz, William G. Madow (1953). Sample survey methods and theory, Volume I: Methods and applications, 257-258, Wiley.

Guillaume Osier and Emilio Di Meglio (2012). The linearisation approach implemented by Eurostat for the first wave of EU-SILC: what could be done from the second wave onwards?

Guillaume Osier, Yves Berger, Tim Goedeme (2013). Standard error estimation for the EU-SILC indicators of poverty and social exclusion, Eurostat Methodologies and Working papers, URL http://ec.europa.eu/eurostat/documents/3888793/5855973/KS-RA-13-024-EN.PDF.

Eurostat Methodologies and Working papers (2013). Handbook on precision requirements and variance estimation for ESS household surveys, URL http://ec.europa.eu/eurostat/documents/3859598/5927001/KS-RA-13-029-EN.PDF.

Yves G. Berger, Tim Goedeme, Guillaume Osier (2013). Handbook on standard error estimation and other related sampling issues in EU-SILC, URL http://www.cros-portal.eu/content/handbook-standard-error-estimation-and-other-related-sampling-issues-ver-29072013.

Jean-Claude Deville (1999). Variance estimation for complex statistics and estimators: linearization and residual techniques. Survey Methodology, 25, 193-203, URL http://www5.statcan.gc.ca/bsolc/olc-cel/olc-cel?lang=eng&catno=12-001-X19990024882.

Examples

# Assumes the vardpoor package provides vardomh(); the eusilc data set
# ships with laeken and copy() comes from data.table.
library(vardpoor)
library(laeken)
library(data.table)

data(eusilc)
dataset <- data.frame(1:nrow(eusilc), eusilc)
colnames(dataset)[1] <- "IDd"

# Single period. q is left at its default: an explicit expression that
# refers to datasetX or H would be evaluated in the calling environment,
# where those objects do not exist.
aa <- vardomh(Y = "eqIncome", H = "db040", PSU = "db030", w_final = "rb050",
              ID_household = "db030", id = "rb030", Dom = "db040",
              period = NULL, N_h = NULL, Z = NULL, dataset = dataset, X = NULL,
              X_ID_household = NULL, g = NULL, datasetX = NULL,
              confidence = .95, outp_lin = TRUE, outp_res = TRUE)

# Two periods: duplicate the data and label each copy with its period.
dataset2 <- copy(dataset)
dataset$period <- 1
dataset2$period <- 2
dataset <- data.frame(rbind(dataset, dataset2))
aa2 <- vardomh(Y = "eqIncome", H = "db040", PSU = "db030", w_final = "rb050",
               ID_household = "db030", id = "rb030", Dom = "db040",
               period = "period", N_h = NULL, Z = NULL, dataset = dataset,
               X = NULL, X_ID_household = NULL,
               g = NULL, datasetX = NULL,
               confidence = .95, outp_lin = TRUE, outp_res = TRUE)
