vardchanges: Variance estimation for measures of change for single and multistage cluster sampling designs

Description

Computes variance estimates for measures of change for single and multistage cluster sampling designs.
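
As a rough illustration of why a dedicated estimator is useful, the sketch below assumes the measure of change is the simple difference between two period estimates. The variances and the correlation `rho` are hypothetical values, not output of vardchanges. The point is only that the variance of a difference depends on the covariance between the two periods, which is generally non-zero when the samples overlap.

```r
# Hypothetical variances of the estimates for two periods (illustration only)
var1 <- 0.0004   # variance of the period 1 estimate
var2 <- 0.0005   # variance of the period 2 estimate
rho  <- 0.6      # assumed correlation between the two estimates

# Var(theta2 - theta1) = Var(theta1) + Var(theta2) - 2 * Cov(theta1, theta2)
cov12      <- rho * sqrt(var1 * var2)
var_change <- var1 + var2 - 2 * cov12

# Treating the periods as independent overstates the variance when rho > 0
var_indep <- var1 + var2
c(var_change = var_change, var_indep = var_indep)
```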

Usage

vardchanges(Y, H, PSU, w_final, id,
            Dom = NULL, Z = NULL,
            country, periods, dataset = NULL,
            period1, period2,
            linratio = FALSE,
            use.estVar = FALSE,
            confidence=0.95)

Arguments

Y

Variables of interest. Object convertible to data.frame or variable names as character, column numbers or logical vector with only one TRUE value (length of the vector has to be the same as the column count of dataset

H

The unit stratum variable. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column

PSU

Primary sampling unit variable. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the c

w_final

Weight variable. One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (length of the vector has to be the same as the column count of

id

Variable for unit ID codes (for household surveys - secondary unit id number). One dimensional object convertible to one-column data.frame or variable name as character, column number or logical vector with only one TRUE value (l

Dom

Optional variables used to define population domains. If supplied, variables are calculated for each domain. An object convertible to data.frame or variable names as character vector, column numbers or logical vector (length of the vector has

Z

Optional variables of denominator for ratio estimation. If supplied, the ratio estimation is computed. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of the vector has to be the same

country

Variable for the survey countries. The values for each country are computed independently. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of the vector has to be the same as the colum

periods

Variable for all survey periods. The values for each period are computed independently. Object convertible to data.frame or variable names as character, column numbers or logical vector (length of the vector has to be the same as the colu

dataset

Optional survey data object convertible to data.frame.

period1

A vector holding one row of the periods variable, identifying the first period.

period2

A vector holding one row of the periods variable, identifying the second period.

linratio

Logical value. If TRUE, the linearized variables for the ratio estimator are used for variance estimation. If FALSE, the gradients are used for variance estimation.
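
For readers unfamiliar with linearization, the following sketch shows the textbook linearized variable for a ratio R = Y/Z of two weighted totals, u_k = (y_k - R z_k) / Z. It uses made-up data and base R only. It illustrates the general technique and is not a copy of what vardchanges does internally.

```r
# Made-up sample: numerator y, denominator z, and design weights w
y <- c(1, 0, 1, 1, 0)
z <- c(1, 1, 1, 1, 1)
w <- c(10, 20, 15, 30, 25)

# Weighted totals and the ratio estimate
Y_hat <- sum(w * y)
Z_hat <- sum(w * z)
R_hat <- Y_hat / Z_hat

# Textbook linearized variable for a ratio: u_k = (y_k - R * z_k) / Z
u <- (y - R_hat * z) / Z_hat

# The weighted total of u is zero (up to rounding) by construction
sum(w * u)
```

The variance of the ratio estimate is then approximated by the variance of the weighted total of `u` under the sampling design.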

use.estVar

Logical value. If TRUE, the R function estVar is used to estimate the covariance matrix of the residuals. If FALSE, estVar is not used.

confidence

Optional positive value for the confidence level of the confidence intervals. The default is 0.95.
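
A minimal sketch of how a confidence level is typically turned into interval bounds, assuming a normal approximation. The estimate and standard error are hypothetical, and whether vardchanges uses exactly this quantile internally is an assumption here.

```r
confidence <- 0.95
estim <- 0.12   # hypothetical estimate
se    <- 0.02   # hypothetical standard error

# Two-sided normal quantile for the chosen confidence level
z <- qnorm(0.5 * (1 + confidence))

# Symmetric interval around the estimate
CI_lower <- estim - z * se
CI_upper <- estim + z * se
round(c(z = z, CI_lower = CI_lower, CI_upper = CI_upper), 4)
```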

Value

A list with the following objects is returned by the function:

crossectional_results

A data.table containing: period - survey periods, country - survey countries, sample_size - the sample size (in numbers of individuals), pop_size - the population size (in numbers of individuals), total - the estimated totals, variance - the estimated variance of cross-sectional or longitudinal measures, sd_w - the estimated weighted variance of simple random sample, sd_nw - the estimated variance of simple random sample, pop - the population size (in numbers of households), sampl_siz - the sample size (in numbers of households), stderr_w - the estimated weighted standard error of simple random sample, stderr_nw - the estimated standard error of simple random sample, se - the estimated standard error of cross-sectional or longitudinal measures, rse - the estimated relative standard error (coefficient of variation), cv - the estimated relative standard error (coefficient of variation) in percentage, absolute_margin_of_error - the estimated absolute margin of error, relative_margin_of_error - the estimated relative margin of error, CI_lower - the estimated confidence interval lower bound, CI_upper - the estimated confidence interval upper bound.

changes_results

A data.table containing: estim - the estimated value, var - the estimated variance, se - the estimated standard error, rse - the estimated relative standard error (coefficient of variation), cv - the estimated relative standard error (coefficient of variation) in percentage, absolute_margin_of_error - the estimated absolute margin of error, relative_margin_of_error - the estimated relative margin of error, CI_lower - the estimated confidence interval lower bound, CI_upper - the estimated confidence interval upper bound.
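
The precision columns are related to each other in the usual way. The sketch below recomputes three of them for a hypothetical estimate and variance, assuming the standard definitions (se as the square root of the variance, rse as se divided by the estimate, cv as rse in percent). It does not call vardchanges.

```r
estim <- 0.25      # hypothetical estimated value
var   <- 0.0004    # hypothetical estimated variance

se  <- sqrt(var)        # standard error
rse <- se / estim       # relative standard error
cv  <- 100 * rse        # the same quantity expressed in percent

data.frame(estim, var, se, rse, cv)
```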

References

Guillaume Osier, Yves Berger, Tim Goedeme (2013). Standard error estimation for the EU-SILC indicators of poverty and social exclusion, Eurostat Methodologies and Working papers, URL http://ec.europa.eu/eurostat/documents/3888793/5855973/KS-RA-13-024-EN.PDF.

Eurostat Methodologies and Working papers (2013). Handbook on precision requirements and variance estimation for ESS household surveys, URL http://ec.europa.eu/eurostat/documents/3859598/5927001/KS-RA-13-029-EN.PDF.

Yves G. Berger, Tim Goedeme, Guillaume Osier (2013). Handbook on standard error estimation and other related sampling issues in EU-SILC, URL http://www.cros-portal.eu/content/handbook-standard-error-estimation-and-other-related-sampling-issues-ver-29072013

Examples

### Example 

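library("data.table")
library("laeken")    # provides the eusilc data set
library("vardpoor")  # provides vardchanges()

data("eusilc")
eusilc1 <- eusilc[1:40,]

# Fix the seed so the random PSU and poverty flags are reproducible
set.seed(1)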
data <- data.table(rbind(eusilc1, eusilc1),
                   year=c(rep(2010, nrow(eusilc1)),
                          rep(2011, nrow(eusilc1))),
                   country=c(rep("AT", nrow(eusilc1)),
                             rep("AT", nrow(eusilc1))))
data[age<0, age:=0]
PSU <- data[,.N, keyby="db030"]
PSU[, N:=NULL]
PSU[, PSU:=trunc(runif(nrow(PSU), 0, 5))]
setkeyv(PSU, "db030")
setkeyv(data, "db030")
data <- merge(data, PSU, all=TRUE)
PSU <- eusilc <- NULL
data[, strata:=c("XXXX")]
data[, strata:=as.character(strata)]

data[, t_pov:=trunc(runif(nrow(data), 0, 2))]
data[, exp:= 1]

# At-risk-of-poverty (AROP)
data[, pov:= ifelse (t_pov == 1, 1, 0)]
 
result <- vardchanges(Y="pov",
                   H="strata", PSU="PSU", w_final="rb050",
                   id="db030", Dom=NULL, Z=NULL,
                   country="country", periods="year",
                   dataset=data,
                   period1=2010, period2=2011,
                   use.estVar=FALSE)
 
# Reload the full data set, because eusilc was set to NULL above
data("eusilc")
data <- data.table(rbind(eusilc, eusilc),
                   year=c(rep(2010, nrow(eusilc)),
                          rep(2011, nrow(eusilc))),
                   country=c(rep("AT", nrow(eusilc)),
                             rep("AT", nrow(eusilc))))
data[age<0, age:=0]
PSU <- data[,.N, keyby="db030"]
PSU[, N:=NULL]
PSU[, PSU:=trunc(runif(nrow(PSU), 0, 100))]
setkeyv(PSU, "db030")
setkeyv(data, "db030")
data <- merge(data, PSU, all=TRUE)
PSU <- eusilc <- NULL
data[, strata:=c("XXXX")]
data[, strata:=as.character(strata)]

data[, t_pov:=trunc(runif(nrow(data), 0, 2))]
data[, t_dep:=trunc(runif(nrow(data), 0, 2))]
data[, t_lwi:=trunc(runif(nrow(data), 0, 2))]
data[, exp:= 1]
data[, exp2:= 1 * (age < 60)]

# At-risk-of-poverty (AROP)
data[, pov:= ifelse (t_pov == 1, 1, 0)]
 
# Severe material deprivation (DEP)
data[, dep:= ifelse (t_dep == 1, 1, 0)]

# Low work intensity (LWI)
data[, lwi:= ifelse (t_lwi == 1 & exp2 == 1, 1, 0)]

# At-risk-of-poverty or social exclusion (AROPE)
data[, arope:= ifelse (pov == 1 | dep == 1 | lwi == 1, 1, 0)]
data[, dom:=1]

result <- vardchanges(Y=c("pov", "dep", "lwi", "arope"),
                   H="strata", PSU="PSU", w_final="rb050",
                   id="db030", Dom="rb090", Z=NULL,
                   country="country", periods="year",
                   dataset=data,
                   period1=2010, period2=2011,
                   use.estVar=FALSE)
