vardpoor (version 0.2.0.8.1)

varpoord: Estimation of the variance and deff for sample surveys for indicators on social exclusion and poverty

Description

Estimates the variance and the design effect (deff) of indicators on social exclusion and poverty for sample surveys.

Usage

varpoord(inc, w_final, income_thres = NULL, wght_thres = NULL,
                 ID_household, id = NULL, H, PSU, N_h, fh_zero = FALSE,
                 PSU_level=TRUE, sort = NULL, Dom = NULL, period = NULL,
                 gender = NULL, dataset = NULL, X = NULL, periodX = NULL,
                 X_ID_household = NULL, ind_gr = NULL, g = NULL, datasetX = NULL,
                 q, percentage = 60, order_quant = 50, alpha = 20,
                 confidence = 0.95, outp_lin = FALSE, outp_res = FALSE,
                 na.rm = FALSE, several.ok = FALSE, type = "lin_rmpg")

Arguments

inc
either a numeric vector, 1 column data.frame, matrix, data.table giving the equivalized disposable income, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset
w_final
optional; either a numeric vector, 1 column data.frame, matrix, data.table giving the personal sample weights, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'da
income_thres
either a numeric vector, 1 column data.frame, matrix, data.table giving the equivalized disposable income for computation and linearization of the poverty threshold, or (if dataset is not NULL) a character string, an
wght_thres
either a numeric vector, 1 column data.frame, matrix, data.table giving the personal sample weights for computation and linearization of the poverty threshold, or (if dataset is not NULL) a character string, an intege
ID_household
either 1 column data.frame, matrix, data.table with column names giving the household IDs, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column count)
id
optional; either 1 column data.frame, matrix, data.table with column names giving the personal IDs, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' colum
H
either 1 column data.frame, matrix, data.table with column name giving elements indicating the unit stratum, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset
PSU
either 1 column data.frame, matrix, data.table giving primary sampling unit, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column count) specifying the co
N_h
a matrix whose first column gives the stratum and whose second column gives the population total in each stratum.
fh_zero
by default FALSE, in which case fh is calculated as n_h divided by N_h in each stratum; if TRUE, fh is set to zero in every stratum.
PSU_level
by default TRUE; if PSU_level is true, in each strata fh is calculated as division of count of PSU in sample (n_h) and count of PSU in frame(N_h). if PSU_level is false, in each strata fh is calculated as division of count of units in sample (n_
sort
optional; either a numeric vector, 1 column data.frame, matrix, data.table giving the personal IDs to be used as tie-breakers for sorting, or (if dataset is not NULL) a character string, an integer or a logical vector
Dom
optional; either a data.frame, matrix, data.table with column names giving different domains, or (if dataset is not NULL) character strings, integers or a logical vectors (length is the same as 'dataset' column count) sp
period
optional; either a data.frame, matrix, data.table with column names giving different periods, or (if dataset is not NULL) character strings, integers or a logical vectors (length is the same as 'dataset' column coun
gender
either a factor giving the gender, or (if dataset is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column count) specifying the corresponding column of dataset
dataset
optional; the name of the individual dataset (a data.frame).
X
optional; either a data.frame, matrix, data.table giving auxiliary variables, or (if datasetX is not NULL) character strings, integers or a logical vectors (length is the same as 'dataset' column count) specifying the co
periodX
optional; either a data.frame, matrix, data.table with column names giving different periods for data X, or (if datasetX is not NULL) character strings, integers or a logical vectors (length is the same as 'dataset'
X_ID_household
either 1 column data.frame, matrix, data.table with column name giving the household IDs for auxiliary variables, or (if datasetX is not NULL) a character string, an integer or a logical vector (length is the same as
ind_gr
optional; either a vector, 1 column data.frame, matrix, data.table giving the variable by which divided independently auxiliary variables, or (if datasetX is not NULL) a character string, an integer or a logical vector
g
optional; either a numeric vector, 1 column data.frame, matrix, data.table giving the g weights, or (if datasetX is not NULL) a character string, an integer or a logical vector (length is the same as 'dataset' column co
datasetX
optional; the name of the individual dataset (a data.frame).
q
optional; either a numeric vector, 1 column data.frame, matrix, data.table giving the positive values accounting for heteroscedasticity, or (if datasetX is not NULL) a character string, an integer or a logical vector (l
percentage
a numeric value in $[0,100]$ giving the percentage of the income quantile to be used for the at-risk-of-poverty threshold (see linarpt).
order_quant
a numeric value in $[0,100]$ giving the order of the income quantile (in percentage) to be used for the at-risk-of-poverty threshold (see linarpt).
alpha
a numeric value in $[0,100]$ giving the order of the income quantile share ratio (in percentage).
confidence
optional; a positive value giving the confidence level of the confidence interval. The default is 0.95.
outp_lin
logical. If TRUE, the linearized values are returned in the output.
outp_res
logical. If TRUE, the estimated residuals of calibration are returned in the output.
na.rm
a logical indicating whether missing values should be removed.
several.ok
a logical specifying whether type is allowed to have more than one element.
type
a character vector (of length one unless several.ok is TRUE) naming the indicator(s) to estimate, for example "linarpr", "linarpt", "lingpg", "linpoormed", "linrmpg", "lingini", "lingini2", "linqsr", "all_choises".
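
The sampling fraction fh described under fh_zero and PSU_level can be sketched in base R as follows. This is only an illustration on a made-up sample, not the package's internal code: the data frames smp and frame and their columns are hypothetical, and the unit-level case assumes that the denominator is the matching unit count in the frame.

```r
# Hypothetical toy sample: one row per sampled unit
smp <- data.frame(stratum = c("A", "A", "A", "B", "B"),
                  psu     = c(1, 1, 2, 3, 4),
                  stringsAsFactors = FALSE)

# Hypothetical frame counts per stratum (PSUs and units)
frame <- data.frame(stratum = c("A", "B"),
                    N_psu   = c(10, 8),
                    N_unit  = c(60, 40),
                    stringsAsFactors = FALSE)

# PSU_level = TRUE analogue: n_h is the number of distinct sampled PSUs
n_psu <- tapply(smp$psu, smp$stratum, function(x) length(unique(x)))
fh_psu <- as.numeric(n_psu[frame$stratum]) / frame$N_psu

# PSU_level = FALSE analogue: n_h is the number of sampled units
n_unit <- tapply(smp$psu, smp$stratum, length)
fh_unit <- as.numeric(n_unit[frame$stratum]) / frame$N_unit

# fh_zero = TRUE analogue: fh is zero in every stratum, which amounts to
# ignoring the finite population correction 1 - fh
fh_zero <- rep(0, nrow(frame))
```

Keeping the PSU counts and unit counts in separate columns makes it explicit that n_h and N_h must refer to the same kind of object in each case.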

Value

The function returns a list with the following components:

  • estim: a data.frame containing the estimation(s) by domain, or (if Dom is NULL) totals.
  • var: a matrix containing the values of the variance estimation by domains or (if Dom is NULL) totals.
  • se: a matrix containing the values of the standard error by domains or (if Dom is NULL) totals.
  • rse: a data.frame containing the values of the relative standard error (coefficient of variation) by domains or (if Dom is NULL) totals, in percentage.
  • cv: a data.frame containing the values of the relative standard error (coefficient of variation) by domains or (if Dom is NULL) totals.
  • absolute_margin_of_error: a matrix containing the values of the absolute margin of error by domains or (if Dom is NULL) totals.
  • relative_margin_of_error: a matrix containing the values of the relative margin of error by domains or (if Dom is NULL) totals.
  • CI_lower: a data.frame containing the values of the confidence interval lower bound by domains or (if Dom is NULL) totals.
  • CI_upper: a data.frame containing the values of the confidence interval upper bound by domains or (if Dom is NULL) totals.
  • var_srs_HT: a matrix containing the values of the variance estimation of the HT estimator under SRS by domains or (if Dom is NULL) totals.
  • var_cur_HT: a matrix containing the values of the variance estimation of the HT estimator under the current design by domains or (if Dom is NULL) totals.
  • var_srs_ca: a matrix containing the values of the variance estimation of the calibrated estimator under SRS by domains or (if Dom is NULL) totals.
  • deff_sam: a matrix containing the values of the estimated design effect of the sample design by domains or (if Dom is NULL) totals.
  • deff_est: a matrix containing the values of the estimated design effect of the estimator by domains or (if Dom is NULL) totals.
  • deff: a matrix containing the values of the estimated overall design effect of the sample design and estimator by domains or (if Dom is NULL) totals.
  • lin_out: a data.table containing the linearized values with ID_household and id.
  • res_out: a data.table containing the estimated residuals of calibration with id and PSU.
  • all_result: a data.frame containing all previously defined values together by domains or (if Dom is NULL) totals.
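
The precision measures listed above are related to the variance estimate by standard textbook formulas. The following base R sketch shows those relations on made-up numbers. It assumes a normal-quantile confidence interval and the classical definition of the design effect as a ratio of variances; the package may compute or scale some of these quantities differently, and all values and variable names here are hypothetical.

```r
# Hypothetical point estimate and variance estimate for one domain
estim      <- 26.5
var_est    <- 0.04
confidence <- 0.95

se  <- sqrt(var_est)   # standard error
cv  <- se / estim      # coefficient of variation as a ratio
rse <- 100 * cv        # the same quantity in percentage

# Normal quantile for a two-sided interval (an assumption of this sketch)
z <- qnorm((1 + confidence) / 2)

abs_moe  <- z * se              # absolute margin of error
rel_moe  <- abs_moe / estim     # relative margin of error as a ratio
CI_lower <- estim - abs_moe
CI_upper <- estim + abs_moe

# Classical design effect of the sample design: variance under the current
# design divided by variance under simple random sampling (made-up values)
var_cur_HT <- 0.05
var_srs_HT <- 0.025
deff_sam   <- var_cur_HT / var_srs_HT
```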

References

Yves G. Berger, Tim Goedeme, Guillaume Osier (2013). Handbook on standard error estimation and other related sampling issues in EU-SILC, URL http://www.cros-portal.eu/content/handbook-standard-error-estimation-and-other-related-sampling-issues-ver-29072013

Working group on Statistics on Income and Living Conditions (2004). Common cross-sectional EU indicators based on EU-SILC; the gender pay gap. EU-SILC 131-rev/04, Eurostat.

Guillaume Osier (2009). Variance estimation for complex indicators of poverty and inequality. Journal of the European Survey Research Association, Vol. 3, No. 3, pp. 167-195, ISSN 1864-3361, URL https://ojs.ub.uni-konstanz.de/srm/article/view/369.

Matti Langel, Yves Tillé (2011). Corrado Gini, a pioneer in balanced sampling and inequality theory. METRON - International Journal of Statistics, vol. LXIX, n. 1, pp. 45-65, URL ftp://metron.sta.uniroma1.it/RePEc/articoli/2011-1-3.pdf.

Deville, J. C. (1999). Variance estimation for complex statistics and estimators: linearization and residual techniques. Survey Methodology, 25, 193-203, URL http://www5.statcan.gc.ca/bsolc/olc-cel/olc-cel?lang=eng&catno=12-001-X19990024882.

See Also

vardom, vardomh, linarpt

Examples

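# Attach the package; the eusilc data set is assumed to be available
# once vardpoor is attached (it ships with the laeken package)
library(vardpoor)
data(eusilc)

# Add a running row identifier as the first column
dataset <- data.frame(1:nrow(eusilc), eusilc)
colnames(dataset)[1] <- "IDd"

# Variance estimation for the Gini coefficient by region (db040)
aa <- varpoord("eqIncome", "rb050", income_thres = NULL,
               wght_thres = NULL, ID_household = "db030",
               id = NULL, H = "db040",
               PSU = "rb030", N_h = NULL, sort = NULL,
               Dom = "db040", gender = NULL, X = NULL,
               X_ID_household = NULL,
               g = NULL,
               datasetX = NULL,
               # one q value per row of dataset; the original expression
               # referred to H and datasetX, which do not exist as objects
               # in the calling environment
               q = rep(1, nrow(dataset)),
               dataset = dataset, percentage = 60, order_quant = 50,
               alpha = 20, confidence = .95, outp_lin = TRUE,
               outp_res = TRUE, na.rm = FALSE,
               several.ok = FALSE, type = "lingini")

# Inspect rows 20 to 40 of the linearized values and calibration residuals
aa$lin_out[20:40]
aa$res_out[20:40]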
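
The arguments percentage = 60 and order_quant = 50 used in the call above mean that the at-risk-of-poverty threshold is 60% of the weighted median income. The base R sketch below illustrates that idea on made-up data. It uses a simple weighted quantile without interpolation, which is an assumption of this sketch; the quantile rule used by linarpt may differ, and the vectors inc and w are hypothetical.

```r
# Hypothetical equivalized incomes and personal weights
inc <- c(10, 20, 30, 40, 50)
w   <- c(1, 1, 2, 1, 1)

percentage  <- 60   # share of the quantile used as the threshold
order_quant <- 50   # order of the quantile (50 = median)

# Weighted quantile: smallest income whose cumulative weight share
# reaches order_quant percent (no interpolation)
ord       <- order(inc)
cum_share <- cumsum(w[ord]) / sum(w)
quant     <- inc[ord][which(cum_share >= order_quant / 100)[1]]

# At-risk-of-poverty threshold and the weighted share of persons below it
threshold <- percentage / 100 * quant
arpr      <- 100 * sum(w[inc < threshold]) / sum(w)
```

With these toy values the weighted median is 30, so the threshold is 18 and only the person with income 10 falls below it.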
