bootParFuture: Parametric bootstrap estimators of prediction accuracy - parallel computing.

Description

The function computes parametric bootstrap estimates of the RMSE and QAPE prediction accuracy measures using parallel computing.

Usage

bootParFuture(predictor, B, p)

Value

estQAPE: estimated value/s of QAPE - number of rows is equal to the number of orders of quantiles to be considered (declared in p), number of columns is equal to the number of predicted characteristics (declared in thetaFun).
estRMSE: estimated value/s of RMSE (more than one value is computed if in thetaFun more than one population characteristic is defined).
summary: estimated accuracy measures for the predictor of characteristics defined in thetaFun.
predictorSim: bootstrapped values of the predictor/s.
thetaSim: bootstrapped values of the predicted population or subpopulation characteristic/s.
Ysim: simulated values of the (possibly transformed) variable of interest.
error: differences between bootstrapped values of the predictor/s and bootstrapped values of the predicted characteristic/s.
positiveDefiniteEstG: logical indicating if the estimated covariance matrix of random effects, used to generate bootstrap realizations of the dependent variable, is positive definite.
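
The relation between the returned components can be sketched in base R. The matrices below are hypothetical stand-ins rather than output of bootParFuture, and the orientation (one row per predicted characteristic, one column per bootstrap iteration) is an assumption made for illustration.

```r
# Stand-in bootstrap output (hypothetical numbers, not produced by bootParFuture).
# Assumed layout: rows = predicted characteristics, columns = bootstrap iterations.
set.seed(1)
B <- 1000                 # number of bootstrap iterations
p <- c(0.75, 0.9)         # orders of quantiles for the QAPE
thetaSim     <- matrix(rnorm(2 * B, mean = 100, sd = 10), nrow = 2)
predictorSim <- thetaSim + matrix(rnorm(2 * B, sd = 5), nrow = 2)

# error: bootstrapped predictor minus bootstrapped characteristic
error <- predictorSim - thetaSim

# estRMSE: square root of the mean of squared bootstrap errors,
# one value per characteristic
estRMSE <- sqrt(rowMeans(error^2))

# estQAPE: quantiles of absolute bootstrap errors,
# one row per order in p, one column per characteristic
estQAPE <- apply(abs(error), 1, quantile, probs = p)

dim(estQAPE)
```

With two orders in p and two characteristics, estQAPE is a 2 x 2 matrix, which matches the layout described for estQAPE above.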

Arguments

predictor: one of the following objects: EBLUP, ebpLMMne or plugInLMM.
B: number of iterations in the bootstrap procedure.
p: orders of quantiles in the QAPE.
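
Because the B bootstrap iterations are independent of each other, they can be distributed over several workers. The following is a generic sketch of that pattern with future.apply, not the internal code of bootParFuture. The toy function one_iteration, the choice of the multisession backend and the number of workers are assumptions made for illustration.

```r
library(future)
library(future.apply)

# Assumed backend: two background R sessions; other future plans work as well.
plan(multisession, workers = 2)

B <- 200  # number of bootstrap iterations

# Hypothetical per-iteration task: one bootstrap error from a toy model.
one_iteration <- function(b) {
  theta_star     <- rnorm(1, mean = 100, sd = 10)   # simulated characteristic
  predictor_star <- theta_star + rnorm(1, sd = 5)   # simulated predictor
  predictor_star - theta_star                       # bootstrap error
}

# future.seed = TRUE requests parallel-safe random number streams
# (results are reproducible when set.seed() is called beforehand).
set.seed(123)
error <- future_sapply(seq_len(B), one_iteration, future.seed = TRUE)

plan(sequential)  # release the workers

sqrt(mean(error^2))  # RMSE estimate from the toy bootstrap errors
```

Larger values of B reduce the simulation error of the estimated accuracy measures at the cost of computing time, which is where spreading iterations across workers helps.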

Author

Alicja Wolny-Dominiak, Tomasz Zadlo

Details

We use the bootstrap model presented by Chatterjee, Lahiri and Li (2008), p. 1229, but assumed for all population elements. Vectors of random effects and random components are generated from the multivariate normal distribution, with model parameters replaced by their REML estimates. Random effects are generated for all population elements, even for subsets with zero sample sizes (for which random effects are not estimated). We use the MSE estimator defined as the mean of squared bootstrap errors, considered by Rao and Molina (2015), p. 141, and given by equation (6.2.22); the reported RMSE estimate is its square root. The QAPE of order p is a quantile of the absolute prediction error, which means that at least p·100% of realizations of absolute prediction errors are smaller than or equal to the QAPE. It is estimated as a quantile of absolute bootstrap errors, as proposed by Zadlo (2017) in Section 2. Parallel processing is performed via the future.apply package.
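
The generation step described above can be illustrated for a simple random-intercept model. This is a minimal base-R sketch under stated assumptions, not the package's implementation: the population, the grouping variable and the parameter values (standing in for REML estimates) are hypothetical, and because random effects and random components are independent in this simple model, the multivariate normal draws reduce to independent univariate normal draws.

```r
set.seed(42)

# Hypothetical population of 9 elements in 3 groups;
# assume group "C" has no sampled elements.
group <- rep(c("A", "B", "C"), times = c(4, 3, 2))
x     <- c(1.2, 0.7, 2.1, 1.5, 0.9, 1.8, 1.1, 2.4, 0.6)
N     <- length(group)

# Stand-ins for REML estimates (assumed values, nothing is fitted here).
beta    <- c(2.0, 0.5)  # fixed effects: intercept and slope
sigma_v <- 0.4          # standard deviation of random effects
sigma_e <- 0.3          # standard deviation of random components

# One bootstrap realization for ALL population elements:
# a random effect is drawn for every group, including group "C",
# for which no random effect could be estimated from the sample.
groups <- unique(group)
v_star <- setNames(rnorm(length(groups), sd = sigma_v), groups)
e_star <- rnorm(N, sd = sigma_e)
y_star <- beta[1] + beta[2] * x + v_star[group] + e_star

round(unname(y_star), 3)
```

Repeating this step B times, recomputing the predictor from the sampled part of each realization and the predicted characteristic from the whole realization, yields the bootstrap errors on which the RMSE and QAPE estimators are based.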

References

1. Butar, B. F., Lahiri, P. (2003) On measures of uncertainty of empirical Bayes small-area estimators, Journal of Statistical Planning and Inference, Vol. 112, pp. 63-76.

2. Chatterjee, S., Lahiri, P., Li, H. (2008) Parametric bootstrap approximation to the distribution of EBLUP and related prediction intervals in linear mixed models, Annals of Statistics, Vol. 36 (3), pp. 1221-1245.

3. Rao, J.N.K. and Molina, I. (2015) Small Area Estimation. Second edition, John Wiley & Sons, New Jersey.

4. Zadlo, T. (2017) On asymmetry of prediction errors in small area estimation, Statistics in Transition, Vol. 18 (3), pp. 413-432.

Examples


library(qape) # provides invData, plugInLMM and bootParFuture
library(lme4)
library(Matrix)
library(mvtnorm)
library(matrixcalc) 
library(future.apply)


data(invData) 
# data from one period are considered: 
invData2018 <- invData[invData$year == 2018,] 
attach(invData2018)

N <- nrow(invData2018) # population size

con <- rep(1,N) 
con[c(379,380)] <- 0 # last two population elements are not observed 

YS <- log(investments[con == 1]) # log-transformed values
backTrans <- function(x) exp(x) # back-transformation of the variable of interest
fixed.part <- 'log(newly_registered)'
random.part <- '(1|NUTS2)'

reg <- invData2018[, -which(names(invData2018) == 'investments')]
weights <- rep(1,N) # homoscedastic random components

# Characteristics to be predicted:
# values of the variable for last two population elements  
thetaFun <- function(x) {x[c(379,380)]}
set.seed(123)

predictor <- plugInLMM(YS, fixed.part, random.part, reg, con, weights, backTrans, thetaFun)
predictor$thetaP

### Estimation of prediction accuracy
est_accuracy <- bootParFuture(predictor, 10, c(0.75,0.9))

# Estimation of prediction RMSE
est_accuracy$estRMSE

# Estimation of prediction QAPE
est_accuracy$estQAPE

#         [,1]     [,2]
# 75% 1370.823 180.0514
# 90% 1477.444 249.7517

####### Interpretations in case of prediction of investments 
####### for population element no. 379:
### It is estimated that at least 75% of absolute prediction errors are
# smaller than or equal to 1370.823 million Polish zloty
# and at least 25% of absolute prediction errors are
# greater than or equal to 1370.823 million Polish zloty. 
### It is estimated that at least 90% of absolute prediction errors are
# smaller than or equal to 1477.444 million Polish zloty
# and at least 10% of absolute prediction errors are
# greater than or equal to 1477.444 million Polish zloty. 

detach(invData2018)
