loc_est: Local linear frontier estimator

Description

Computes the local linear smoothing frontier estimator of Hall, Park and Stern (1998) and Hall and Park (2004).

Usage

loc_est(xtab, ytab, x, h, method="u", control = list("tm_limit" = 700))

Arguments

xtab

a numeric vector containing the observed inputs $x_1,\ldots,x_n$.

ytab

a numeric vector of the same length as xtab containing the observed outputs $y_1,\ldots,y_n$.

x

a numeric vector of evaluation points at which the estimator is to be computed.

h

determines the bandwidth at which the local linear estimate will be computed.

method

a character equal to "u" (unconstrained estimator) or "m" (improved version of the unconstrained estimator).

control

a list of parameters to the GLPK solver. See *Details* of help(Rglpk_solve_LP).

Value

Returns a numeric vector with the same length as x. Returns a vector of NA if no solution has been found by the solver (GLPK).

Details

In the unconstrained case (option method="u"), the implemented estimator of $\varphi(x)$ is defined by $$ \hat \varphi_{n,LL}(x) = \min \Big\{ z : {\rm there~exists~} \theta ~{\rm such~that~} y_i \leq z + \theta(x_i - x) ~{\rm for~all}~i~{\rm such~that~}x_i \in (x-h,x+h) \Big\}, $$ where the bandwidth $h$ has to be fixed by the user in the 4th argument of the function. This estimator may lack smoothness in small samples and is not guaranteed to be monotone even if the true frontier is. Following the curvature of the monotone frontier $\varphi$, the unconstrained estimator $\hat \varphi_{n,LL}$ is likely to exhibit substantial bias, especially at the sample boundaries (see Daouia et al (2016) for numerical illustrations).

A simple way to remedy this drawback is to impose the extra condition $\theta \geq 0$ in the definition of $\hat \varphi_{n,LL}(x)$, which gives $$ \tilde \varphi_{n,LL}(x) = \min \Big\{ z : {\rm there~exists~} \theta\geq 0 ~{\rm such~that~} y_i \leq z + \theta(x_i - x) ~{\rm for~all}~i~{\rm such~that~}x_i \in (x-h,x+h) \Big\}. $$ As shown in Daouia et al (2016), this version only reduces the vexing bias and border defects of the original estimator when the true frontier is monotone. The option method="m" indicates that the improved fit $\tilde \varphi_{n,LL}(x)$ should be used in place of $\hat \varphi_{n,LL}(x)$.

Hall and Park (2004) proposed a bootstrap procedure for selecting the optimal bandwidth $h$ in $\hat \varphi_{n,LL}(x)$ and $\tilde \varphi_{n,LL}(x)$ (see the function loc_est_bw).
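To make the two definitions concrete, here is a minimal base-R sketch that evaluates them at a single point by brute force. It is an illustration only and not the package's implementation, which relies on the GLPK solver. The helper name ll_frontier_at and its monotone argument are hypothetical. The sketch uses the fact that, for a fixed slope $\theta$, the smallest feasible $z$ is $\max_i \{y_i - \theta(x_i - x)\}$ over the points in the window, so the minimum over $\theta$ is reached either where two of these lines cross or, in the constrained case, possibly at $\theta = 0$. Returning NA when the minimum is unbounded is a choice made for this sketch.

```r
# Hypothetical helper, not part of the package: brute-force evaluation of the
# local linear frontier at a single point x0 (monotone = TRUE imposes theta >= 0).
ll_frontier_at <- function(xtab, ytab, x0, h, monotone = FALSE) {
  keep <- abs(xtab - x0) < h            # points with x_i in (x0 - h, x0 + h)
  d <- xtab[keep] - x0
  y <- ytab[keep]
  if (length(d) == 0L) return(NA_real_)

  # The minimum is finite only if the window does not lie entirely on one
  # side of x0 (for theta >= 0: not entirely to the right of x0).
  bounded <- if (monotone) any(d <= 0) else any(d <= 0) && any(d >= 0)
  if (!bounded) return(NA_real_)

  # Candidate slopes: crossing points of the lines theta -> y_i - theta * d_i,
  # plus theta = 0 (always feasible, and the boundary of the constrained case).
  theta <- outer(y, y, "-") / outer(d, d, "-")
  cand <- c(0, theta[is.finite(theta)])
  if (monotone) cand <- cand[cand >= 0]

  # For each candidate slope, the smallest feasible z; keep the overall minimum.
  z <- vapply(cand, function(t) max(y - t * d), numeric(1))
  min(z)
}

x <- c(0, 1, 2)
ll_frontier_at(x, c(0, 1, 2), x0 = 1, h = 1.5)                   # 1
ll_frontier_at(x, c(2, 1, 0), x0 = 1, h = 1.5)                   # 1
ll_frontier_at(x, c(2, 1, 0), x0 = 1, h = 1.5, monotone = TRUE)  # 2
```

In the last two calls the data are decreasing, so the unconstrained fit follows the negative slope while the constrained fit is forced to a flat line at the largest output in the window. The pairwise enumeration costs on the order of $m^2$ evaluations for $m$ points in the window, which is acceptable for illustration but not a substitute for a linear programming solver.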

References

Daouia, A., Noh, H. and Park, B.U. (2016). Data envelope fitting with constrained polynomial splines. Journal of the Royal Statistical Society: Series B, 78(1), 3-30. doi:10.1111/rssb.12098.

Hall, P. and Park, B.U. (2004). Bandwidth choice for local polynomial estimation of smooth boundaries. Journal of Multivariate Analysis, 91, 240-261.

Hall, P., Park, B.U. and Stern, S.E. (1998). On polynomial estimators of frontiers and boundaries. Journal of Multivariate Analysis, 66, 71-98.

Examples

# loc_est, loc_est_bw and the nuclear data set are provided by the npbr package
library(npbr)
data("nuclear")
x.nucl <- seq(min(nuclear$xtab), max(nuclear$xtab),
 length.out = 101)

# 1. Unconstrained estimator
# Optimal bandwidth over 100 bootstrap replications
## Not run: 
h.nucl.u <- loc_est_bw(nuclear$xtab, nuclear$ytab,
 x.nucl, h = 40, B = 100, method = "u")
## End(Not run)
# Hard-coded bandwidth, so that the bootstrap step can be skipped
(h.nucl.u <- 79.11877)
y.nucl.u <- loc_est(nuclear$xtab, nuclear$ytab, x.nucl,
 h = h.nucl.u, method = "u")

# 2. Improved version of the estimator
# Optimal bandwidth over 100 bootstrap replications
## Not run: 
h.nucl.m <- loc_est_bw(nuclear$xtab, nuclear$ytab,
 x.nucl, h = 40, B = 100, method = "m")
## End(Not run)
# Hard-coded bandwidth, so that the bootstrap step can be skipped
(h.nucl.m <- 79.12)
y.nucl.m <- loc_est(nuclear$xtab, nuclear$ytab, x.nucl,
 h = h.nucl.m, method = "m")

# 3. Representation: both fitted frontiers over the observed data
plot(x.nucl, y.nucl.u, lty = 1, lwd = 4, col = "magenta", type = "l")
lines(x.nucl, y.nucl.m, lty = 2, lwd = 4, col = "cyan")
points(ytab ~ xtab, data = nuclear)
legend("topleft", legend = c("unconstrained", "improved"),
 col = c("magenta", "cyan"), lwd = 4, lty = c(1, 2))
