dfs_momt: Moment frontier estimator

Description

This function implements the moment-type frontier estimator developed by Daouia, Florens and Simar (2010).

Usage

dfs_momt(xtab, ytab, x, rho, k, ci=TRUE)

Arguments

xtab

a numeric vector containing the observed inputs $x_1,\ldots,x_n$.

ytab

a numeric vector of the same length as xtab containing the observed outputs $y_1,\ldots,y_n$.

x

a numeric vector of evaluation points at which the estimator is to be computed.

rho

a numeric vector of the same length as x or a scalar, which gives the value of the extreme-value index $\rho_x$ used at each evaluation point.

k

a numeric vector of the same length as x or a scalar, which determines the thresholds at which the moment estimator will be computed.

ci

a boolean, TRUE for computing the confidence interval.
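
Since k must lie between 1 and $N_x-1$ (see Details), it can help to check the admissible range at each evaluation point before choosing k by hand. The following is a minimal base-R sketch; the helper `k_upper` is hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): largest admissible k
# at each evaluation point, i.e. N_x - 1 with N_x = #{i : x_i <= x}.
k_upper <- function(xtab, x) {
  Nx <- vapply(x, function(x0) sum(xtab <= x0), integer(1))
  Nx - 1
}

xtab <- c(0.1, 0.4, 0.5, 0.9)
k_upper(xtab, x = c(0.45, 1))
# two inputs are <= 0.45 and four are <= 1, so the bounds are 1 and 3
```

An evaluation point whose bound is below 1 has too few observations with $x_i \leq x$ for the estimator to be computed there.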

Value

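Returns the estimates at the points in x: a numeric vector with the same length as x. With ci=TRUE, the examples below index the result as a matrix with one row per element of x, where column 1 is plotted as the frontier estimate and columns 2 and 3 as the confidence bounds.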

Details

Combining ideas from Dekkers, Einmahl and de Haan (1989) with the dimensionless transformation $\{z^{x}_i := y_i\mathbf{1}_{\{x_i\le x\}}, \,i=1,\cdots,n\}$ of the observed sample $\{(x_i,y_i), \,i=1,\cdots,n\}$, Daouia et al. (2010) estimate the conditional endpoint $\varphi(x)$ by $$\tilde\varphi_{momt}(x) = z^{x}_{(n-k)} + z^{x}_{(n-k)} M^{(1)}_n \left\{1 + \rho_x \right\}$$ where $M^{(1)}_n = (1/k)\sum_{i=0}^{k-1}\left(\log z^x_{(n-i)}- \log z^x_{(n-k)}\right)$ and $z^{x}_{(1)}\leq \cdots\leq z^{x}_{(n)}$ are the ascending order statistics of the transformed sample $\{z^{x}_i, \,i=1,\cdots,n\}$.

The parameter $\rho_x>0$ is referred to as the extreme-value index and has the following interpretation: when $\rho_x>2$, the joint density of the data decays smoothly to zero at a speed of power $\rho_x -2$ of the distance from the frontier; when $\rho_x=2$, the density has sudden jumps at the frontier; when $\rho_x<2$, the density increases toward infinity at a speed of power $\rho_x -2$ of the distance from the frontier. Most contributions to the econometric literature on frontier analysis assume that the joint density is strictly positive at its support boundary, or equivalently, that $\rho_x=2$ for all $x$.

When $\rho_x$ is unknown, Daouia et al. (2010) suggest the following two-step estimator. First, estimate $\rho_x$ by the moment estimator $\tilde\rho_x$, implemented in the function rho_momt_pick with the option method="moment", or by the Pickands estimator $\hat\rho_x$, with the option method="pickands". Second, use the estimator $\tilde\varphi_{momt}(x)$ as if $\rho_x$ were known, substituting the estimated value $\tilde\rho_x$ or $\hat\rho_x$ for $\rho_x$.

The $95\%$ confidence interval of $\varphi(x)$ derived from the asymptotic normality of $\tilde\varphi_{momt}(x)$ is given by $$[\tilde\varphi_{momt}(x) \pm 1.96 \sqrt{V(\rho_x) / k} z^{x}_{(n-k)} M^{(1)}_n (1 + 1/\rho_x) ]$$ where $V(\rho_x) = \rho^2_x (1+2/\rho_x)^{-1}$.

The sample fraction $k=k_n(x)$ plays the role of the smoothing parameter here and varies between 1 and $N_x-1$, where $N_x=\sum_{i=1}^n\mathbf{1}_{\{x_i\le x\}}$ is the number of observations $(x_i,y_i)$ with $x_i \leq x$. See kopt_momt_pick for an automatic data-driven rule for selecting $k$.
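
To make the formulas above concrete, here is a minimal base-R sketch that evaluates $\tilde\varphi_{momt}(x)$ and its $95\%$ confidence interval at a single point. It is an illustration written directly from the displayed formulas, not the package's implementation; the helper name `momt_sketch`, its argument `x0` and the simulated data are assumptions, and strictly positive outputs are assumed so that the logarithms are finite.

```r
# Hypothetical helper (not part of the package): moment-type frontier
# estimate and 95% interval at one evaluation point x0, for known rho.
momt_sketch <- function(xtab, ytab, x0, rho, k) {
  stopifnot(length(xtab) == length(ytab), all(ytab > 0), rho > 0)
  z  <- sort(ytab * (xtab <= x0))     # z^x_(1) <= ... <= z^x_(n)
  n  <- length(z)
  Nx <- sum(xtab <= x0)               # number of observations with x_i <= x0
  stopifnot(k >= 1, k <= Nx - 1)      # admissible range of k
  zk  <- z[n - k]                     # threshold z^x_(n-k)
  top <- z[(n - k + 1):n]             # the k largest order statistics
  M1  <- mean(log(top) - log(zk))     # M_n^(1)
  est <- zk + zk * M1 * (1 + rho)
  V   <- rho^2 / (1 + 2 / rho)        # V(rho) = rho^2 (1 + 2/rho)^(-1)
  half <- 1.96 * sqrt(V / k) * zk * M1 * (1 + 1 / rho)
  c(estimate = est, lower = est - half, upper = est + half)
}

# Simulated data, for illustration only
set.seed(1)
xtab <- runif(200)
ytab <- sqrt(xtab) * runif(200)
momt_sketch(xtab, ytab, x0 = 0.5, rho = 2, k = 20)
```

Because the top $k$ order statistics are never smaller than the threshold $z^{x}_{(n-k)}$, $M^{(1)}_n$ is non-negative, so the estimate is at least the threshold and the interval is centred on the estimate.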

References

Daouia, A., Florens, J.P. and Simar, L. (2010). Frontier Estimation and Extreme Value Theory, Bernoulli, 16, 1039-1063.

Dekkers, A.L.M., Einmahl, J.H.J. and de Haan, L. (1989). A moment estimator for the index of an extreme-value distribution, Annals of Statistics, 17, 1833-1855.

Examples


# NOT RUN {
data("post")
x.post <- seq(post$xinput[100], max(post$xinput), 
 length.out = 100) 
# 1. When rho[x] is known and equal to 2, we set:
rho <- 2
# To determine the sample fraction k=k[n](x) 
# in tilde(varphi[momt])(x).
best_kn.1 <- kopt_momt_pick(post$xinput, post$yprod, 
 x.post, rho = rho)
# To compute the frontier estimates and confidence intervals:  
res.momt.1 <- dfs_momt(post$xinput, post$yprod, x.post, 
 rho = rho, k = best_kn.1)
# Representation
plot(yprod~xinput, data = post, xlab = "Quantity of labor", 
 ylab = "Volume of delivered mail")
lines(x.post, res.momt.1[,1], lty = 1, col = "cyan")  
lines(x.post, res.momt.1[,2], lty = 3, col = "magenta")  
lines(x.post, res.momt.1[,3], lty = 3, col = "magenta")  

# }
# NOT RUN {
# 2. rho[x] is unknown and estimated by 
# the moment estimator tilde(rho[x])
rho_momt <- rho_momt_pick(post$xinput, post$yprod, 
 x.post, method = "moment")
best_kn.2 <- kopt_momt_pick(post$xinput, post$yprod,
  x.post, rho = rho_momt)
res.momt.2 <- dfs_momt(post$xinput, post$yprod, x.post, 
 rho = rho_momt, k = best_kn.2)  
# 3. rho[x] is unknown but assumed independent of x, and estimated
# by the (trimmed) mean of tilde(rho[x])
rho_trimmean <- mean(rho_momt, trim=0.00)
best_kn.3 <- kopt_momt_pick(post$xinput, post$yprod,
  x.post, rho = rho_trimmean)   
res.momt.3 <- dfs_momt(post$xinput, post$yprod, x.post, 
 rho = rho_trimmean, k = best_kn.3)  

# Representation 
plot(yprod~xinput, data = post, col = "grey", 
 xlab = "Quantity of labor", ylab = "Volume of delivered mail")
lines(x.post, res.momt.2[,1], lty = 1, lwd = 2, col = "cyan")  
lines(x.post, res.momt.2[,2], lty = 3, lwd = 4, col = "magenta")  
lines(x.post, res.momt.2[,3], lty = 3, lwd = 4, col = "magenta")  
plot(yprod~xinput, data = post, col = "grey", 
 xlab = "Quantity of labor", ylab = "Volume of delivered mail")
lines(x.post, res.momt.3[,1], lty = 1, lwd = 2, col = "cyan")  
lines(x.post, res.momt.3[,2], lty = 3, lwd = 4, col = "magenta")  
lines(x.post, res.momt.3[,3], lty = 3, lwd = 4, col = "magenta") 
# }
