quokar (version 0.1.0)

frame_fn_obs: Visualization of quantile regression model fitting: interior point algorithm

Description

observations used in quantile regression fitting

$$min_{b \in R^{p}}\sum_{i=1}^{n}\rho_{\tau}(y_i-x_{i}^{'}b)$$

where \(\rho_{\tau}(r)=r[\tau-I(r<0)]$ for $\tau \in (0,1)\). This yields the modified linear program

$$min(\tau e^{'}u+(1-\tau)e^{'}v|y=Xb+u-v,(u,v) \in R_{+}^{2n})$$

Adding slack variables, \(s\), satisfying the constrains \(a+s=e\), we obtain the barrier function

$$B(a, s, u) = y^{'}a+\mu \sum_{i=1}^{n}(loga_{i}+logs_{i})$$

which should be maximized subject to the constrains \(X^{'}a=(1-\tau)X^{'}e\) and \(a+s=e\). The Newton step \(\delta_{a}\) solving

$$max{y^{'}\delta_{a}+\mu \delta^{'}_{a}(A^{-1}-S^{-1})e- \frac{1}{2}\mu \delta_{a}^{'}(A^{-2}+S^{-2})\delta_{a}}$$

subject to \(X{'}\delta_{a}=0\), satisfies

$$y+\mu(A^{-1}-S^{-1})e-\mu(A^{-2}+S^{-2})\delta_{a}=Xb$$

for some \(b\in R^{p}\), and \(\delta_{a}\) such that \(X^{'}\delta_{a}=0\). Using the constraint, we can solve explicitly for the vector \(b\),

$$b=(X^{'}WX)^{-1}X^{'}W[y+\mu(A^{-1}-S^{-1})e]$$

where \(W=(A^{-2}+S^{-2})^{-1}\). This is a form of the primal log barrier algorithm described above. Setting \(\mu=0\) in each step yields an affine scaling variant of the algorithm. The basic linear algebra of each iteration is essentially unchanged, only the form of the diagonal weighting matrix \(W\) has chagned.

Usage

frame_fn_obs(object, tau)

Arguments

object

quantile regression model using interior point method for estimating

tau

quantile

Value

Weighted observations in quantile regression fitting using interior point algorithm

Details

This function used to illustrate data used in fitting process of quantile regression based on interior point method. Koenker and Bassett(1978) introduced asymmetric weight on positive and negative residuals, and solves the slightly modified l1-problem.

References

Portnoy S, Koenker R. The Gaussian hare and the Laplacian tortoise: computability of squared-error versus absolute-error estimators. Statistical Science, 1997, 12(4): 279-300.

Examples

Run this code
# NOT RUN {
library(ggplot2)
library(quantreg)
library(tidyr)
library(dplyr)
library(gridExtra)
data(ais)
tau <- c(0.1, 0.5, 0.9)
object <- rq(BMI ~ LBM + Ht, data = ais, tau = tau, method = 'fn')
fn <- frame_fn_obs(object, tau)
##For tau = 0.1, plot the observations used in quantile regression
##fitting based on interior point method
fn1 <- fn[ ,1]
case <- 1:length(fn1)
fn1 <- cbind(case, fn1)
m <- data.frame(y = ais$BMI, x1 = ais$LBM, x2 = ais$Ht, fn1)
p <- length(attr(object$coefficients, "dimnames")[[1]])
m_f <- m %>% gather(variable, value, -case, -fn1, -y)
mf_a <- m_f %>%
 group_by(variable) %>%
 arrange(variable, desc(fn1)) %>%
 filter(row_number() %in% 1:p )
p1 <- ggplot(m_f, aes(x = value, y = y)) +
 geom_point(alpha = 0.1) +
 geom_point(data = mf_a, size = 3) +
 facet_wrap(~variable, scale = "free_x")
## For tau = 0.5, plot the observations used in quantile regression
##fitting based on interior point method
fn2 <- fn[,2]
case <- 1: length(fn2)
fn2 <- cbind(case, fn2)
m <- data.frame(y = ais$BMI, x1 = ais$LBM, x2 = ais$Ht, fn2)
p <- length(attr(object$coefficients, "dimnames")[[1]])
m_f <- m %>% gather(variable, value, -case, -fn2, -y)
mf_a <- m_f %>%
   group_by(variable) %>%
   arrange(variable, desc(fn2)) %>%
   filter(row_number() %in% 1:p )
p2 <- ggplot(m_f, aes(x = value, y = y)) +
   geom_point(alpha = 0.1) +
   geom_point(data = mf_a, size = 3) +
   facet_wrap(~variable, scale = "free_x")
## For tau = 0.9
fn3 <- fn[,3]
case <- 1: length(fn3)
fn3 <- cbind(case, fn3)
m <- data.frame(y = ais$BMI, x1 = ais$LBM, x2 = ais$Ht, fn3)
p <- length(attr(object$coefficients, "dimnames")[[1]])
m_f <- m %>% gather(variable, value, -case, -fn3, -y)
mf_a <- m_f %>%
  group_by(variable) %>%
  arrange(variable, desc(fn3)) %>%
  filter(row_number() %in% 1:p )
p3 <- ggplot(m_f, aes(x = value, y = y)) +
  geom_point(alpha = 0.1) +
  geom_point(data = mf_a, size = 3) +
  facet_wrap(~variable, scale = "free_x")
grid.arrange(p1, p2, p3, ncol = 1)
# }

Run the code above in your browser using DataLab