
semip(form,nonpar,conpar,window1=.25,window2=.25,bandwidth1=0,bandwidth2=0, kern="tcub",distance="Mahal",targetfull=NULL, print.summary=TRUE, data=NULL)
The estimation procedure has the following three steps under either specification:
1. Nonparametric regressions of y on z and of each X on z, using the lwr function when conpar=NULL and the cparlwr function when a list of variables is provided for conpar. The window or bandwidth for these regressions is set by window1 or bandwidth1.
2. OLS regression of $y-fitted(y)$ on the k-1 variables in $X - fitted(X)$, omitting the intercept. The coefficients from this regression are the estimated values of $\beta$.
3. Nonparametric regression of $y-X \beta$ on z, using the lwr function when conpar=NULL and the cparlwr function when a list of variables is provided for conpar. The window or bandwidth for this regression is set by window2 or bandwidth2.
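The three steps above can be sketched in base R. This is only an illustration under stated assumptions: `loess` with a local linear fit stands in for lwr, the helper `smooth_on_z` and the simulated data are hypothetical, and the result will not match semip output exactly.

```r
# Illustrative sketch of the three-step procedure; loess is a stand-in for lwr.
set.seed(1)
n  <- 500
z  <- runif(n, 0, 2*pi)
x1 <- rnorm(n) + sin(z)
x2 <- rnorm(n) + 0.5*z
y  <- 1.5*x1 - 0.7*x2 + cos(z) + rnorm(n, 0, 0.3)
X  <- cbind(x1, x2)

# Hypothetical helper: fitted values from a local linear regression of v on z
smooth_on_z <- function(v, z, span) {
  fitted(loess(v ~ z, span = span, degree = 1))
}

# Step 1: nonparametric regressions of y and of each column of X on z
y_res <- y - smooth_on_z(y, z, span = 0.25)
X_res <- apply(X, 2, function(v) v - smooth_on_z(v, z, span = 0.25))

# Step 2: OLS of y - fitted(y) on X - fitted(X), omitting the intercept
beta_hat <- coef(lm(y_res ~ X_res - 1))

# Step 3: nonparametric regression of y - X*beta on z
f_hat <- smooth_on_z(as.vector(y - X %*% beta_hat), z, span = 0.25)

print(beta_hat)
```

In this simulation the true coefficients are 1.5 and -0.7, and `f_hat` should track cos(z) up to the smoothing bias of the stand-in smoother.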
The stage-two OLS regression uses k degrees of freedom. The stage-three nonparametric regression uses 2*df1-df2 degrees of freedom, where $df1 = tr(L)$, $df2 = tr(L'L)$, and L is the n x n smoother matrix for the lwr or cparlwr regression, so that the stage-three fitted values are $L(y - X \beta)$. The estimated variance is $sig2 = rss/(n-2*df1+df2)$, where $rss = sum((y - X \beta - f(z))^2)$. The covariance matrix estimate for $\beta$ is $sig2*((X-fitted(X))'(X-fitted(X)))^{-1}$. The covariance matrix is stored as vmat.
The nonparametric regressions are estimated using either the lwr or cparlwr function. See their descriptions for more information.
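The degrees-of-freedom and variance formulas can be checked numerically by building the smoother matrix L explicitly. The sketch below assumes a local linear fit with a tri-cube kernel and a nearest-neighbor window; `smoother_matrix` is a hypothetical helper written for this illustration and is not the code used inside lwr or cparlwr. It uses the same L in every stage, with one x variable for brevity.

```r
# Illustrative sketch: explicit smoother matrix, df1, df2, sig2, and vmat.
set.seed(2)
n <- 200
z <- sort(runif(n, 0, 2*pi))
x <- rnorm(n) + sin(z)
y <- 2*x + cos(z) + rnorm(n, 0, 0.3)

# Hypothetical helper: row i holds the weights that map y to the local linear
# fit at z[i], using a tri-cube kernel over the nearest window*n observations.
smoother_matrix <- function(z, window = 0.25) {
  n <- length(z)
  k <- ceiling(window * n)
  L <- matrix(0, n, n)
  for (i in seq_len(n)) {
    d <- abs(z - z[i])
    h <- sort(d)[k]                       # distance to the k-th nearest point
    w <- ifelse(d < h, (1 - (d/h)^3)^3, 0)
    Z <- cbind(1, z - z[i])
    # First row of (Z'WZ)^{-1} Z'W gives the intercept, i.e. the fit at z[i]
    L[i, ] <- solve(crossprod(Z, w * Z), t(w * Z))[1, ]
  }
  L
}

L <- smoother_matrix(z, window = 0.25)

# Stages one and two with a single regressor
x_res    <- as.vector(x - L %*% x)
y_res    <- as.vector(y - L %*% y)
beta_hat <- sum(x_res * y_res) / sum(x_res^2)

# Stage three and the variance estimate
f_hat <- as.vector(L %*% (y - x * beta_hat))
df1   <- sum(diag(L))                     # tr(L)
df2   <- sum(L^2)                         # tr(L'L), the sum of squared elements
rss   <- sum((y - x * beta_hat - f_hat)^2)
sig2  <- rss / (n - 2*df1 + df2)
vmat  <- sig2 / sum(x_res^2)              # scalar here because X has one column

print(c(df1 = df1, df2 = df2, sig2 = sig2, se_beta = sqrt(vmat)))
```

With more than one x variable, `sum(x_res^2)` becomes the cross-product matrix of the residualized X and vmat becomes sig2 times its inverse.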
Loader, Clive. Local Regression and Likelihood. New York: Springer, 1999.
McMillen, Daniel P., "Issues in Spatial Data Analysis," Journal of Regional Science 50 (2010), 119-141.
McMillen, Daniel P., "Employment Densities, Spatial Autocorrelation, and Subcenters in Large Metropolitan Areas," Journal of Regional Science 44 (2004), 225-243.
McMillen, Daniel P. and Christian Redfearn, "Estimation and Hypothesis Testing for Nonparametric Hedonic House Price Functions," Journal of Regional Science 50 (2010), 712-733.
Pagan, Adrian and Aman Ullah. Nonparametric Econometrics. New York: Cambridge University Press, 1999.
Robinson, Paul M., "Root-N-Consistent Semiparametric Regression," Econometrica 56 (1988), 931-954.
library(McSpatial)
# Single variable in f(z)
par(ask=TRUE)
n = 1000
x <- runif(n,0,2*pi)
x <- sort(x)
z <- runif(n,0,2*pi)
xsq <- x^2
sinx <- sin(x)
cosx <- cos(x)
sin2x <- sin(2*x)
cos2x <- cos(2*x)
ybase1 <- x - .1*xsq + sinx - cosx - .5*sin2x + .5*cos2x
ybase2 <- -z + .1*(z^2) - sin(z) + cos(z) + .5*sin(2*z) - .5*cos(2*z)
ybase <- ybase1+ybase2
sig = sd(ybase)/2
y <- ybase + rnorm(n,0,sig)
# Correct specification for x; z in f(z)
fit <- semip(y~x+xsq+sinx+cosx+sin2x+cos2x,nonpar=~z,window1=.20,window2=.20)
2*fit$df1 - fit$df2
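# Recenter the parametric predictions before setting the plot limits,
# so that ylim covers the line that is actually drawn
xbhat <- fit$xbhat - mean(fit$xbhat) + mean(ybase1)
yvect <- c(min(ybase1, xbhat), max(ybase1, xbhat))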
plot(x,ybase1,type="l",xlab="x",ylab="ybase1",ylim=yvect, main="Predictions for XB")
lines(x, xbhat, col="red")
predse <- sqrt(fit$sig2 + fit$nphat.se^2)
nphat <- fit$nphat - mean(fit$nphat) + mean(ybase2)
lower <- nphat + qnorm(.025)*fit$nphat.se
upper <- nphat + qnorm(.975)*fit$nphat.se
o <- order(z)
yvect <- c(min(lower), max(upper))
plot(z[o], ybase2[o], type="l", xlab="z", ylab="f(z)",
  main="Predictions for f(z)", ylim=yvect)
lines(z[o], nphat[o], col="red")
lines(z[o], lower[o], col="red", lty="dashed")
lines(z[o], upper[o], col="red", lty="dashed")
## Not run:
# # Chicago Housing Sales
# data(matchdata)
# match05 <- data.frame(matchdata[matchdata$year==2005,])
# match05$age <- 2005-match05$yrbuilt
#
# tfit1 <- maketarget(~dcbd,window=.3,data=match05)
# tfit2 <- maketarget(~longitude+latitude,window=.5,data=match05)
#
# # nonparametric control for dcbd
#
# fit <- semip(lnprice~lnland+lnbldg+rooms+bedrooms+bathrooms+centair+fireplace+brick+
# garage1+garage2+ age+rr, nonpar=~dcbd, data=match05,targetfull=tfit1)
#
# # nonparametric controls for longitude and latitude
#
# fit <- semip(lnprice~lnland+lnbldg+rooms+bedrooms+bathrooms+centair+fireplace+brick+
# garage1+garage2+ age+rr+dcbd, nonpar=~longitude+latitude, data=match05, targetfull=tfit2,
# distance="Latlong")
#
# # Conditionally parametric model: y = XB + dcbd*lambda(longitude,latitude) + u
# fit <- semip(lnprice~lnland+lnbldg+rooms+bedrooms+bathrooms+centair+fireplace+
# brick+garage1+garage2+age+rr, nonpar=~longitude+latitude, conpar=~dcbd,
# data=match05, distance="Latlong",targetfull=tfit1)
#
# # Conditional parametric model: y = XB + Z*lambda(longitude,latitude) + u
# # Z = (dcbd,lnland,lnbldg,age)
# fit <- semip(lnprice~rooms+bedrooms+bathrooms+centair+fireplace+brick+
# garage1+garage2+rr, nonpar=~longitude+latitude, conpar=~dcbd+lnland+lnbldg+age,
# data=match05, distance="Latlong",targetfull=tfit2)
# ## End(Not run)