McSpatial (version 1.1.1)

dfldens: Counterfactual Kernel Density Functions

Description

Uses the DiNardo, Fortin, and Lemieux approach to re-weight kernel density functions based on values of an explanatory variable from an earlier period.

Usage

dfldens(y,lgtform,window=0,bandwidth=0,kern="tcub",probit=FALSE,
  graph=TRUE,yname="y",alldata=FALSE,data=NULL)

Arguments

y
The dependent variable for which the counterfactual density is estimated. The data frame must be specified if it has not been attached, e.g., y=mydata$depvar.
lgtform
The formula for the logit or probit model for the time variable. The dependent variable should be a 0-1 variable with 1's representing the later time period. Example: lgtform=timevar~x1+x2.
window
The window size for the kernel density function. Default: not used.
bandwidth
The bandwidth. Default: bandwidth = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)), specified by setting bandwidth = 0 and window = 0.
kern
Kernel weighting function. Default is the tri-cube. Options include "rect", "tria", "epan", "bisq", "tcub", "trwt", and "gauss".
probit
If TRUE, a probit model is used for the time variable rather than logit. Default: probit = FALSE.
graph
If TRUE, produces a graph showing the density function for time 1 and the counterfactual density. Default: graph=TRUE.
yname
The name to be used for the variable whose density functions are drawn when graph=TRUE. Default: yname = "y".
alldata
If TRUE, the density functions are calculated using each observation in turn as a target value. When alldata=FALSE, densities are calculated at a set of points chosen by the locfit program using an adaptive decision tree approach. Default: alldata=FALSE.
data
A data frame with the variables for the logit or probit model specified by lgtform. Note: the data frame for y must be specified even if it is part of data.
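
The default bandwidth rule quoted for the bandwidth argument can be reproduced in base R. The sketch below is illustrative only: the helper name default_bandwidth is hypothetical, the data are simulated, and it assumes that y1 holds the values of y from the later period (timevar = 1) and that n1 is the number of those values.

```r
# Sketch of the default bandwidth rule from the 'bandwidth' argument.
# Assumption: y1 contains the later-period (timevar = 1) values of y.
default_bandwidth <- function(y1) {
  n1 <- length(y1)
  # Interquartile range, written as in the documented formula
  iqr <- unname(quantile(y1, .75) - quantile(y1, .25))
  (.9 * iqr / 1.34) * n1^(-.20)
}

set.seed(1)
y1 <- rnorm(500)          # simulated stand-in for the later-period values
h <- default_bandwidth(y1)
h
```

The rule scales the interquartile range by n1^(-.20), so the bandwidth shrinks slowly as the later-period sample grows.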

Value

  • target: The vector of target values for y for the density functions.
  • dtarget1: The vector of densities in period 1 at the target values of y.
  • dtarget10: The counterfactual densities in period 1 at the target values of y.
  • dhat1: The vector of densities in period 1 at the actual values of y.
  • dhat10: The counterfactual densities in period 1 at the actual values of y.

Details

The dfldens command first calculates kernel density estimates for y in time period timevar = 1. The density estimate at target point $y_1$ is $f(y_1) = (1/(hn_1)) \sum_i K((y_{1i} - y_1)/h)$. The following kernel weighting functions are available, listed by kernel name, call abbreviation, and kernel function K(z):

  • Rectangular, "rect": $\frac{1}{2} I(|z| < 1)$
  • Triangular, "tria": $(1-|z|) I(|z| < 1)$
  • Epanechnikov, "epan": $\frac{3}{4} (1-z^2) I(|z| < 1)$
  • Bi-Square, "bisq": $\frac{15}{16} (1-z^2)^2 I(|z| < 1)$
  • Tri-Cube, "tcub": $\frac{70}{81} (1-|z|^3)^3 I(|z| < 1)$
  • Tri-Weight, "trwt": $\frac{35}{32} (1-z^2)^3 I(|z| < 1)$
  • Gaussian, "gauss": $(2\pi)^{-.5} e^{-z^2/2}$

By default, dfldens uses a tri-cube kernel with a fixed bandwidth of h = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)). The results are stored in dtarget1 and dhat1.

The counterfactual density is an estimate of the density function for y in time 1 if the explanatory variables listed in lgtform were equal to their time 0 values. DiNardo, Fortin, and Lemieux (1996) show that the following re-weighting of $f(y_1)$ is an estimate of the counterfactual density: $(1/(hn_1)) \sum_i \tau_i K((y_{1i} - y_1)/h)$. The weights are given by $\tau_i = (P(x_i)/(1-P(x_i)))/(p/(1-p))$, where $p = n_0/(n_0 + n_1)$, $n_0$ and $n_1$ are the numbers of observations in periods 0 and 1, and $P(x_i)$ is the estimated probability that timevar = 0 from the estimated logit or probit regression of timevar on X. The counterfactual results are stored in dtarget10 and dhat10.

If X includes a single variable x, the counterfactual density shows how $f(y_1)$ would change if $x = x_0$ rather than $x_1$. Alternatively, X can include multiple variables, in which case the counterfactual density shows how $f(y_1)$ would change if all of the variables in X were equal to their timevar = 0 values.
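
The re-weighting described above can be sketched in base R. This is a minimal illustration, not the package's implementation: dfldens evaluates densities through locfit, whereas the sketch sums the kernel directly at a fixed grid of target points. The data are simulated, and the names tcub, P0, tau1, dens1, and dens10 are chosen here for illustration only.

```r
# Tri-cube kernel: (70/81) * (1 - |z|^3)^3 for |z| < 1, zero otherwise
tcub <- function(z) ifelse(abs(z) < 1, (70 / 81) * (1 - abs(z)^3)^3, 0)

# Simulated data (assumption): x is higher on average in the later period
set.seed(42)
n <- 1000
timevar <- rbinom(n, 1, 0.5)            # 1 = later period
x <- rnorm(n, mean = 0.5 * timevar)
y <- 1 + x + rnorm(n)

# Logit of timevar on x; fitted() returns P(timevar = 1 | x)
lgt <- glm(timevar ~ x, family = binomial(link = "logit"))
P0 <- 1 - fitted(lgt)                   # P(x_i): probability that timevar = 0

n0 <- sum(timevar == 0)
n1 <- sum(timevar == 1)
p <- n0 / (n0 + n1)
tau <- (P0 / (1 - P0)) / (p / (1 - p))  # re-weighting factors tau_i

# Densities use the period-1 observations only
y1 <- y[timevar == 1]
tau1 <- tau[timevar == 1]
h <- unname((.9 * (quantile(y1, .75) - quantile(y1, .25)) / 1.34) * n1^(-.20))

target <- seq(min(y1), max(y1), length.out = 200)
# Period-1 density at each target point
dens1 <- sapply(target, function(t) sum(tcub((y1 - t) / h)) / (h * n1))
# Counterfactual density: same kernel sum with each term scaled by tau_i
dens10 <- sapply(target, function(t) sum(tau1 * tcub((y1 - t) / h)) / (h * n1))
```

Plotting dens1 and dens10 against target gives a picture comparable to the graph=TRUE output. Because x is simulated to be lower in period 0 and y increases with x, the counterfactual curve in this example should sit to the left of the period-1 curve.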

References

DiNardo, J., N. Fortin, and T. Lemieux, "Labor Market Institutions and the Distribution of Wages, 1973-1992: A Semi-Parametric Approach," Econometrica 64 (1996), 1001-1044.

Leibbrandt, Murray, James A. Levinsohn, and Justin McCrary, "Incomes in South Africa after the Fall of Apartheid," Journal of Globalization and Development 1 (2010).

See Also

qregsim2

Examples

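library(McSpatial)
data(matchdata)
# 0-1 time indicator: TRUE for sales in 2005, the later period
matchdata$year05 <- matchdata$year==2005
# Counterfactual density of lnprice with lnland and lnbldg at earlier-period values
fit <- dfldens(matchdata$lnprice, year05~lnland+lnbldg, window=.2,
  yname = "Log of Sale Price", data=matchdata)
# Repeat with building age as the only explanatory variable
matchdata$age <- matchdata$year - matchdata$yrbuilt
fit <- dfldens(matchdata$lnprice, year05~age, window=.2,
  yname="Log of Sale Price", data=matchdata)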