Usage
dfldens(y,lgtform,window=0,bandwidth=0,kern="tcub",probit=FALSE, graph=TRUE,yname="y",alldata=FALSE,data=NULL)
Arguments
y
The dependent variable for which the counterfactual density is estimated. The data frame must be specified if it has not been attached, e.g.,
y=mydata$depvar.
lgtform
The formula for the logit or probit model for the time variable. The dependent variable should be a 0-1 variable with 1's representing
the later time period. Example: lgtform=timevar~x1+x2.
window
The window size for the kernel density function. Default: not used.
bandwidth
The bandwidth. Default: bandwidth = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)),
which is used when both bandwidth = 0 and window = 0. Here y1 denotes the values of y in time period 1 and n1 is the number of such observations.
kern
Kernel weighting function. Default is the tri-cube. Options include "rect", "tria", "epan", "bisq", "tcub", "trwt", and "gauss".
probit
If TRUE, a probit model is used for the time variable rather than logit. Default: probit = FALSE.
graph
If TRUE, produces a graph showing the density function for time 1 and the counterfactual density. Default: graph=TRUE.
yname
The name to be used for the variable whose density functions are drawn when graph=T. Default: yname = "y".
alldata
If TRUE, the density functions are calculated using each observation in turn as a target value.
When alldata=F, densities are calculated at a set of points chosen by the locfit program using an adaptive decision tree approach,
and the smooth12 command is used to interpolate to the full set of observations. Default: alldata=FALSE.
data
A data frame with the variables for the logit or probit model specified by lgtform. Note: the data frame for y must be
specified even if it is part of data.
Details
The dfldens command first calculates kernel density estimates for y in time period timevar = 1.
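The density estimate at target point $y_1$ is $f(y_1) = (1/(hn_1)) \sum_i K((y_{1i} - y_1)/h)$, where $h$ is the bandwidth, $n_1$ is the number of observations with timevar = 1, and the sum runs over those observations $y_{1i}$.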
The following kernel weighting functions are available:
Kernel | Call abbreviation | Kernel function K(z) |
Rectangular | ``rect'' | $1/2 * I(|z| < 1)$ |
Triangular | ``tria'' | $(1-|z|) * I(|z| < 1)$ |
Epanechnikov | ``epan'' | $3/4 * (1-z^2) * I(|z| < 1)$ |
Bi-Square | ``bisq'' | $15/16 * (1-z^2)^2 * I(|z| < 1)$ |
Tri-Cube | ``tcub'' | $70/81 * (1-|z|^3)^3 * I(|z| < 1)$ |
Tri-Weight | ``trwt'' | $35/32 * (1-z^2)^3 * I(|z| < 1)$ |
Gaussian | ``gauss'' | $(2\pi)^{-.5} \exp(-z^2/2)$ |
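The kernel formulas can be written out in base R as a quick check. This is a minimal sketch: the names `kernels` and `inside` are illustrative and are not objects from the package. It only verifies numerically that each function integrates to one.

```r
# Indicator I(|z| < 1), returned as 0/1.
inside <- function(z) as.numeric(abs(z) < 1)

# Kernel weighting functions, keyed by their call abbreviation.
# `kernels` is an illustrative name, not a package object.
kernels <- list(
  rect  = function(z) 0.5 * inside(z),
  tria  = function(z) (1 - abs(z)) * inside(z),
  epan  = function(z) 0.75 * (1 - z^2) * inside(z),
  bisq  = function(z) (15 / 16) * (1 - z^2)^2 * inside(z),
  tcub  = function(z) (70 / 81) * (1 - abs(z)^3)^3 * inside(z),
  trwt  = function(z) (35 / 32) * (1 - z^2)^3 * inside(z),
  gauss = function(z) (2 * pi)^(-0.5) * exp(-z^2 / 2)
)

# Integrate each kernel over its support: [-1, 1] for the compact kernels,
# the whole real line for the Gaussian.
areas <- sapply(names(kernels), function(k) {
  lims <- if (k == "gauss") c(-Inf, Inf) else c(-1, 1)
  integrate(kernels[[k]], lims[1], lims[2])$value
})
print(round(areas, 4))
```

The constants 1/2, 3/4, 15/16, 70/81, and 35/32 are the normalizations that make each compact kernel a proper density on (-1, 1).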
By default, dfldens uses a tri-cube kernel with a fixed bandwidth of h = (.9*(quantile(y1,.75)-quantile(y1,.25))/1.34)*(n1^(-.20)).
The results are stored in dtarget1 and dhat1.
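The default bandwidth and the fixed-bandwidth estimate can be sketched directly in base R. This is an illustration of the formulas above, not the package's internal code: it evaluates the sum at every target point instead of going through locfit, the data are simulated, and `kdens`, `target`, and `fhat` are hypothetical names.

```r
set.seed(1)
y1 <- rnorm(500)          # simulated time-1 values of y (assumption)
n1 <- length(y1)

# Default bandwidth formula from the text.
h <- unname((0.9 * (quantile(y1, 0.75) - quantile(y1, 0.25)) / 1.34) * n1^(-0.20))

# Tri-cube kernel, the default weighting function.
tcub <- function(z) (70 / 81) * (1 - abs(z)^3)^3 * (abs(z) < 1)

# f(t) = (1 / (h * n)) * sum_i K((y_i - t) / h), evaluated at each target t.
kdens <- function(target, y, h, K) {
  sapply(target, function(t) sum(K((y - t) / h)) / (h * length(y)))
}

# An evenly spaced grid covering the full support of the estimate.
target <- seq(min(y1) - h, max(y1) + h, length.out = 200)
fhat <- kdens(target, y1, h, tcub)

# Sanity check: the estimate should integrate to roughly one.
area <- sum(fhat) * (target[2] - target[1])
print(c(bandwidth = h, area = area))
```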
The counterfactual density is an estimate of the density function for y in time 1 if the explanatory variables
listed in lgtform were equal to their time 0 values.
DiNardo, Fortin, and Lemieux (1996) show that the following re-weighting of $f(y_1)$ is an estimate of the counterfactual density:
$(1/(hn_1)) \sum_i \tau_i K((y_{1i} - y_1)/h)$.
The weights are given by $\tau_i = (P(x_i)/(1-P(x_i)))/(p/(1-p))$, where $p = n_0/(n_0 + n_1)$ and
$P(x_i)$ is the estimated probability that timevar = 0 from the logit or probit regression of timevar on X.
If X includes a single variable x, the counterfactual density shows how $f(y_1)$ would change if $x = x_0$ rather than $x_1$.
Alternatively, X can include multiple variables, in which case
the counterfactual density shows how $f(y_1)$ would change if all of the variables in X were equal to their timevar = 0 values.
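The re-weighting step can be sketched in base R with `glm`. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, the model has a single explanatory variable, and `dat`, `wdens`, `f1`, and `fcf` are hypothetical names.

```r
set.seed(42)
n0 <- 400; n1 <- 600

# Simulated data (assumption): x shifts upward between time 0 and time 1,
# and y depends on x, so the time-1 density of y sits to the right.
dat <- data.frame(
  timevar = rep(c(0, 1), c(n0, n1)),
  x       = c(rnorm(n0, mean = 0), rnorm(n1, mean = 0.5))
)
dat$y <- 1 + dat$x + rnorm(n0 + n1)

# Logit model for the time variable; link = "probit" gives the probit variant.
fit <- glm(timevar ~ x, family = binomial(link = "logit"), data = dat)

# P(x_i) is the estimated probability that timevar = 0; p is the time-0 share.
P <- 1 - fitted(fit)
p <- n0 / (n0 + n1)

# Weights tau_i for the time-1 observations.
late <- dat$timevar == 1
tau  <- (P[late] / (1 - P[late])) / (p / (1 - p))

y1 <- dat$y[late]
h  <- unname((0.9 * (quantile(y1, 0.75) - quantile(y1, 0.25)) / 1.34) * n1^(-0.20))
tcub <- function(z) (70 / 81) * (1 - abs(z)^3)^3 * (abs(z) < 1)

# Weighted kernel density: (1 / (h * n1)) * sum_i w_i * K((y_1i - t) / h).
wdens <- function(target, y, w, h, K) {
  sapply(target, function(t) sum(w * K((y - t) / h)) / (h * length(y)))
}

target <- seq(min(y1) - h, max(y1) + h, length.out = 200)
f1  <- wdens(target, y1, rep(1, n1), h, tcub)   # actual time-1 density
fcf <- wdens(target, y1, tau, h, tcub)          # counterfactual density

print(c(mean_tau = mean(tau),
        mean_y1 = mean(y1),
        counterfactual_mean = weighted.mean(y1, tau)))
```

Observations whose x looks more like a time-0 value get weights above one. In this simulated setup that moves mass in the counterfactual density toward lower values of y, because x is lower on average in time 0.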