GeneralOaxaca (version 1.0)

GeneralOaxaca: General Blinder-Oaxaca Decomposition

Description

Blinder-Oaxaca decomposition for generalized linear models. It provides the twofold and threefold decompositions described in Bauer and Sinning (2008), which split the observed difference in the dependent variable between two groups into characteristic, coefficient and (for the threefold version) interaction parts. Bootstrapped standard errors are calculated (e.g., Efron, 1979).
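
To make the three parts concrete, here is a minimal hand-rolled sketch using only stats::glm on simulated data. It is not the package's implementation: the data, the helper mean_pred, and the choice of which group's coefficients serve as the baseline in each part are all assumptions made for illustration, and the package's own convention (controlled by groupRef) may differ.

```r
# Illustrative sketch only: simulated data, hypothetical helper names.
set.seed(1)
n <- 500
dA <- data.frame(x = rnorm(n, mean = 1))           # Group A covariates
dA$y <- rpois(n, exp(0.2 + 0.5 * dA$x))
dB <- data.frame(x = rnorm(n, mean = 0))           # Group B covariates
dB$y <- rpois(n, exp(0.2 + 0.3 * dB$x))

# One generalized linear model per group
fitA <- glm(y ~ x, family = poisson, data = dA)
fitB <- glm(y ~ x, family = poisson, data = dB)

# Mean predicted outcome: coefficients from `fit`, covariates from `newdata`
mean_pred <- function(fit, newdata) {
  mean(predict(fit, newdata = newdata, type = "response"))
}

yAA <- mean_pred(fitA, dA)  # A coefficients, A covariates
yAB <- mean_pred(fitA, dB)  # A coefficients, B covariates
yBA <- mean_pred(fitB, dA)  # B coefficients, A covariates
yBB <- mean_pred(fitB, dB)  # B coefficients, B covariates

total <- yAA - yBB

# Twofold (assumed baseline: Group A coefficients)
characteristic <- yAA - yAB   # covariates change, coefficients fixed
coefficient    <- yAB - yBB   # coefficients change, covariates fixed

# Threefold (assumed baseline: Group B), interaction is the remainder
char3  <- yBA - yBB
coef3  <- yAB - yBB
inter3 <- total - char3 - coef3

c(total = total, characteristic = characteristic, coefficient = coefficient)
c(characteristic = char3, coefficient = coef3, interaction = inter3)
```

By construction the parts add up to the total difference in mean predicted outcomes; the nonlinear link is why counterfactual means are averaged over observations rather than evaluated at the mean covariates.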

Usage

GeneralOaxaca(formula, family = stats::gaussian, data, groupInd, groupRef = "A", B = 1000, control = list())

Arguments

formula
an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted.
family
a description of the error distribution and link function to be used in the model. (See family for details of family functions.)
data
an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula).
groupInd
an indicator variable that is TRUE (or equal to 1) when an observation belongs to Group A, and FALSE (or equal to 0) when it belongs to Group B.
groupRef
reference group for the decomposition; by default Group A.
B
number of bootstrap replications for the calculation of standard errors
control
a list of parameters for controlling the fitting process.

Value

GeneralOaxaca returns the following results:
regoutput
list of two elements (named GroupA and GroupB) with the standard generalized linear model output for each group.
twofold
the twofold decomposition with respect to groupInd.
threefold
the threefold decomposition with respect to groupInd.
n
the size of each respective group.
summaryStat
descriptive statistics of the independent variables in each group.

Details

The twofold and threefold decompositions contain the characteristic and coefficient parts (plus the interaction part for the threefold decomposition), together with each part's proportion of the observed difference between groups. They also give the z value, p value and 95% confidence interval for each part, computed from the bootstrapped standard errors. regoutput holds the results of the generalized linear model fitted to the data in each group (A and B). See glm for more details about the outputs.
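
The way a z value, p value and 95% confidence interval can follow from a bootstrapped standard error is sketched below for the characteristic part only. This is an assumption-laden illustration rather than the package's code: the simulated data, the helper char_part, resampling within each group, and the normal-approximation interval are all choices made here, and B is kept small so the sketch runs quickly.

```r
# Illustrative sketch only: simulated data, hypothetical helper names.
set.seed(42)
n <- 300
dA <- data.frame(x = rnorm(n, mean = 1))
dA$y <- rpois(n, exp(0.2 + 0.5 * dA$x))
dB <- data.frame(x = rnorm(n, mean = 0))
dB$y <- rpois(n, exp(0.2 + 0.3 * dB$x))

# Characteristic part, using Group A coefficients as the (assumed) baseline
char_part <- function(dA, dB) {
  fitA <- glm(y ~ x, family = poisson, data = dA)
  mean(predict(fitA, newdata = dA, type = "response")) -
    mean(predict(fitA, newdata = dB, type = "response"))
}

est <- char_part(dA, dB)

# Resample rows with replacement within each group and recompute the part
B <- 200
boot <- replicate(B, {
  iA <- sample(nrow(dA), replace = TRUE)
  iB <- sample(nrow(dB), replace = TRUE)
  char_part(dA[iA, , drop = FALSE], dB[iB, , drop = FALSE])
})

se <- sd(boot)                              # bootstrapped standard error
z  <- est / se                              # z value
p  <- 2 * pnorm(-abs(z))                    # two-sided p value
ci <- est + c(-1, 1) * qnorm(0.975) * se    # 95% normal-approximation interval

round(c(estimate = est, se = se, z = z, p = p, lower = ci[1], upper = ci[2]), 4)
```

A larger B, such as the default of 1000 in the usage above, gives a more stable standard error at the cost of refitting the model that many times.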

References

T. Bauer and M. Sinning (2008). An extension of the Blinder-Oaxaca decomposition to nonlinear models. Advances in Statistical Analysis, Springer-Verlag.

B. Efron (1979). Bootstrap Methods: Another Look at the Jackknife. Annals of Statistics, 7(1), 1-26.

Examples

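library(GeneralOaxaca)

data("chicago")
# Wage in levels: the Gamma family needs a strictly positive response
chicago$real.wage <- exp(chicago$ln.real.wage)
formula <- real.wage ~ age + female + LTHS + some.college + college +
  advanced.degree

# Example with a Gamma distribution
BO_A <- GeneralOaxaca(formula, family = Gamma, data = chicago,
                      groupInd = chicago$foreign.born, B = 100)
BO_A$twofold            # twofold decomposition
BO_A$regoutput$GroupA   # glm output for Group A
BO_A$threefold          # threefold decomposition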