ivreg (version 0.5-0)

SchoolingReturns: U.S. Returns to Schooling Data

Description

Data from the 1976 wave of the U.S. National Longitudinal Survey of Young Men (NLSYM), supplemented with some variables dating back to earlier years.

Usage

data("SchoolingReturns", package = "ivreg")

Format

A data frame with 3010 rows and 22 columns.

wage

Raw wages in 1976 (in cents per hour).

education

Education in 1976 (in years).

experience

Years of labor market experience, computed as age - education - 6.

ethnicity

Factor indicating ethnicity. Is the individual African-American ("afam") or not ("other")?

smsa

Factor. Does the individual reside in an SMSA (standard metropolitan statistical area) in 1976?

south

Factor. Does the individual reside in the South in 1976?

age

Age in 1976 (in years).

nearcollege

Factor. Did the individual grow up near a 4-year college?

nearcollege2

Factor. Did the individual grow up near a 2-year college?

nearcollege4

Factor. Did the individual grow up near a 4-year public or private college?

enrolled

Factor. Is the individual enrolled in college in 1976?

married

Factor. Is the individual married in 1976?

education66

Education in 1966 (in years).

smsa66

Factor. Does the individual reside in an SMSA in 1966?

south66

Factor. Does the individual reside in the South in 1966?

feducation

Father's educational attainment (in years). Imputed with average if missing.

meducation

Mother's educational attainment (in years). Imputed with average if missing.

fameducation

Ordered factor coding family education class (from 1 to 9).

kww

Knowledge of the World of Work (KWW) score.

iq

Normed intelligence quotient (IQ) score.

parents14

Factor coding living with parents at age 14: both parents, single mother, step parent, other.

library14

Factor. Was there a library card in the home at age 14?
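
As a quick sanity check of the format described above, the following sketch loads the data and inspects its structure. It assumes the ivreg package is installed. The comparison of experience with age - education - 6 simply restates the definition given above, and its result is printed rather than asserted.

```r
## Assumes the ivreg package is installed (needed only for the data set).
data("SchoolingReturns", package = "ivreg")

## Dimensions: documented as 3010 rows and 22 columns
dim(SchoolingReturns)

## Column classes (numeric variables vs. factors)
str(SchoolingReturns)

## Compare the stored experience with the documented formula
exp_check <- with(SchoolingReturns,
  all.equal(experience, age - education - 6))
print(exp_check)
```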

Details

Investigating the causal effect of schooling on earnings in a classical model of wage determinants is problematic because schooling is arguably endogenous. One possible strategy is therefore to use an exogenous variable as an instrument for the years of education. In his well-known study, Card (1995) uses geographical proximity to a college when growing up as such an instrument, showing that it significantly increases both the years of education and the wage level obtained on the labor market. Using instrumental variables regression, Card (1995) shows that the estimated returns to schooling are much higher than those obtained by ordinary least squares.
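
The logic of the instrumental variables strategy can be illustrated with a minimal simulation in base R. This is a sketch under stated assumptions, not Card's analysis: the variables z, u, x, and y, the sample size, and all coefficients are hypothetical choices made for illustration. In this design the confounder pushes the OLS estimate upward, which is a consequence of the simulated setup and is not meant to mirror the direction of the bias in Card (1995).

```r
## Simulated illustration only: all names and numbers are hypothetical
## and unrelated to the NLSYM data.
set.seed(1)
n <- 10000

z <- rbinom(n, size = 1, prob = 0.5)   # instrument (think: grew up near a college)
u <- rnorm(n)                          # unobserved confounder (think: ability)
x <- 12 + 1 * z + u + rnorm(n)         # endogenous regressor, driven by z and u
y <- 1 + 0.1 * x + 0.5 * u + rnorm(n, sd = 0.5)  # outcome; true effect of x is 0.1

## OLS is biased because x is correlated with the omitted u
b_ols <- unname(coef(lm(y ~ x))["x"])

## IV estimator with a single instrument: ratio of covariances
b_iv <- cov(z, y) / cov(z, x)

## The same point estimate via two explicit least-squares stages
x_hat <- fitted(lm(x ~ z))
b_2sls <- unname(coef(lm(y ~ x_hat))["x_hat"])

c(ols = b_ols, iv = b_iv, tsls = b_2sls)
```

The two-stage fit reproduces the IV point estimate, but the standard errors reported by the second-stage lm() are not valid for inference; a dedicated routine such as ivreg() should be used for that.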

The data are taken from the supplementary material for Verbeek (2004) and are based on the work of Card (1995). The U.S. National Longitudinal Survey of Young Men (NLSYM) began in 1966 and included 5525 men, then aged between 14 and 24. Card (1995) employs labor market information from the 1976 NLSYM interview, which also included information about educational attainment. Out of the 3694 men still included in that wave of NLSYM, 3010 provided information on both wages and education, yielding the subset of observations provided in SchoolingReturns.

The examples replicate the results from Verbeek (2004), who used the simplest specifications from Card (1995). Including further region or family background characteristics improves the model significantly but has little effect on the main coefficient of interest, namely that of years of education.
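
One way to explore the effect of additional background characteristics is sketched below. It assumes the ivreg package is installed, and the particular set of added variables (feducation, meducation, parents14) is an illustrative choice rather than a specification taken from Card (1995) or Verbeek (2004).

```r
## Assumes the ivreg package is installed (used here only for the data).
data("SchoolingReturns", package = "ivreg")

## Baseline OLS specification, as in the Examples section
m_base <- lm(log(wage) ~ education + poly(experience, 2, raw = TRUE) +
  ethnicity + smsa + south, data = SchoolingReturns)

## Hypothetical extension: add family background variables
m_ext <- update(m_base, . ~ . + feducation + meducation + parents14)

## Joint F test of the added terms
anova(m_base, m_ext)

## Education coefficient under both specifications
edu <- c(base = unname(coef(m_base)["education"]),
  extended = unname(coef(m_ext)["education"]))
edu
```

Comparing the two entries of edu shows how much the estimated return to education moves once the extra controls are included.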

References

Card, D. (1995). Using Geographical Variation in College Proximity to Estimate the Return to Schooling. In: Christofides, L.N., Grant, E.K., and Swidinsky, R. (eds.), Aspects of Labour Market Behaviour: Essays in Honour of John Vanderkamp, University of Toronto Press, Toronto, 201-222.

Verbeek, M. (2004). A Guide to Modern Econometrics, 2nd ed. John Wiley.

Examples

# NOT RUN {
## load package (provides ivreg()) and data
library("ivreg")
data("SchoolingReturns", package = "ivreg")

## Table 5.1 in Verbeek (2004) / Table 2(1) in Card (1995)
## Returns to education: 7.4%
m_ols <- lm(log(wage) ~ education + poly(experience, 2, raw = TRUE) + ethnicity + smsa + south,
  data = SchoolingReturns)
summary(m_ols)

## Table 5.2 in Verbeek (2004) / similar to Table 3(1) in Card (1995)
m_red <- lm(education ~ poly(age, 2, raw = TRUE) + ethnicity + smsa + south + nearcollege,
  data = SchoolingReturns)
summary(m_red)

## Table 5.3 in Verbeek (2004) / similar to Table 3(5) in Card (1995)
## Returns to education: 13.3%
m_iv <- ivreg(log(wage) ~ education + poly(experience, 2, raw = TRUE) + ethnicity + smsa + south |
  poly(age, 2, raw = TRUE) + ethnicity + smsa + south + nearcollege,
  data = SchoolingReturns)
summary(m_iv)
# }
