xtvfreg (version 1.1.0)

nlswork_subset: National Longitudinal Survey of Young Women (Subset)

Description

A subset of 300 randomly sampled women from the National Longitudinal Survey of Young Women, 1968-1988. This is a subsample of the full nlswork dataset commonly used in Stata examples. The data contains labor market information for young women tracked over multiple years.

Usage

nlswork_subset

Format

A data frame with roughly 2,400 to 2,700 observations (the exact count depends on which individuals were sampled) and the following variables:

idcode

Individual identifier (numeric)

year

Survey year (numeric)

birth_yr

Year of birth (numeric)

age

Age in current year (numeric)

race

Race: 1=white, 2=black, 3=other (numeric or labeled)

msp

Marital status: 1=never married, 2=married, 3=separated/divorced/widowed (numeric or labeled)

nev_mar

1 if never married (numeric)

grade

Current grade completed (numeric)

collgrad

1 if college graduate (numeric)

not_smsa

1 if not in SMSA (Standard Metropolitan Statistical Area) (numeric)

c_city

1 if in central city (numeric)

south

1 if in south (numeric)

ind_code

Industry code (numeric)

occ_code

Occupation code (numeric)

union

1 if union member (numeric)

wks_ue

Weeks unemployed last year (numeric)

ttl_exp

Total work experience (years) (numeric)

tenure

Job tenure in years (numeric)

hours

Usual hours worked per week (numeric)

wks_work

Weeks worked last year (numeric)

ln_wage

Natural log of hourly wage (numeric)
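
Because ln_wage is stored on the log scale, it has to be exponentiated to recover wage levels. The sketch below uses a few made-up values (not taken from the dataset) to show the back-transformation, and to illustrate that exponentiating the mean of ln_wage gives the geometric mean of wages, which is never larger than the arithmetic mean.

```r
# Hypothetical log-wage values, including one missing observation
ln_wage <- c(1.2, 1.8, NA, 2.3)

# Back-transform to hourly wage levels; NA stays NA
wage <- exp(ln_wage)

# exp(mean(log wage)) is the geometric mean of wages
geo_mean <- exp(mean(ln_wage, na.rm = TRUE))

# The arithmetic mean of the wage levels, for comparison
arith_mean <- mean(wage, na.rm = TRUE)

print(c(geometric = geo_mean, arithmetic = arith_mean))
```

With the real data, the same calls would be applied to nlswork_subset$ln_wage.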

Details

This dataset is a subset of the nlswork data available from Stata Press. It contains 300 randomly sampled individuals from the original 5,159 women, preserving all time periods for the selected individuals. The data is an unbalanced panel with varying numbers of observations per individual.

The subset was created using:


set.seed(123)
unique_ids <- unique(nlswork$idcode)
sampled_ids <- sample(unique_ids, size = 300, replace = FALSE)
nlswork_subset <- nlswork[nlswork$idcode %in% sampled_ids, ]
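
The key point of the code above is that individuals are sampled, not rows, so every selected woman keeps her full history. The following sketch repeats the same steps on a small synthetic panel (the `toy` data frame and its values are invented for illustration) and checks that the number of observations per sampled individual is unchanged.

```r
# Toy unbalanced panel standing in for the full nlswork data (hypothetical)
set.seed(123)
n_obs <- c(3, 5, 2, 4, 1, 6, 3, 2, 5, 4)   # observations per individual
toy <- data.frame(
  idcode = rep(1:10, times = n_obs),
  year   = sequence(n_obs) + 1967          # consecutive years from 1968
)

# Sample individuals, not rows, so each selected person keeps all her years
unique_ids  <- unique(toy$idcode)
sampled_ids <- sample(unique_ids, size = 4, replace = FALSE)
toy_subset  <- toy[toy$idcode %in% sampled_ids, ]

# Compare per-person observation counts before and after subsetting
ids_chr    <- as.character(sort(sampled_ids))
obs_full   <- table(toy$idcode)[ids_chr]
obs_subset <- table(toy_subset$idcode)[ids_chr]
print(all(obs_full == obs_subset))
```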

References

Center for Human Resource Research. (2002). NLS Handbook 2001. Columbus, OH: The Ohio State University.

Examples

# Load the data
data(nlswork_subset)

# Examine structure
str(nlswork_subset)

# Summary statistics
summary(nlswork_subset$ln_wage)

# Panel structure
table(table(nlswork_subset$idcode))  # Distribution of obs per individual

if (FALSE) {
# Example analysis with xtvfreg
# Create race groups (race == 3 becomes NA here and is excluded in the model call below)
nlswork_subset$race_group <- factor(nlswork_subset$race,
                                    levels = 1:2,
                                    labels = c("white", "black"))

# Create within and between components for tenure
nlswork_subset$m_tenure <- ave(nlswork_subset$tenure,
                               nlswork_subset$idcode,
                               FUN = function(x) mean(x, na.rm = TRUE))
nlswork_subset$d_tenure <- nlswork_subset$tenure - nlswork_subset$m_tenure

# Estimate varying effects model
result <- xtvfreg(
  formula = ln_wage ~ 1,
  data = subset(nlswork_subset, !is.na(ln_wage) & race %in% 1:2),
  group = "race_group",
  panel_id = "idcode",
  mean_vars = c("m_tenure", "d_tenure", "age"),
  var_vars = c("m_tenure", "age"),
  verbose = TRUE
)

# View results
summary(result)
}
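
The within and between decomposition used for tenure in the example can be checked on a tiny panel. This is a sketch on invented data (the `toy` data frame is hypothetical) and it does not call xtvfreg; it only shows what m_tenure and d_tenure contain and that the within deviations sum to zero for each individual.

```r
# Toy panel with one missing tenure value (hypothetical data)
toy <- data.frame(
  idcode = c(1, 1, 1, 2, 2, 3),
  tenure = c(1, 2, 3, 4, NA, 5)
)

# Between component: person-specific mean, ignoring missing values
toy$m_tenure <- ave(toy$tenure, toy$idcode,
                    FUN = function(x) mean(x, na.rm = TRUE))

# Within component: deviation of each observation from the person mean
toy$d_tenure <- toy$tenure - toy$m_tenure

# Within deviations sum to zero for every individual
within_sums <- tapply(toy$d_tenure, toy$idcode, sum, na.rm = TRUE)
print(toy)
print(within_sums)
```

Note that a missing tenure value still receives a person mean in m_tenure, while d_tenure stays NA for that row.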