A subset of data relating to job postings on the Lightcast platform for demonstrating bias correction methods with ML-generated variables.
SD_data
A data frame with 16315 rows and 7 columns:
Character. City of the job posting
naics_2022_2: Character. Type of business (NAICS industry classification)
Integer. Unique identifier of the job posting
salary: Numeric. Salary offered (response variable)
wfh_wham: Numeric. Binary label generated via ML, indicating whether remote work is offered (subject to measurement error)
soc_2021_2: Character. Occupation code (SOC classification)
Character. Employment type (part time/full time)
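The remote-work label is described as subject to measurement error, which is why bias correction matters when it is used as a regressor. The following is a minimal simulated sketch of that problem in base R. It does not use SD_data and it is not the correction implemented by the package. Every number in it (sample size, share of remote jobs, effect size, false positive rate) and every variable name is an assumption chosen for illustration.

```r
# Simulated illustration only: none of these values come from SD_data.
set.seed(1)
n <- 50000

# Hypothetical true (unobserved) remote-work indicator, 5% of postings.
remote_true <- rbinom(n, 1, 0.05)

# Hypothetical log salary with an assumed remote-work premium of 0.3.
log_salary <- 10 + 0.3 * remote_true + rnorm(n, sd = 0.5)

# ML-style label: assumed 1% false positive rate and no false negatives.
fpr <- 0.01
false_pos <- rbinom(n, 1, fpr)
remote_hat <- pmax(remote_true, false_pos)

# OLS slope using the true indicator versus the noisy label.
coef_true <- coef(lm(log_salary ~ remote_true))[["remote_true"]]
coef_hat  <- coef(lm(log_salary ~ remote_hat))[["remote_hat"]]

# The slope on the noisy label is pulled toward zero (attenuation).
print(c(true_label = coef_true, ml_label = coef_hat))
```

In this setup the false positives mix ordinary postings into the "remote" group, so the estimated premium shrinks even though the false positive rate is small. The bias is noticeable because the true positive class is rare.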
if (FALSE) {
  data(SD_data)
  fit <- ols_bca(log(salary) ~ wfh_wham + soc_2021_2 + naics_2022_2,
                 data = SD_data, fpr = 0.009, m = 1000)
}