
Applies an additive bias correction to regressions that include a binary covariate generated by an AI/ML classifier. The method requires an external estimate of the classifier's false-positive rate; standard errors are adjusted to account for the uncertainty in that estimate.
ols_bca(
  Y,
  Xhat = NULL,
  fpr,
  m,
  data = parent.frame(),
  intercept = TRUE,
  gen_idx = 1,
  ...
)

# S3 method for default
ols_bca(
  Y,
  Xhat,
  fpr,
  m,
  data = parent.frame(),
  intercept = TRUE,
  gen_idx = 1,
  ...
)

# S3 method for formula
ols_bca(
  Y,
  Xhat = NULL,
  fpr,
  m,
  data = parent.frame(),
  intercept = TRUE,
  gen_idx = 1,
  ...
)
An object of class mlbc_fit and mlbc_bca with:
coef
: bias-corrected coefficient estimates (ML-slope first, other slopes, intercept last)
vcov
: adjusted variance-covariance matrix for those coefficients
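The two components can be combined into standard errors and confidence intervals in the usual way. The sketch below uses a hand-built list as a stand-in for a fitted object (the numbers and coefficient names are invented for illustration) and assumes the components can be reached with `$`, which this page does not state explicitly.

```r
# Hypothetical stand-in for the object returned by ols_bca(); the values
# and names are made up so the snippet runs without the package.
fit <- list(
  coef = c(ml_slope = 0.050, other_slope = 0.100, intercept = 1.200),
  vcov = diag(c(0.0004, 0.0009, 0.0100))
)

# Standard errors are the square roots of the diagonal of vcov.
se <- sqrt(diag(fit$vcov))

# 95% normal-approximation confidence intervals.
crit <- qnorm(0.975)
ci <- cbind(
  estimate  = fit$coef,
  std_error = se,
  lower     = fit$coef - crit * se,
  upper     = fit$coef + crit * se
)
print(round(ci, 4))
```

For a real fit, summary() (as used in the examples on this page) is the more direct route; the manual computation is shown only to make the roles of coef and vcov concrete.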
Y
: numeric response vector, or a two-sided formula (response ~ covariates, as in the example below)
Xhat
: numeric matrix of regressors (if Y is numeric); the ML regressor is column gen_idx
fpr
: numeric; estimated false-positive rate of the ML regressor
m
: integer; size of the external sample used to estimate the classifier's false-positive rate. Can be set to a large number when the false-positive rate is known exactly
data
: data frame (if Y is a formula)
intercept
: logical; if TRUE, prepends a column of 1's to Xhat
gen_idx
: integer; 1-based index of the ML-generated variable to apply bias correction to. If not specified, defaults to the first non-intercept variable
...
: unused
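To see why a large m is a reasonable stand-in for an exactly known rate, consider the sampling error of a proportion estimated from m independent draws. The snippet below only illustrates the textbook binomial formula under that assumption; it is not the adjustment that ols_bca applies internally, and fpr_hat and m_grid are hypothetical names.

```r
# Illustrative values: the rate matches the example on this page,
# the grid of sample sizes is arbitrary.
fpr_hat <- 0.009
m_grid  <- c(100, 1000, 1e4, 1e6)

# Binomial standard error of a sample proportion: sqrt(p * (1 - p) / m).
se_fpr <- sqrt(fpr_hat * (1 - fpr_hat) / m_grid)

# The uncertainty in the rate shrinks toward zero as m grows.
print(data.frame(m = m_grid, se_fpr = signif(se_fpr, 3)))
```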
Option 1: Formula Interface
Y
: A formula of the form response ~ covariates
data
: Data frame containing the variables referenced in the formula
Option 2: Array Interface
Y
: Response variable vector
Xhat
: Design matrix of covariates
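With the array interface, gen_idx identifies which column of Xhat holds the ML-generated variable. The sketch below builds a small simulated design matrix in which that variable is the second column; all data and variable names are made up, the fpr and m values are placeholders borrowed from the example on this page, and the call is guarded so the snippet still runs when ols_bca is not available.

```r
set.seed(1)
n <- 200

age     <- rnorm(n)              # ordinary covariate (simulated)
wfh_hat <- rbinom(n, 1, 0.3)     # stand-in for an ML-generated binary label
Y       <- 1 + 0.5 * wfh_hat + 0.2 * age + rnorm(n)

# The ML-generated variable sits in column 2, so gen_idx = 2.
Xhat <- cbind(age = age, wfh_hat = wfh_hat)

# Call ols_bca() only if a function of that name is available.
if (exists("ols_bca", mode = "function")) {
  fit <- ols_bca(Y, Xhat, fpr = 0.009, m = 1000,
                 intercept = TRUE, gen_idx = 2)
  print(summary(fit))
}
```

Placing the ML-generated variable in the first column instead lets the call rely on the default gen_idx = 1.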
# Load the remote work dataset
data(SD_data)
# Formula interface
fit_bca <- ols_bca(log(salary) ~ wfh_wham + soc_2021_2 + employment_type_name,
                   data = SD_data,
                   fpr = 0.009,  # estimated false-positive rate
                   m = 1000)     # validation sample size
summary(fit_bca)
# Array interface
Y <- log(SD_data$salary)
Xhat <- model.matrix(~ wfh_wham + soc_2021_2, data = SD_data)[, -1]
fit_bca2 <- ols_bca(Y, Xhat, fpr = 0.009, m = 1000, intercept = TRUE)
summary(fit_bca2)