MLBC (version 0.2.1)

one_step: One-step maximum likelihood estimation

Description

Maximum likelihood estimation of the regression model, treating the generated covariate as a noisy proxy for the true latent variable. This method is particularly useful when an estimate of the false positive rate is not available. The variance of the estimates is approximated via the inverse Hessian at the optimum.
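The inverse-Hessian variance approximation mentioned above can be sketched in base R. This is only an illustration on an ordinary normal regression likelihood, not the likelihood that one_step maximizes; the simulated data and the names negloglik, opt, vcov_hat and se are assumptions made for the example.

```r
# Sketch: ML estimation with variance from the inverse Hessian (base R only).
# The model here is plain normal regression, used purely for illustration.
set.seed(1)
n <- 500
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n)

# Negative log-likelihood; theta = (intercept, slope, log of error sd)
negloglik <- function(theta) {
  mu <- theta[1] + theta[2] * x
  -sum(dnorm(y, mean = mu, sd = exp(theta[3]), log = TRUE))
}

# Minimize the negative log-likelihood and request the numerical Hessian
opt <- optim(c(0, 0, 0), negloglik, method = "BFGS", hessian = TRUE)

# The Hessian of the negative log-likelihood at the optimum is the observed
# information; its inverse approximates the variance-covariance matrix
vcov_hat <- solve(opt$hessian)
se <- sqrt(diag(vcov_hat))

print(cbind(estimate = opt$par, std_error = se))
```

The approximation relies on the optimizer having reached an interior optimum where the Hessian is positive definite, so it is worth checking opt$convergence before using the standard errors.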

Usage

one_step(
  Y,
  Xhat = NULL,
  homoskedastic = FALSE,
  distribution = c("normal", "t", "laplace", "gamma", "beta"),
  nu = 4,
  gshape = 2,
  gscale = 1,
  ba = 2,
  bb = 2,
  intercept = TRUE,
  gen_idx = 1,
  data = parent.frame(),
  ...
)

# S3 method for default
one_step(
  Y,
  Xhat,
  homoskedastic = FALSE,
  distribution = c("normal", "t", "laplace", "gamma", "beta"),
  nu = 4,
  gshape = 2,
  gscale = 1,
  ba = 2,
  bb = 2,
  intercept = TRUE,
  gen_idx = 1,
  ...
)

# S3 method for formula
one_step(
  Y,
  Xhat = NULL,
  homoskedastic = FALSE,
  distribution = c("normal", "t", "laplace", "gamma", "beta"),
  nu = 4,
  gshape = 2,
  gscale = 1,
  ba = 2,
  bb = 2,
  intercept = TRUE,
  gen_idx = 1,
  data = parent.frame(),
  ...
)

Value

An object of class mlbc_fit and mlbc_onestep with:

  • coef: estimated regression coefficients

  • vcov: variance-covariance matrix
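A minimal sketch of how the two returned components are typically combined into standard errors and Wald-type confidence intervals. The object below is a hand-built stand-in with made-up numbers, not output from one_step, and accessing the components with `$` is an assumption about how the fitted object is stored.

```r
# Stand-in for a fitted object; the values are invented for illustration
fit <- list(
  coef = c("(Intercept)" = 1.0, x = 0.5),
  vcov = matrix(c(0.04, 0.00,
                  0.00, 0.01),
                nrow = 2, byrow = TRUE,
                dimnames = list(c("(Intercept)", "x"), c("(Intercept)", "x")))
)

# Standard errors are the square roots of the diagonal of vcov
se <- sqrt(diag(fit$vcov))

# Wald z-statistics and 95% confidence intervals (normal approximation)
z <- fit$coef / se
ci <- cbind(lower = fit$coef - qnorm(0.975) * se,
            upper = fit$coef + qnorm(0.975) * se)

print(cbind(estimate = fit$coef, std_error = se, z = z, ci))
```

In practice summary() on the fitted object, as used in the Examples, is the documented way to inspect results.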

Arguments

Y

numeric response vector, or a model formula with the response on the left-hand side (e.g. log(salary) ~ wfh_wham)

Xhat

numeric matrix of regressors (if Y is numeric)

homoskedastic

logical; if TRUE, assumes a common error variance; otherwise, the error variance is allowed to vary with the true latent binary variable

distribution

character; distribution for error terms. One of "normal", "t", "laplace", "gamma", "beta"

nu

numeric; degrees of freedom (for Student-t distribution)

gshape

numeric; shape parameter (for Gamma distribution)

gscale

numeric; scale parameter (for Gamma distribution)

ba

numeric; alpha parameter (for Beta distribution)

bb

numeric; beta parameter (for Beta distribution)

intercept

logical; if TRUE, prepend an intercept column to Xhat

gen_idx

integer; index (1-based) of the binary ML-generated variable. If not specified, defaults to the first non-intercept variable

data

data frame (if Y is a formula)

...

unused

Usage Options

Option 1: Formula Interface

  • Y: A formula with the response on the left and the regressors on the right, e.g. log(salary) ~ wfh_wham + soc_2021_2

  • data: Data frame containing the variables referenced in the formula

Option 2: Array Interface

  • Y: Response variable vector

  • Xhat: Design matrix of covariates
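The array interface can be sketched with simulated data as follows. The data-generating process, the 10% misclassification rate and the names X_true, X_gen and W are assumptions made for illustration. The call is guarded so that it runs only when the MLBC package is installed; placing the generated variable in the first column of Xhat to match gen_idx = 1 is this sketch's reading of the argument descriptions above.

```r
set.seed(42)
n <- 1000

# Latent binary variable and a noisy, ML-style proxy for it
X_true <- rbinom(n, 1, 0.4)
flip   <- rbinom(n, 1, 0.1)                  # misclassified observations
X_gen  <- ifelse(flip == 1, 1 - X_true, X_true)

# An additional, correctly measured control
W <- rnorm(n)

# The response depends on the latent variable, not on the proxy
Y <- 1 + 2 * X_true + 0.5 * W + rnorm(n)

# Design matrix without an intercept column; intercept = TRUE prepends one.
# The generated variable is placed first to match gen_idx = 1.
Xhat <- cbind(X_gen = X_gen, W = W)

if (requireNamespace("MLBC", quietly = TRUE)) {
  fit <- MLBC::one_step(Y, Xhat, intercept = TRUE, gen_idx = 1)
  print(summary(fit))
}
```

Simulating the latent variable alongside its proxy makes it possible to compare the estimates with the coefficients used to generate Y.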

Examples

# Load the remote work dataset
data(SD_data)

# Basic one-step estimation
fit_onestep <- one_step(log(salary) ~ wfh_wham + soc_2021_2 + employment_type_name,
                        data = SD_data)
summary(fit_onestep)

# With different error distribution
fit_t <- one_step(log(salary) ~ wfh_wham + soc_2021_2,
                  data = SD_data,
                  distribution = "t",
                  nu = 4)
summary(fit_t)

# Homoskedastic errors
fit_homo <- one_step(log(salary) ~ wfh_wham + soc_2021_2,
                     data = SD_data,
                     homoskedastic = TRUE)
summary(fit_homo)