Description

Assumes a normal linear model for exposure given covariates, and additive normal processing errors and measurement errors acting on the poolwise mean exposure. A manuscript fully describing the approach is under review.
Usage

p_logreg_xerrors(g, y, xtilde, c = NULL, errors = "processing",
  nondiff_pe = TRUE, nondiff_me = TRUE, constant_pe = TRUE,
  prev = NULL, samp_y1y0 = NULL, approx_integral = TRUE,
  estimate_var = TRUE, start_nonvar_var = c(0.01, 1),
  lower_nonvar_var = c(-Inf, 1e-04), upper_nonvar_var = c(Inf, Inf),
  jitter_start = 0.01, hcubature_list = list(tol = 1e-08),
  nlminb_list = list(control = list(trace = 1, eval.max = 500,
    iter.max = 500)),
  hessian_list = list(method.args = list(r = 4)),
  nlminb_object = NULL)
Arguments

g: Numeric vector of pool sizes, i.e. the number of members in each pool.
y: Numeric vector of poolwise Y values, coded 0 if all members are controls and 1 if all members are cases.
xtilde: Numeric vector (or list of numeric vectors, if some pools have replicates) of Xtilde values.
c: Numeric matrix of poolwise C values (if any), with one row for each pool. Can be a vector if there is only one covariate.
errors: Character string specifying the errors that X is subject to. Choices are "neither", "processing" for processing error only, "measurement" for measurement error only, and "both".
nondiff_pe: Logical value for whether to assume the processing error variance is non-differential, i.e. the same in case pools and control pools.
nondiff_me: Logical value for whether to assume the measurement error variance is non-differential, i.e. the same in case pools and control pools.
constant_pe: Logical value for whether to assume the processing error variance is constant with pool size. If FALSE, the processing error variance is assumed to increase with pool size such that, for example, the processing error affecting a pool twice as large as another has twice the variance.
prev: Numeric value specifying the disease prevalence, allowing for valid estimation of the intercept with case-control sampling. Can specify samp_y1y0 instead if sampling rates are known.
samp_y1y0: Numeric vector of length 2 specifying the sampling probabilities for cases and controls, allowing for valid estimation of the intercept with case-control sampling. Can specify prev instead if it is easier.
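The intercept correction that prev and samp_y1y0 enable can be sketched outside the package. The snippet below is a minimal illustration in the spirit of Weinberg and Umbach (1999), with made-up numbers; it is not necessarily the package's exact internal computation.

```r
# Under case-control sampling with known sampling probabilities s1 (cases)
# and s0 (controls), the intercept fit to the sampled data is shifted by
# log(s1 / s0) relative to the population-scale intercept.
# Hypothetical numbers:
samp_y1y0 <- c(0.80, 0.10)                 # sampling probabilities: cases, controls
offset <- log(samp_y1y0[1] / samp_y1y0[2]) # log(8), about 2.08

naive_intercept <- 0.40                    # hypothetical estimate from sampled data
corrected_intercept <- naive_intercept - offset
corrected_intercept                        # about -1.68
```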
approx_integral: Logical value for whether to use the probit approximation for the logistic-normal integral, to avoid numerically integrating the X's out of the likelihood function.
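The probit approximation referred to here can be illustrated with base R alone. The sketch below uses the standard expit(z) ~ pnorm(z / 1.702) device, under which the logistic-normal integral collapses to a single pnorm() call; the parameter values are illustrative and the code is not taken from the package internals.

```r
# Logistic-normal integral: E[expit(b0 + b1 * X)] with X ~ N(mu, sigma^2).
# Illustrative parameter values:
b0 <- -1; b1 <- 0.5; mu <- 1; sigma <- 1.5

# Numerical integration (what approx_integral = TRUE avoids having to do):
exact <- integrate(
  function(x) plogis(b0 + b1 * x) * dnorm(x, mu, sigma),
  lower = -Inf, upper = Inf
)$value

# Probit approximation: expit(z) ~ pnorm(z / 1.702), and E[pnorm(a + b * X)]
# has a closed form, so the integral reduces to one pnorm() call:
approx <- pnorm((b0 + b1 * mu) / sqrt(1.702^2 + b1^2 * sigma^2))

c(exact = exact, approx = approx)  # the two agree closely here
```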
estimate_var: Logical value for whether to return the variance-covariance matrix for the parameter estimates.
start_nonvar_var: Numeric vector of length 2 specifying starting values for non-variance terms and variance terms, respectively.
lower_nonvar_var: Numeric vector of length 2 specifying lower bounds for non-variance terms and variance terms, respectively.
upper_nonvar_var: Numeric vector of length 2 specifying upper bounds for non-variance terms and variance terms, respectively.
jitter_start: Numeric value specifying the standard deviation for mean-0 normal jitters to add to the starting values for a second attempt at maximizing the log-likelihood, should the initial call to nlminb result in non-convergence. Set to NULL for no second attempt.
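The retry behavior can be sketched with a toy objective function. The quadratic below is a hypothetical stand-in for the actual negative log-likelihood; only the jitter-and-retry pattern is the point.

```r
set.seed(123)

# Toy stand-in for the negative log-likelihood (minimized at c(2, -1)):
negloglik <- function(theta) (theta[1] - 2)^2 + (theta[2] + 1)^2

start <- c(0.01, 0.01)
jitter_start <- 0.01

fit <- nlminb(start = start, objective = negloglik)

# On non-convergence (convergence code != 0), jitter the starting values
# with mean-0 normal noise of SD jitter_start and try once more:
if (fit$convergence != 0 && !is.null(jitter_start)) {
  fit <- nlminb(
    start = start + rnorm(length(start), mean = 0, sd = jitter_start),
    objective = negloglik
  )
}
fit$par  # near c(2, -1)
```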
hcubature_list: List of arguments to pass to hcubature for numerical integration. Only used if approx_integral = FALSE.
nlminb_list: List of arguments to pass to nlminb for log-likelihood maximization.
hessian_list: List of arguments to pass to hessian for approximating the Hessian matrix. Only used if estimate_var = TRUE.
nlminb_object: Object returned from nlminb in a prior call. Useful for bypassing log-likelihood maximization if you just want to re-estimate the Hessian matrix with different options.
Value

List containing:
Numeric vector of parameter estimates (theta.hat in the examples below).
Variance-covariance matrix (if estimate_var = TRUE).
nlminb object returned from maximizing the log-likelihood function.
Akaike information criterion (AIC).
References

Schisterman, E.F., Vexler, A., Mumford, S.L. and Perkins, N.J. (2010) "Hybrid pooled-unpooled design for cost-efficient measurement of biomarkers." Stat. Med. 29(5): 597--613.
Weinberg, C.R. and Umbach, D.M. (1999) "Using pooled exposure assessment to improve efficiency in case-control studies." Biometrics 55: 718--726.
Weinberg, C.R. and Umbach, D.M. (2014) "Correction to 'Using pooled exposure assessment to improve efficiency in case-control studies' by Clarice R. Weinberg and David M. Umbach; 55, 718--726, September 1999." Biometrics 70: 1061.
Examples

## Not run:
# Load dataset containing (Y, Xtilde, C) values for pools of size 1, 2, and
# 3. Xtilde values are affected by processing error.
data(pdat1)
# Estimate log-OR for X and Y adjusted for C, ignoring processing error
fit1 <- p_logreg_xerrors(
g = pdat1$g,
y = pdat1$allcases,
xtilde = pdat1$xtilde,
c = pdat1$c,
errors = "neither"
)
fit1$theta.hat
# Repeat, but accounting for processing error. Closer to true log-OR of 0.5.
fit2 <- p_logreg_xerrors(
g = pdat1$g,
y = pdat1$allcases,
xtilde = pdat1$xtilde,
c = pdat1$c,
errors = "processing"
)
fit2$theta.hat
## End(Not run)