sensitivity: Estimate sensitivity

Description

`sensitivity` estimates (1) the marginal sensitivity and (2) sensitivity as a function of covariates X for a misclassified binary outcome.

Usage

sensitivity(Dstar, X, prev, r = NULL, weights = NULL)

Arguments

Dstar

Numeric vector containing observed disease status. Should be coded as 0/1.

X

Numeric matrix of covariates in the sensitivity model. Set to `NULL` to fit a model with no covariates in the sensitivity model. `X` should not contain an intercept.

prev

Marginal disease prevalence $P(D = 1)$ or patient-specific $P(D = 1|X)$ in the population.

r

(Optional) marginal sampling ratio, $P(S = 1|D = 1) / P(S = 1|D = 0)$. Only one of `r` and `weights` can be specified. Default is `NULL`.

weights

(Optional) vector of patient-specific weights used for selection bias adjustment. Only one of `r` and `weights` can be specified. Default is `NULL`.

Value

A list with two elements: (1) `c_marg`, the marginal sensitivity estimate $P(D^* = 1|D = 1, S = 1)$, and (2) `c_X`, sensitivity as a function of X, $P(D^* = 1|D = 1, S = 1, X)$.

Details

We are interested in modeling the relationship between binary disease status $D$ and covariates $Z$ using a logistic regression model. However, $D$ may be misclassified, and our observed data may not represent the population of interest well. In this setting, we estimate the parameters of the disease model using the following modeling framework.

Notation:

D: Binary disease status of interest.
D*: Observed binary disease status. Potentially a misclassified version of D. We assume D = 0 implies D* = 0.
S: Indicator for whether patient from population of interest is included in the analytical dataset.
Z: Covariates in disease model of interest.
W: Covariates in model for patient inclusion in analytical dataset (selection model).
X: Covariates in model for probability of observing disease given patient has disease (sensitivity model).
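The assumption that D = 0 implies D* = 0 links the marginal sensitivity to quantities that can be observed or supplied by the user. The short derivation below is a sketch based only on the notation above and on the definitions of `prev` and `r`. It is not a statement of how the package computes its estimate. Here $p = P(D = 1)$ and $r = P(S = 1|D = 1) / P(S = 1|D = 0)$.

$$P(D^* = 1|S = 1) = P(D^* = 1|D = 1, S = 1)\,P(D = 1|S = 1) + \underbrace{P(D^* = 1|D = 0, S = 1)}_{=\,0}\,P(D = 0|S = 1)$$

$$\Rightarrow\quad P(D^* = 1|D = 1, S = 1) = \frac{P(D^* = 1|S = 1)}{P(D = 1|S = 1)}, \qquad P(D = 1|S = 1) = \frac{r\,p}{r\,p + (1 - p)}$$

The second expression follows from Bayes' rule after dividing numerator and denominator by $P(S = 1|D = 0)$.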

Model Structure:

Disease Model: $$\text{logit}(P(D = 1|Z)) = \theta_0 + \theta_Z Z$$
Selection Model: $$P(S = 1|W, D)$$
Sensitivity Model: $$\text{logit}(P(D^* = 1|D = 1, S = 1, X)) = \beta_0 + \beta_X X$$
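To make the three models concrete, the following base R sketch simulates data with this structure and checks the marginal sensitivity numerically. It does not use the SAMBA package. All coefficient values, the sample size, and the variable names (`p_sampled`, `c_marg_direct`, `c_marg_ratio`) are assumptions chosen for illustration, and the logistic form of the selection model is also an assumption, since the framework above leaves that model unspecified.

```r
set.seed(1)
expit <- function(x) exp(x) / (1 + exp(x))

n <- 100000
Z <- rnorm(n)  # disease model covariate
W <- rnorm(n)  # selection model covariate
X <- rnorm(n)  # sensitivity model covariate

# Disease model: logit(P(D = 1 | Z)) = theta_0 + theta_Z * Z (illustrative values)
D <- rbinom(n, 1, expit(-2 + 0.5 * Z))

# Selection model P(S = 1 | W, D); a logistic form is assumed here
S <- rbinom(n, 1, expit(-0.6 + 1 * D + 0.5 * W))

# Sensitivity model: logit(P(D* = 1 | D = 1, S = 1, X)) = beta_0 + beta_X * X.
# Multiplying by D enforces the assumption that D = 0 implies D* = 0.
Dstar <- D * rbinom(n, 1, expit(-0.5 + 1 * X))

sampled <- S == 1

# Marginal sensitivity computed directly from the (normally unobserved) true D
c_marg_direct <- mean(Dstar[sampled & D == 1])

# The same quantity from the observed rate, prevalence p, and sampling ratio r
p <- mean(D)
r <- mean(S[D == 1]) / mean(S[D == 0])
p_sampled <- r * p / (r * p + 1 - p)  # P(D = 1 | S = 1)
c_marg_ratio <- mean(Dstar[sampled]) / p_sampled

c(direct = c_marg_direct, ratio = c_marg_ratio)
```

In practice the true `D` is not observed, which is why the function asks for `prev` together with either `r` or `weights`. The simulation uses the true values only to show that the two routes agree.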

References

Beesley, L. J. and Mukherjee, B. Statistical inference for association studies using electronic health records: handling both selection bias and outcome misclassification. medRxiv 2019.12.26.19015859.

Examples


# NOT RUN {
library(SAMBA)
# These examples are generated from the vignette. See it for more details.

# Generate inverse probability of selection weights from the true selection model,
# rescaled so that they sum to the number of rows
expit <- function(x) exp(x) / (1 + exp(x))
prob.WD <- expit(-0.6 + 1 * samba.df$D + 0.5 * samba.df$W)
weights <- nrow(samba.df) * (1 / prob.WD) / sum(1 / prob.WD)

# Using marginal sampling ratio r ~ 2 and P(D=1)
sens <- sensitivity(samba.df$Dstar, samba.df$X, prev = mean(samba.df$D),
                    r = 2)
# Using inverse probability of selection weights and P(D=1)
sens <- sensitivity(samba.df$Dstar, samba.df$X, prev = mean(samba.df$D),
                    weights = weights)
# }
