AF.cs: Attributable fraction for cross-sectional sampling designs.

Description

AF.cs estimates the model-based adjusted attributable fraction for data from cross-sectional sampling designs.

Usage

AF.cs(formula, data, exposure, clusterid)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model used for adjusting for confounders. The exposure and confounders should be specified as independent (right-hand side) variables. The outcome should be specified as the dependent (left-hand side) variable. The formula is used to fit a logistic regression by glm.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which the function is called.

exposure

the name of the exposure variable as a string. The exposure must be binary (0/1) where unexposed is coded as 0.

clusterid

the name of the cluster identifier variable as a string, if data are clustered.

Value

AF.est: estimated attributable fraction.
AF.var: estimated variance of AF.est. The variance is obtained by combining the delta method with the sandwich formula.
P.est: estimated factual proportion of cases; $Pr(Y=1)$.
P.var: estimated variance of P.est. The variance is obtained by the sandwich formula.
P0.est: estimated counterfactual proportion of cases if the exposure were eliminated; $Pr(Y_0=1)$.
P0.var: estimated variance of P0.est. The variance is obtained by the sandwich formula.
fit: the fitted model, obtained by logistic regression with glm.
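
As a rough illustration of how the delta method turns the variances of P.est and P0.est into a variance for AF.est, the first-order expansion of $AF = 1 - P_0/P$ can be written as below. Here $P$ and $P_0$ stand for $Pr(Y=1)$ and $Pr(Y_0=1)$; the covariance term is an assumption about what the joint sandwich estimate supplies, since it is not among the returned components listed above. This is a sketch of the general technique, not a transcription of the package's internal code.

```latex
% First-order delta method for g(P, P_0) = 1 - P_0 / P
% Partial derivatives: dg/dP = P_0 / P^2,  dg/dP_0 = -1 / P
\widehat{\mathrm{Var}}(\widehat{AF}) \approx
  \frac{\hat{P}_0^{2}}{\hat{P}^{4}} \, \widehat{\mathrm{Var}}(\hat{P})
  + \frac{1}{\hat{P}^{2}} \, \widehat{\mathrm{Var}}(\hat{P}_0)
  - \frac{2 \hat{P}_0}{\hat{P}^{3}} \, \widehat{\mathrm{Cov}}(\hat{P}, \hat{P}_0)
```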

Details

AF.cs estimates the attributable fraction for a binary outcome Y under the hypothetical scenario where a binary exposure X is eliminated from the population. The estimate is adjusted for confounders Z by logistic regression (glm). Let the AF be defined as

$$AF=1-\frac{Pr(Y_0=1)}{Pr(Y=1)}$$

where $Pr(Y_0=1)$ denotes the counterfactual probability of the outcome if the exposure were eliminated from the population and $Pr(Y=1)$ denotes the factual probability of the outcome. If Z is sufficient for confounding control, then $Pr(Y_0=1)$ can be expressed as $E_Z\{Pr(Y=1|X=0,Z)\}$. The function uses logistic regression to estimate $Pr(Y=1|X=0,Z)$, and the marginal sample distribution of Z to approximate the outer expectation (Sjölander and Vansteelandt, 2011). If clusterid is supplied, then a clustered sandwich formula is used in all variance calculations.
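
The point estimate described above can be sketched by hand with base R alone. The snippet below fits the logistic model, predicts each subject's risk with the exposure set to 0, and averages those predictions over the sample distribution of Z. The names dat, dat0, P_hat, P0_hat and AF_hat are illustrative, and using the sample proportion of cases for $Pr(Y=1)$ is an assumption rather than a statement about the package internals. Variance estimation is not covered here.

```r
# Minimal sketch of the standardization step (not the package's own code).
set.seed(1)
expit <- function(x) 1 / (1 + exp(-x))
n <- 1000
Z <- rnorm(n)
X <- rbinom(n, size = 1, prob = expit(Z))
Y <- rbinom(n, size = 1, prob = expit(Z + X))
dat <- data.frame(Y, X, Z)

# Logistic regression of the outcome on exposure, confounder and interaction
fit <- glm(Y ~ X * Z, family = binomial, data = dat)

# Counterfactual data set in which everyone is unexposed
dat0 <- transform(dat, X = 0)

# Average predicted risk under X = 0 approximates E_Z{Pr(Y = 1 | X = 0, Z)}
P0_hat <- mean(predict(fit, newdata = dat0, type = "response"))

# Factual proportion of cases (assumed here to be the sample mean of Y)
P_hat <- mean(dat$Y)

# Attributable fraction
AF_hat <- 1 - P0_hat / P_hat
```

With the AF package installed, AF_hat should be close to the AF.est component returned by AF.cs for the same formula and data.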

References

Greenland, S. and Drescher, K. (1993). Maximum Likelihood Estimation of the Attributable Fraction from Logistic Models. Biometrics 49, 865-872.

Sjölander, A. and Vansteelandt, S. (2011). Doubly robust estimation of attributable fractions. Biostatistics 12, 112-121.

Examples

# Simulate a cross-sectional sample
expit <- function(x) 1 / (1 + exp( - x))
n <- 1000
Z <- rnorm(n = n)
X <- rbinom(n = n, size = 1, prob = expit(Z))
Y <- rbinom(n = n, size = 1, prob = expit(Z + X))

# Example 1: non-clustered data from a cross-sectional sampling design
data <- data.frame(Y, X, Z)
AF.est.cs <- AF.cs(formula = Y ~ X + Z + X * Z, data = data, exposure = "X")
summary(AF.est.cs)

# Example 2: clustered data from a cross-sectional sampling design
# Duplicate observations in order to create clustered data
id <- rep(1:n, 2)
data <- data.frame(id = id, Y = c(Y, Y), X = c(X, X), Z = c(Z, Z))
AF.est.cs.clust <- AF.cs(formula = Y ~ X + Z + X * Z, data = data,
                         exposure = "X", clusterid = "id")
summary(AF.est.cs.clust)
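
A Wald-type confidence interval can be formed from an estimate and its variance under a normal approximation. The sketch below uses made-up placeholder numbers, not output from the examples above. With a fitted object one would presumably substitute its AF.est and AF.var components; that access pattern is an assumption based on the Value section.

```r
# Hypothetical values standing in for the AF.est and AF.var components
af_est <- 0.20
af_var <- 0.0016

# Two-sided 95% Wald interval: estimate +/- z * standard error
z <- qnorm(0.975)
ci <- af_est + c(-1, 1) * z * sqrt(af_var)
ci
```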
