bernoulli_RL_cdf: Cumulative distribution function (cdf) of Run Length for Bernoulli CUSUM

Description

Calculate the cdf of the run length of the Bernoulli CUSUM, starting from an initial value between 0 and h, using Markov chain methodology.

Usage

bernoulli_RL_cdf(h, x, n_grid, glmmod, theta, theta_true, p0, p1,
  smooth_prob = FALSE, exact = TRUE)

Value

A list containing:

Fr_0: A numeric value indicating the probability of the run length being at most x, i.e. $P(K \leq x)$.
Fr: A data.frame containing the cumulative distribution function of the run length depending on the state in which the process starts (E_0, E_1, ..., E_{n_grid-1}), with columns:

start_val: Starting value of the CUSUM, corresponding to the discretized state spaces E_i;
P(K <= x): Value of the cdf at x for the CUSUM with initial value start_val;
R: A transition probability matrix containing the transition probabilities between states $E_0, \ldots, E_{t-1}$, where $t$ = n_grid. $R_{i,j}$ is the transition probability from state i to state j.

The value of ARL_0 will be printed to the console.

Arguments

h

Control limit for the Bernoulli CUSUM

x

Quantile at which to evaluate the cdf.

n_grid

Number of state spaces used to discretize the outcome space of the CUSUM for the Markov chain approximation. Increasing this number improves accuracy, but can also significantly increase computation time.

glmmod

Generalized linear regression model used for risk-adjustment as produced by the function glm(). Suggested:
glm(as.formula("(survtime <= followup) & (censorid == 1) ~ covariates"), data = data).
Alternatively, a list containing the following elements:

formula: a formula() in the form ~ covariates;
coefficients: a named vector specifying risk adjustment coefficients for covariates. Names must be the same as in formula and colnames of data.

theta

The $\theta$ value used to specify the odds ratio $e^\theta$ under the alternative hypothesis. If $\theta \geq 0$, the run length cdf of the upper one-sided Bernoulli CUSUM will be determined. If $\theta < 0$, the run length cdf of the lower one-sided CUSUM will be determined. Note that $$p_1 = \frac{p_0 e^\theta}{1-p_0 +p_0 e^\theta}.$$

theta_true

The true log odds ratio $\theta$, describing the true increase in failure rate from the null-hypothesis. Default = log(1), indicating no increase in failure rate.

p0

The baseline failure probability at entrytime + followup for individuals.

p1

The alternative hypothesis failure probability at entrytime + followup for individuals.

smooth_prob

Should the probability distribution of failure under the null distribution be smoothed? Useful for small samples. Can only be TRUE when glmmod is supplied. Default = FALSE.

exact

Should the cdf be determined exactly (TRUE), or approximately (FALSE)? The approximation works well for large x, and can cut computation time significantly. Default = TRUE.
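
The relation between p0, theta and p1 stated under the theta argument can be checked numerically. The following is a small sketch; the helper name p1_from_theta is hypothetical and not part of the package.

```r
# Hypothetical helper (not a package function): alternative-hypothesis
# failure probability implied by the baseline probability p0 and theta.
p1_from_theta <- function(p0, theta) {
  p0 * exp(theta) / (1 - p0 + p0 * exp(theta))
}

p0 <- 0.1
p1 <- p1_from_theta(p0, theta = log(2))

# The odds ratio of p1 versus p0 recovers exp(theta)
odds_ratio <- (p1 / (1 - p1)) / (p0 / (1 - p0))
```

With theta = 0 the helper returns p0 unchanged, which matches the default theta_true = log(1) meaning no increase in failure rate.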

Details

Let $K$ denote the run length of the Bernoulli CUSUM with control limit h, then this function can be used to evaluate $P(K \leq x)$.

The formula on page 543 of Brook & Evans (1972) is used if exact = TRUE. When exact = FALSE, formula (3.9) on page 545 is used instead, approximating the transition matrix using its Jordan canonical form. This can save computation time considerably, but is not appropriate for small values of x.
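
The exact calculation can be illustrated with a minimal sketch. It assumes a constant failure probability p0 (no risk-adjustment model) and theta > 0 (upper one-sided CUSUM), and uses the Markov chain relation that the probability of no signal within x steps, starting in state E_i, is the i-th entry of $R^x \mathbf{1}$. The function name rl_cdf_sketch and the column name cdf are hypothetical; this is not the package's implementation and its discretization may differ in detail.

```r
# Sketch only: run length cdf of an upper Bernoulli CUSUM with constant p0.
rl_cdf_sketch <- function(h, x, n_grid, theta, theta_true = log(1), p0) {
  # True failure probability implied by the odds ratio exp(theta_true)
  p_true <- p0 * exp(theta_true) / (1 - p0 + p0 * exp(theta_true))
  # CUSUM increments for a failure and for a non-failure
  norm_const <- log(1 - p0 + p0 * exp(theta))
  W <- c(theta - norm_const, -norm_const)
  probs <- c(p_true, 1 - p_true)

  w <- 2 * h / (2 * n_grid - 1)     # width of each discretized state
  R <- matrix(0, n_grid, n_grid)    # transitions between E_0, ..., E_{n_grid-1}
  for (i in seq_len(n_grid)) {
    for (k in seq_along(W)) {
      # State E_{i-1} is represented by its centre (i - 1) * w
      s_new <- max(0, (i - 1) * w + W[k])
      if (s_new < h) {              # otherwise the chart signals (absorbing)
        j <- min(floor(s_new / w + 0.5) + 1, n_grid)
        R[i, j] <- R[i, j] + probs[k]
      }
    }
  }

  # surv[i] = P(K > x | start in E_{i-1}), obtained as R^x applied to ones
  surv <- rep(1, n_grid)
  for (step in seq_len(x)) surv <- as.vector(R %*% surv)

  list(Fr_0 = 1 - surv[1],
       Fr = data.frame(start_val = (seq_len(n_grid) - 1) * w, cdf = 1 - surv),
       R = R)
}

res <- rl_cdf_sketch(h = 2.5, x = 600, n_grid = 200, theta = log(2), p0 = 0.1)
```

Repeated matrix-vector products cost x multiplications, which is why an approximation such as the one selected by exact = FALSE becomes attractive for large x.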

References

Brook, D., & Evans, D. A. (1972). An Approach to the Probability Distribution of Cusum Run Length. Biometrika, 59(3), 539–549. doi:10.2307/2334805

Steiner, S. H., Cook, R. J., Farewell, V. T., & Treasure, T. (2000). Monitoring surgical performance using risk-adjusted cumulative sum charts. Biostatistics, 1(4), 441–452. doi:10.1093/biostatistics/1.4.441

Examples

# Assumption: bernoulli_RL_cdf() and the surgerydat data come from the
# 'success' package.
library(success)

# Determine a risk-adjustment model using a generalized linear model.
# Outcome (failure within 100 days) is regressed on the available covariates:
glmmodber <- glm((survtime <= 100) & (censorid == 1) ~ age + sex + BMI,
                 data = surgerydat, family = binomial(link = "logit"))
# Determine the probability of the run length being at most 600
prob600 <- bernoulli_RL_cdf(h = 2.5, x = 600, n_grid = 200,
                            glmmod = glmmodber, theta = log(2))