deconvolve: Richardson-Lucy Deconvolution

Description

Performs a modified Richardson-Lucy iteration for the purpose of estimating incidence from reported incidence or mortality, conditional on a reporting probability and on a distribution of the time to reporting.

Usage

deconvolve(x, prob = 1, delay = 1,
           start, x.pad = 0, tol = 1, iter.max = 32L, complete = FALSE)

Value

A list with elements:

value: the result of updating start iter times then dividing by prob. If complete = TRUE, then value is a (d+n)-by-(1+iter) matrix containing start and the iter successive updates, each divided by prob.
chisq: the chi-squared statistics corresponding to value.
iter: the number of iterations performed.

subset(value, start == 0) is zero because zero is invariant under the iteration. If delay has l leading zeros and t

trailing zeros, then head(value, t) and tail(value, l) are NaN due to divide-by-zero in the iteration. (Conceptually, x and delay provide no information about incidence during those intervals.)

Arguments

x: a numeric vector of length n giving the number of infections or deaths reported during n successive time intervals of equal duration.
prob: a numeric vector of length d+n, d=length(delay)-1, such that prob[d+i] is the probability that an infection during interval i is eventually reported. prob of length 1 is recycled.
delay: a numeric vector of positive length such that delay[j] is the probability that an infection during interval i is reported during interval i+j-1, given that it is eventually reported. Hence delay[j] is the probability of a delay by j-1 intervals and d=length(delay)-1 is the maximum delay. delay need not sum to 1 but must not sum to 0.
start: a numeric vector of length d+n, d=length(delay)-1, giving a starting value for the iteration. start[d+i] estimates the expected number of infections during interval i that are eventually reported. If missing, then a starting value is generated by padding x on the left and right with d-d0 and d0 elements equal to x.pad, choosing d0=which.max(delay)-1. Note that 0 is invariant under the iteration, hence it can be desirable to set x.pad to a small (relative to max(x)) positive number.
x.pad: a non-negative number, used only when start is unset; see above.
tol: a tolerance indicating a stopping condition; see the reference. Set to 0 if you want to perform no fewer than iter.max iterations.
iter.max: the maximum number of iterations.
complete: a logical flag indicating if the result should preserve successive updates to start.

Details

Do note that temporal alignment of x (length n) and y=deconvolve(x, ...)$value (length or number of rows d+n) requires padding x on the left, as in cbind(x=c(rep(NA, d), x), y); see the examples.

References

Goldstein, E., Dushoff, J., Ma, J., Plotkin, J. B., Earn, D. J. D., & Lipsitch, M. (2009). Reconstructing influenza incidence by deconvolution of daily mortality time series. Proceedings of the National Academy of Sciences U. S. A., 106(51), 21825-21829. tools:::Rd_expr_doi("10.1073/pnas.0902958106")

Examples

Run this code

# \dontshow{
## for R_DEFAULT_PACKAGES=NULL
library(   stats, pos = "package:base", verbose = FALSE)
library(graphics, pos = "package:base", verbose = FALSE)
library(   utils, pos = "package:base", verbose = FALSE)
# }
##
## Example 1: simulation
##

set.seed(2L)
n <- 200L
d <- 50L
p <- 0.1
prob <- plogis(rlogis(d + n, location = qlogis(p), scale = 0.1))
delay <- c(0, diff(pgamma(0L:d, 12, 0.4)))

h <- function (x, a = 1, b = 1, c = 0) a * exp(-b * (x - c)^2)
ans <- floor(h(seq(-60, 60, length.out = d + n), a = 1000, b = 0.001))

x0 <- rbinom(d + n, ans, prob)
x <- tabulate(rep(1L:(d + n), x0) +
              sample(0L:d, size = sum(x0), replace = TRUE, prob = delay),
              d + n)[-(1L:d)]

str(D0 <- deconvolve(x, prob, delay, complete = FALSE))
str(D1 <- deconvolve(x, prob, delay, complete =  TRUE))

matplot(-(d - 1L):n,
        cbind(x0, c(rep(NA, d), x), prob * D0[["value"]], p * ans),
        type = c("p", "p", "p", "l"),
        col = c(1L, 1L, 2L, 4L), pch = c(16L, 1L, 16L, NA),
        lty = c(0L, 0L, 0L, 1L), lwd = c(NA, NA, NA, 3),
        xlab = "Time", ylab = "Count")
legend("topleft", NULL,
       c("actual", "actual+delay", "actual+delay+deconvolution", "p*h"),
       col = c(1L, 1L, 2L, 4L), pch = c(16L, 1L, 16L, NA),
       lty = c(0L, 0L, 0L, 1L), lwd = c(NA, NA, NA, 3),
       bty = "n")

plot(0L:D1[["iter"]], D1[["chisq"]],
     xlab = "Iterations", ylab = quote(chi^2))
abline(h = 1, lty = 2L)


##
## Example 2: application to pneumonia and influenza
##

data(pneumonia, package = "fastbeta")
x <- pneumonia[["series"]][["deaths"]]
delay <- pneumonia[["delay"]][["gpg"]]

n <- length(x)
d <- length(delay) - 1L
r <- 30L

D2 <- deconvolve(x = x, delay = delay, tol = 0, iter.max = r,
                 complete = TRUE)
stopifnot(D2[["iter"]] == r,
          identical(dim(D2[["value"]]), c(d + n, 1L + r)),
          length(D2[["chisq"]]) == 1L + r,
          min(D2[["chisq"]]) < 1)

## Subscript for the first, critical, and last values:
j2 <- c(1L, which.max(D2[["chisq"]] < 1), 1L + r)

matplot(x = seq(from = pneumonia[["series"]][1L, "date"] - d,
                by = 1, length.out = d + n),
        y = cbind(c(rep(NA, d), x), D2[["value"]][, j2]),
        type = "o",
        col = c(1L, 4L, 2L, 3L), pch = 1L, lty = 1L, lwd = 1,
        xlab = "1918", ylab = "deaths")
legend("topleft", NULL,
       c("observed", sprintf("after %d iterations", j2 - 1L)),
       col = c(1L, 4L, 2L, 3L), pch = 1L, lty = 1L, lwd = 1,
       bty = "n")

Run the code above in your browser using DataLab