This function calculates informative priors from historical data,
incorporating covariate information to enhance borrowing strength and to
mitigate prior-data conflict.
Let \(G\) be the study indicator, where \(G = 1\) indicates that a patient
is from the current control study and \(G = 0\) indicates that a patient
is from the historical control study. Given the covariate data \(X\), the
propensity score (PS) is defined as
$$e(X) = \Pr(G = 1 | X),$$
where the distance argument selects the method used to estimate the
propensity scores.
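For concreteness, here is a minimal sketch of PS estimation via logistic
regression; the data frame dat and covariates x1, x2 are hypothetical,
simulated only for illustration.

```r
## Simulate a hypothetical pooled dataset of current (G = 1) and
## historical (G = 0) control patients.
set.seed(123)
n   <- 200
dat <- data.frame(
  G  = rep(c(1, 0), each = n / 2),
  x1 = rnorm(n),
  x2 = rbinom(n, 1, 0.5)
)
## Estimate e(X) = Pr(G = 1 | X) with a logistic regression.
fit    <- glm(G ~ x1 + x2, family = binomial(), data = dat)
dat$ps <- predict(fit, type = "response")
```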
Calculating an informative prior through PS matching amounts to
identifying a subset of the historical data (\(D_h^*\)) with PS values
similar to those of the current control data (\(D\)). Various algorithms
are available for PS matching; please refer to the method argument. The
informative prior can then be calculated based on the matched historical
dataset.
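Below is a minimal sketch of one such algorithm, greedy 1:1
nearest-neighbor matching on the estimated PS, continuing from the sketch
above; it is shown only for illustration and is not necessarily the
algorithm selected by method.

```r
d_cur  <- dat[dat$G == 1, ]            # current control data D
d_hist <- dat[dat$G == 0, ]            # historical control data D_h
avail  <- rep(TRUE, nrow(d_hist))
idx    <- integer(nrow(d_cur))
for (i in seq_len(nrow(d_cur))) {
  dist_i         <- abs(d_hist$ps - d_cur$ps[i])
  dist_i[!avail] <- Inf                # match each historical subject once
  idx[i]         <- which.min(dist_i)
  avail[idx[i]]  <- FALSE
}
D_h_star <- d_hist[idx, ]              # matched historical subset D_h^*
```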
Alternatively, we can utilize inverse probability of treatment weighting
(IPTW) to adjust the distribution of \(X\) in the historical data
\(D_h\), making it similar to that in \(D\). Specifically, for the
\(i\)th subject, we assign a weight \(\alpha_i\) to the outcome \(y_i\)
in \(D_h\) based on its PS \(e(X_i)\), and a fixed weight
\(\alpha_i = 1\) to \(X_i\) in \(D\), as follows:
$$\alpha_i = G_i + (1 - G_i) \frac{e(X_i)}{1 - e(X_i)}.$$
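Continuing the sketch above, these weights follow directly from the
estimated PS:

```r
## Weight 1 for current controls; the odds e(X) / (1 - e(X)) for
## historical controls.
dat$alpha <- ifelse(dat$G == 1, 1, dat$ps / (1 - dat$ps))
```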
To avoid extremely large weights that may compromise IPTW, a symmetric
trimming rule can be used to trim the tails of the PS distribution via
the trim argument, with default [0.1, 0.9]; that is, observations whose
estimated PS falls outside this range are removed.
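A sketch of this trimming step, assuming the default range [0.1, 0.9]:

```r
## Keep only subjects whose estimated PS lies inside [0.1, 0.9].
trim_rng <- c(0.1, 0.9)
dat_trim <- dat[dat$ps >= trim_rng[1] & dat$ps <= trim_rng[2], ]
```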
To standardize \(\alpha\), we compute the effective sample size (ESS),
which approximately reflects the level of precision, or equivalently the
sample size, retained after weighting:
\(n^{*}_h = (\sum \alpha_i)^2 / \sum{\alpha_i^2}\). The standardized
weight is given by
$$\alpha_i^{*} = G_i + (1 - G_i)\frac{\alpha_i}{\sum{\alpha_i} / n_h^{*}}.$$
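Continuing the sketch on the trimmed data, the ESS and standardized
weights can be computed as:

```r
a_h      <- dat_trim$alpha[dat_trim$G == 0]
n_h_star <- sum(a_h)^2 / sum(a_h^2)   # ESS of weighted historical data
dat_trim$alpha_std <- ifelse(dat_trim$G == 1, 1,
                             dat_trim$alpha / (sum(a_h) / n_h_star))
```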
For a binary endpoint \(Y \sim Ber(\theta)\), the informative prior
\(\pi_1(\theta)\) can be constructed as follows,
$$\pi_1(\theta) \propto L(\theta | D_h, \alpha^{*}) \pi_0(\theta)
= Beta(a + \sum \alpha_i^{*}y_i, ~ b + n_h^{*} - \sum \alpha_i^{*}y_i),$$
where \(\pi_0(\theta)\) is a non-informative prior; a natural choice is
\(Beta(a, b)\) with \(a = b = 1\).
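A sketch of this construction, with hypothetical binary outcomes
simulated for the historical subjects:

```r
set.seed(456)
w_h <- dat_trim$alpha_std[dat_trim$G == 0]
y_h <- rbinom(length(w_h), 1, 0.3)    # hypothetical 0/1 outcomes in D_h
a <- 1; b <- 1                        # Beta(1, 1) non-informative prior
a1 <- a + sum(w_h * y_h)              # weighted responders
b1 <- b + n_h_star - sum(w_h * y_h)   # weighted non-responders
## pi_1(theta) is Beta(a1, b1); e.g., sample it with rbeta(1e4, a1, b1)
```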
For a continuous endpoint \(Y \sim N(\theta, \sigma^2)\) with
\(\sigma^2\) unknown, under the non-informative prior
\(p(\theta, \sigma^2) \propto 1/\sigma^2\), \(\pi_1(\theta)\) follows a
Student-\(t\) distribution with \(n_h^{*} - 1\) degrees of freedom. Given
that \(n_h^{*}\) is moderate to large, it can be approximated by a normal
distribution \(N(\bar{y}^{*}, {s^{*}}^2 / n_h^{*})\) with
$$\bar{y}^{*} = \sum \alpha_i^* y_i / \sum \alpha_i^*, ~~ {s^{*}}^2 =
\sum \alpha_i^* (y_i - \bar{y}^{*})^2 / (n_h^{*} - 1).$$
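A sketch of this normal approximation, again with hypothetical continuous
outcomes simulated for the historical subjects:

```r
set.seed(789)
w_h  <- dat_trim$alpha_std[dat_trim$G == 0]
y_h  <- rnorm(length(w_h), mean = 1, sd = 2)  # hypothetical outcomes in D_h
ybar <- sum(w_h * y_h) / sum(w_h)             # weighted mean
s2   <- sum(w_h * (y_h - ybar)^2) / (n_h_star - 1)
## pi_1(theta) is approximately N(ybar, s2 / n_h_star)
```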