powerprior_multivariate: Compute Power Prior for Multivariate Normal Data

Description

Computes the power prior for multivariate normal data using a conjugate Normal-Inverse-Wishart (NIW) representation.

Usage

powerprior_multivariate(
  historical_data,
  a0,
  mu0 = NULL,
  kappa0 = NULL,
  nu0 = NULL,
  Lambda0 = NULL
)

Value

A list of class "powerprior_multivariate" containing:

mu_n: Updated mean vector (p-dimensional)
kappa_n: Updated precision parameter (effective sample size)
nu_n: Updated degrees of freedom
Lambda_n: Updated scale matrix (p x p)
a0: Discounting parameter used
m: Sample size of historical data
p: Dimension (number of variables) of the data
xbar: Sample mean vector of historical data
Sx: Sum of squares and crossproducts matrix
vague_prior: Logical indicating if vague prior was used
mu0: Prior mean vector (if informative prior used)
kappa0: Prior precision parameter (if informative prior used)
nu0: Prior degrees of freedom (if informative prior used)
Lambda0: Prior scale matrix (if informative prior used)

Arguments

historical_data

Matrix or data frame where rows are observations and columns are variables. Must have at least 2 rows and 2 columns. Missing values are not automatically handled; rows with any NA values should be removed before calling this function.

a0

Discounting parameter in [0, 1]. Controls the degree of borrowing from historical data in the multivariate setting. Same interpretation as univariate case:

a0 = 0: No borrowing from historical data
a0 = 0.5: Moderate borrowing with half weight
a0 = 1: Full borrowing of historical information

mu0

Prior mean vector of length p (number of variables). Only used when specifying an informative initial prior. If NULL, a vague (non-informative) prior is used. Represents the prior belief about the center of the multivariate distribution across all variables.

kappa0

Prior precision parameter (scalar). Controls the concentration of the prior around mu0. Only used for informative priors. If NULL, vague prior is applied. Interpreted as the "prior effective sample size" for the mean estimate. Higher values indicate greater prior confidence.

nu0

Prior degrees of freedom for the inverse Wishart distribution. Only used for informative priors. If NULL, vague prior is applied. Must satisfy nu0 >= p. Higher values indicate greater prior confidence in the covariance structure. Typical values range from p to 2p for informative priors.

Lambda0

Prior scale matrix (p x p, symmetric positive definite). Only used for informative priors. If NULL, vague prior is applied. Represents the prior belief about the covariance structure. Larger values correspond to greater prior uncertainty about variable spread and relationships. If Lambda0 is provided, must be symmetric and positive definite.

Details

Background on Multivariate Power Priors

The power prior framework extends naturally to multivariate normal data when the mean vector $\mu$ and covariance matrix $\Sigma$ are jointly unknown. This is essential for modern applications involving multiple correlated endpoints, such as clinical trials measuring multiple health outcomes simultaneously.

The power prior for multivariate data is defined analogously to the univariate case:

$$P(\boldsymbol{\mu}, \boldsymbol{\Sigma} | \mathbf{X}, a_0) = L(\boldsymbol{\mu}, \boldsymbol{\Sigma} | \mathbf{X})^{a_0} P(\boldsymbol{\mu}, \boldsymbol{\Sigma})$$

where:

$\mathbf{X}$ is the m x p historical data matrix
$\boldsymbol{\mu}$ is the p-dimensional mean vector
$\boldsymbol{\Sigma}$ is the p x p covariance matrix
$a_0 \in [0, 1]$ is the discounting parameter

Conjugacy via Normal-Inverse-Wishart Distribution

The key advantage of this implementation is that power priors applied to multivariate normal data with Normal-Inverse-Wishart (NIW) conjugate initial priors remain in closed form as NIW distributions. This preserves:

Exact posterior computation without approximation
Closed-form parameter updates and marginalization
Efficient sampling from standard distributions
Computational tractability for high-dimensional problems (within reason)
Natural joint modeling of correlations via the covariance structure

For practitioners, this means you can incorporate historical information on multiple correlated endpoints while maintaining full Bayesian rigor and computational efficiency.

Informative vs. Vague Priors

Informative Initial Prior (all of mu0, kappa0, nu0, Lambda0 provided):

Uses a Normal-Inverse-Wishart (NIW) conjugate prior with hierarchical structure:

$$\boldsymbol{\mu} | \boldsymbol{\Sigma} \sim N_p(\boldsymbol{\mu}_0, \boldsymbol{\Sigma} / \kappa_0)$$ $$\boldsymbol{\Sigma} \sim \text{Inv-Wishart}(\nu_0, \boldsymbol{\Lambda}_0)$$

The power prior parameters are updated:

$$\boldsymbol{\mu}_n = \frac{a_0 m \overline{\mathbf{x}} + \kappa_0 \boldsymbol{\mu}_0}{a_0 m + \kappa_0}$$

$$\kappa_n = a_0 m + \kappa_0$$

$$\nu_n = a_0 m + \nu_0$$

$$\boldsymbol{\Lambda}_n = a_0 \mathbf{S}_x + \boldsymbol{\Lambda}_0 + \frac{\kappa_0 a_0 m}{a_0 m + \kappa_0} (\boldsymbol{\mu}_0 - \overline{\mathbf{x}}) (\boldsymbol{\mu}_0 - \overline{\mathbf{x}})^T$$

where $m$ is sample size, $\overline{\mathbf{x}}$ is the sample mean vector, and $\mathbf{S}_x = \sum_{i=1}^m (\mathbf{x}_i - \overline{\mathbf{x}}) (\mathbf{x}_i - \overline{\mathbf{x}})^T$ is the sum of squares and crossproducts matrix.

Vague (Non-informative) Initial Prior (all of mu0, kappa0, nu0, Lambda0 are NULL):

Uses a non-informative prior $P(\boldsymbol{\mu}, \boldsymbol{\Sigma}) \propto |\boldsymbol{\Sigma}|^{-(p+1)/2}$ that places minimal constraints on parameters. The power prior parameters simplify to:

$$\boldsymbol{\mu}_n = \overline{\mathbf{x}}$$

$$\kappa_n = a_0 m$$

$$\nu_n = a_0 m - 1$$

$$\boldsymbol{\Lambda}_n = a_0 \mathbf{S}_x$$

The vague prior is recommended when there is minimal prior information, or when you want the analysis driven primarily by the historical data.

Parameter Interpretation in Multivariate Setting

Effective Sample Size ($\kappa_n$): Quantifies how much "effective historical data" has been incorporated. The formula $\kappa_n = a_0 m + \kappa_0$ shows the discounted historical sample size combined with prior precision. This controls the concentration of the posterior distribution for the mean vector.

Mean Vector ($\mu_n$): The updated mean is a precision-weighted average: $\boldsymbol{\mu}_n = \frac{a_0 m \overline{\mathbf{x}} + \kappa_0 \boldsymbol{\mu}_0}{a_0 m + \kappa_0}$ This naturally balances the historical sample mean and prior mean, with weights proportional to their respective precisions.

Degrees of Freedom ($\nu_n$): Controls tail behavior and the concentration of the Wishart distribution. Higher values indicate greater confidence in the covariance estimate. The minimum value needed is p (number of variables) for the Inverse-Wishart to be well-defined; $\nu_n \geq p$ is always maintained.

Scale Matrix ($\Lambda_n$): The p x p scale matrix that captures both the dispersion of individual variables and their correlations. The term $\frac{\kappa_0 a_0 m}{a_0 m + \kappa_0} (\boldsymbol{\mu}_0 - \overline{\mathbf{x}}) (\boldsymbol{\mu}_0 - \overline{\mathbf{x}})^T$ adds uncertainty when the historical mean conflicts with the prior mean, naturally reflecting disagreement between data sources.

Practical Considerations

Dimension: This implementation works efficiently for moderate-dimensional problems (typically p $\leq$ 10). For higher dimensions, consider data reduction techniques or structural assumptions on the covariance matrix.

Prior Specification: When specifying Lambda0, ensure it is symmetric positive definite. A simple approach is to use a multiple of the identity matrix (e.g., Lambda0 = diag(p)) for a weakly informative prior.

Discounting: The same a0 parameter is used for all variables and their correlations. If you suspect differential reliability of historical information across variables, consider multiple analyses with different a0 values and sensitivity analyses.

References

Huang, Y., Yamaguchi, Y., Homma, G., Maruo, K., & Takeda, K. (2024). "Conjugate Representation of Power Priors for Efficient Bayesian Analysis of Normal Data." (unpublished).

Ibrahim, J. G., & Chen, M. H. (2000). "Power prior distributions for regression models." Statistical Science, 15(1), 46-60.

Gelman, A., Carlin, J. B., Stern, H. S., et al. (2013). Bayesian Data Analysis (3rd ed.). CRC Press.

Examples

Run this code

# Generate multivariate historical data with correlation
library(MASS)
Sigma_true <- matrix(c(4, 1, 1, 2), 2, 2)
historical <- mvrnorm(50, mu = c(10, 5), Sigma = Sigma_true)

# Compute power prior with vague prior
pp <- powerprior_multivariate(historical, a0 = 0.5)
print(pp)

# Compute power prior with informative prior
pp_inform <- powerprior_multivariate(
  historical,
  a0 = 0.5,
  mu0 = c(10, 5),
  kappa0 = 1,
  nu0 = 5,
  Lambda0 = diag(2)
)
print(pp_inform)

Run the code above in your browser using DataLab