generate_posterior: Generate Posterior Samples from Fitted Lagrangian Multiplier Smoothing Spline

Description

Draws samples from the posterior distribution of model parameters and optionally generates posterior predictive samples. Uses Laplace approximation for non-Gaussian responses.

Usage

generate_posterior(
  object,
  new_sigmasq_tilde = object$sigmasq_tilde,
  new_predictors = object$X[[1]],
  theta_1 = 0,
  theta_2 = 0,
  posterior_predictive_draw = function(N, mean, sqrt_dispersion, ...) {
     rnorm(N,
    mean, sqrt_dispersion)
 },
  draw_dispersion = TRUE,
  include_posterior_predictive = FALSE,
  num_draws = 1,
  ...
)

Value

A list containing the following components:

post_draw_coefficients: List of length num_draws containing posterior coefficient samples.
post_draw_sigmasq: List of length num_draws containing posterior dispersion parameter samples (or repeated point estimate if draw_dispersion = FALSE).
post_pred_draw: List of length num_draws containing posterior predictive samples (only if include_posterior_predictive = TRUE).

Arguments

object

A fitted lgspline model object containing model parameters and fit statistics

new_sigmasq_tilde

Numeric; Dispersion parameter for sampling. Controls variance of posterior draws. Default object$sigmasq_tilde

new_predictors

Matrix; New data matrix for posterior predictive sampling. Should match structure of original predictors. Default = predictors as input to lgspline.

theta_1

Numeric; Shape parameter for prior gamma distribution of inverse-dispersion. Default 0 implies uniform prior

theta_2

Numeric; Rate parameter for prior gamma distribution of inverse-dispersion. Default 0 implies uniform prior

posterior_predictive_draw

Function; Random number generator for posterior predictive samples. Takes arguments:

N: Integer; Number of samples to draw
mean: Numeric vector; Predicted mean values
sqrt_dispersion: Numeric vector; Square root of dispersion parameter
...: Additional arguments to pass through

draw_dispersion

Logical; whether to sample the dispersion parameter from its posterior distribution. When FALSE, uses point estimate. Default TRUE

include_posterior_predictive

Logical; whether to generate posterior predictive samples for new observations. Default FALSE

num_draws

Integer; Number of posterior draws to generate. Default 1

...

Additional arguments passed to internal sampling routines.

Details

Implements posterior sampling using the following approach:

Coefficient posterior: Assumes sqrt(N)B ~ N(Btilde, sigma^2UG)
Dispersion parameter: Sampled from inverse-gamma distribution with user-specified prior parameters (theta_1, theta_2) and model-based sufficient statistics
Posterior predictive: Generated using custom sampling function, defaulting to Gaussian for standard normal responses

For the dispersion parameter, the sampling process follows for a fitted lgspline object "model_fit" (where unbias_dispersion is coerced to 1 if TRUE, 0 if FALSE)


shape <-  theta_1 + 0.5 * (model_fit$N - model_fit$unbias_dispersion * model_fit$trace_XUGX)
rate <- theta_2 + 0.5 * (model_fit$N - model_fit$unbias_dispersion * model_fit$trace_XUGX) * new_sigmasq_tilde
post_draw_sigmasq <- 1/rgamma(1, shape, rate)

Users can modify sufficient statistics by adjusting theta_1 and theta_2 relative to the default model-based values.

Examples

Run this code


## Generate example data
t <- runif(1000, -10, 10)
true_y <- 2*sin(t) + -0.06*t^2
y <- rnorm(length(true_y), true_y, 1)

## Fit model (using unstandardized expansions for consistent inference)
model_fit <- lgspline(t, y,
                      K = 7,
                      standardize_expansions_for_fitting = FALSE)

## Compare Wald (= t-intervals here) to Monte Carlo credible intervals
# Get Wald intervals
wald <- wald_univariate(model_fit,
                        cv = qt(0.975, df = model_fit$trace_XUGX))
wald_bounds <- cbind(wald[["interval_lb"]], wald[["interval_ub"]])

## Generate posterior samples (uniform prior)
post_draws <- generate_posterior(model_fit,
                                 theta_1 = -1,
                                 theta_2 = 0,
                                 num_draws = 2000)

## Convert to matrix and compute credible intervals
post_mat <- Reduce('cbind',
                   lapply(post_draws$post_draw_coefficients,
                          function(x) Reduce("rbind", x)))
post_bounds <- t(apply(post_mat, 1, quantile, c(0.025, 0.975)))

## Compare intervals
print(round(cbind(wald_bounds, post_bounds), 4))