Fit sparse variational Bayesian proportional hazards models.
svb.fit(
Y,
delta,
X,
lambda = 1,
a0 = 1,
b0 = ncol(X),
mu.init = NULL,
s.init = rep(0.05, ncol(X)),
g.init = rep(0.5, ncol(X)),
maxiter = 1000,
tol = 0.001,
alpha = 1,
center = TRUE,
verbose = TRUE
)
Y: Failure times.
delta: Censoring indicator, 0: censored, 1: uncensored.
X: Design matrix.
lambda: Penalisation parameter, default: lambda=1.
a0: Beta distribution parameter, default: a0=1.
b0: Beta distribution parameter, default: b0=ncol(X).
mu.init: Initial value for the mean of the Gaussian component of the variational family, default: NULL, in which case it is initialised from an elastic-net fit controlled by alpha.
s.init: Initial value for the standard deviations of the Gaussian component of the variational family, default: rep(0.05, ncol(X)).
g.init: Initial value for the inclusion probabilities, default: rep(0.5, ncol(X)).
maxiter: Maximum number of iterations, default: 1000.
tol: Convergence tolerance, default: 0.001.
alpha: The elastic-net mixing parameter used for initialising mu.init. When alpha=1 the lasso penalty is used and when alpha=0 the ridge penalty; values between 0 and 1 give a mixture of the two penalties. Default: 1. A sketch of supplying mu.init by hand is given after this list.
center: Center X prior to fitting; increases numerical stability. Default: TRUE.
verbose: Print additional information. Default: TRUE.
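The default initialisation of mu.init uses an elastic-net fit with mixing parameter alpha. As a rough, hand-rolled alternative (a sketch only, not the package's internal initialisation; the use of cv.glmnet and lambda.min here is an assumption), a starting value can be built with glmnet and passed in explicitly:

library(glmnet)
library(survival)
# elastic-net Cox fit; alpha = 1 corresponds to the lasso penalty
cv <- cv.glmnet(X, survival::Surv(as.numeric(Y), delta), family = "cox", alpha = 1)
mu0 <- as.numeric(coef(cv, s = "lambda.min"))   # initial coefficient vector
f <- survival.svb::svb.fit(Y, delta, X, mu.init = mu0)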
Returns a list containing (a short access example follows the list):
Point estimate for the coefficients.
Posterior inclusion probabilities, i.e. the posterior probability that a coefficient is non-zero.
Final values for the means of the Gaussian component of the variational family.
Final values for the standard deviations of the Gaussian component of the variational family.
Final values for the inclusion probabilities.
Value of lambda used.
Value of a0 used.
Value of b0 used.
Describes whether the algorithm converged.
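For example, the fitted object can be inspected as follows (beta_hat and inclusion_prob are the element names used in the example code further down; the 0.5 cut-off is only an illustrative selection rule):

f <- survival.svb::svb.fit(Y, delta, X)
f$beta_hat                                  # point estimates of the coefficients
f$inclusion_prob                            # posterior inclusion probabilities
selected <- which(f$inclusion_prob > 0.5)   # indices with inclusion probability above 0.5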
Rather than compute the posterior using MCMC, we approximate it using variational inference. Variational inference re-casts Bayesian inference as an optimisation problem, in which we minimise the Kullback-Leibler (KL) divergence between a family of tractable distributions and the posterior.
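In standard notation this amounts to choosing (a sketch of the usual variational objective; the symbols are mine, not the package documentation's)

\tilde{\Pi} = \operatorname*{arg\,min}_{Q \in \mathcal{Q}} \mathrm{KL}\big( Q \,\|\, \Pi(\cdot \mid \mathcal{D}) \big),

where \mathcal{Q} is the variational family, here a mean-field spike-and-slab family whose Gaussian components have means, standard deviations and inclusion probabilities corresponding to the mu.init, s.init and g.init arguments above, and \Pi(\cdot \mid \mathcal{D}) is the posterior given the data \mathcal{D}.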
# NOT RUN {
n <- 125 # number of samples
p <- 250 # number of features
s <- 5 # number of non-zero coefficients
censoring_lvl <- 0.25 # degree of censoring
# generate some test data
set.seed(1)
b <- sample(c(runif(s, -2, 2), rep(0, p-s)))
X <- matrix(rnorm(n * p), nrow=n)
Y <- log(1 - runif(n)) / -exp(X %*% b)
delta <- runif(n) > censoring_lvl # 0: censored, 1: uncensored
Y[!delta] <- Y[!delta] * runif(sum(!delta)) # rescale censored data
# fit the model
f <- survival.svb::svb.fit(Y, delta, X, mu.init=rep(0, p))
# }
# NOT RUN {
## Larger Example
n <- 250 # number of samples
p <- 1000 # number of features
s <- 10 # number of non-zero coefficients
censoring_lvl <- 0.4 # degree of censoring
# generate some test data
set.seed(1)
b <- sample(c(runif(s, -2, 2), rep(0, p-s)))
X <- matrix(rnorm(n * p), nrow=n)
Y <- log(1 - runif(n)) / -exp(X %*% b)
delta <- runif(n) > censoring_lvl # 0: censored, 1: uncensored
Y[!delta] <- Y[!delta] * runif(sum(!delta)) # rescale censored data
# fit the model
f <- survival.svb::svb.fit(Y, delta, X)
# plot the results
plot(b, xlab=expression(beta), main="Coefficient value", pch=8, ylim=c(-2,2))
points(f$beta_hat, pch=20, col=2)
legend("topleft", legend=c(expression(beta), expression(hat(beta))),
pch=c(8, 20), col=c(1, 2))
plot(f$inclusion_prob, main="Inclusion Probabilities", ylab=expression(gamma))
# }