sns: Stochastic Newton Sampler (SNS)

Description

SNS is a Metropolis-Hastings MCMC sampler with a multivariate Gaussian proposal function resulting from a local, second-order Taylor series expansion of log-density. The mean of the Gaussian proposal is identical to the full Newton-Raphson step from the current point. During burn-in, Newton-Raphson optimization can be performed to get close to the mode of the pdf which is unique due to convexity, resulting in faster convergence. For high dimensional densities, state space partitioning can be used to improve mixing. Support for numerical differentiation is provided using numDeriv package. sns is the low-level function for drawing one sample from the distribution. For drawing multiple samples from a (fixed) distribution, consider using sns.run.

Usage

sns(x, fghEval, rnd = TRUE, gfit = NULL, mh.diag = FALSE
  , part = NULL, numderiv = 0
  , numderiv.method = c("Richardson", "simple")
  , numderiv.args = list(), ...)

Value

sns returns the sample drawn as a vector, with attributes:

accept: A boolean indicating whether the proposed point was accepted.
ll: Value of the log-density at the sampled point.
gfit: List containing Gaussian fit to pdf at the sampled point.

Arguments

x: Current state vector.
fghEval: Log-density to be sampled from. A valid log-density can have one of 3 forms: 1) return log-density, but no gradient or Hessian, 2) return a list of f and g for log-density and its gradient vector, respectively, 3) return a list of f, g, and h for log-density, gradient vector, and Hessian matrix. Missing derivatives are computed numerically.
rnd: Runs 1 iteration of Newton-Raphson optimization method (non-stochastic or 'nr' mode) when FALSE. Runs Metropolis-Hastings (stochastic or 'mcmc' mode) for drawing a sample when TRUE.
gfit: Gaussian fit at point init. If NULL then sns will compute a Gaussian fit at x.
mh.diag: Boolean flag, indicating whether detailed MH diagnostics such as components of acceptance test must be returned or not.
part: List describing partitioning of state space into subsets. Each element of the list must be an integer vector containing a set of indexes (between 1 and length(x) or length(init)) indicating which subset of all dimensions to jointly sample. These integer vectors must be mutually exclusive and collectively exhaustive, i.e. cover the entire state space and have no duplicates, in order for the partitioning to represent a valid Gibbs sampling approach. See sns.make.part and sns.check.part.
numderiv: Integer with value from the set 0,1,2. If 0, no numerical differentiation is performed, and thus fghEval is expected to supply f, g and h. If 1, we expect fghEval to provide f amd g, and Hessian will be calculated numerically. If 2, fghEval only returns log-density, and numerical differentiation is needed to calculate gradient and Hessian.
numderiv.method: Method used for numeric differentiation. This is passed to the grad and hessian functions in numDeriv package. See the package documentation for details.
numderiv.args: Arguments to the numeric differentiation method chosen in numderiv.method, passed to grad and hessian functions in numDeriv. See package documentation for details.
...: Other arguments to be passed to fghEval.

Author

Alireza S. Mahani, Asad Hasan, Marshall Jiang, Mansour T.A. Sharabiani

References

Mahani A.S., Hasan A., Jiang M. & Sharabiani M.T.A. (2016). Stochastic Newton Sampler: The R Package sns. Journal of Statistical Software, Code Snippets, 74(2), 1-33. doi:10.18637/jss.v074.c02

Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1), 97-109.

Qi, Y., & Minka, T. P. (2002). Hessian-based markov chain monte-carlo algorithms. 1st Cape Cod Workshop on Monte Carlo Methods.

Examples

Run this code

if (FALSE) {

# using RegressionFactory for generating log-likelihood and its derivatives
library(RegressionFactory)

loglike.poisson <- function(beta, X, y) {
  regfac.expand.1par(beta, X = X, y = y,
                     fbase1 = fbase1.poisson.log)
}

# simulating data
K <- 5
N <- 1000
X <- matrix(runif(N * K, -0.5, +0.5), ncol = K)
beta <- runif(K, -0.5, +0.5)
y <- rpois(N, exp(X %*% beta))

beta.init <- rep(0.0, K)

# glm estimate, for reference
beta.glm <- glm(y ~ X - 1, family = "poisson",
                start = beta.init)$coefficients

# running SNS in non-stochastic mode
# this should produce results very close to glm
beta.sns <- beta.init
for (i in 1:20)
  beta.sns <- sns(beta.sns, loglike.poisson, X = X, y = y, rnd = F)

# comparison
all.equal(as.numeric(beta.glm), as.numeric(beta.sns))

# trying numerical differentiation
loglike.poisson.fonly <- function(beta, X, y) {
  regfac.expand.1par(beta, X = X, y = y, fgh = 0,
                     fbase1 = fbase1.poisson.log)
}

beta.sns.numderiv <- beta.init
for (i in 1:20)
  beta.sns.numderiv <- sns(beta.sns.numderiv, loglike.poisson.fonly
                  , X = X, y = y, rnd = F, numderiv = 2)
all.equal(as.numeric(beta.glm), as.numeric(beta.sns.numderiv))

# add numerical derivatives to fghEval outside sns
loglike.poisson.numaug <- sns.fghEval.numaug(loglike.poisson.fonly
  , numderiv = 2)

beta.sns.numaug <- beta.init
for (i in 1:20)
  # set numderiv to 0 to avoid repeating 
  # numerical augmentation inside sns
  beta.sns.numaug <- sns(beta.sns.numaug, loglike.poisson.numaug
                           , X = X, y = y, rnd = F, numderiv = 0)
all.equal(as.numeric(beta.glm), as.numeric(beta.sns.numaug))

}

Run the code above in your browser using DataLab