bmrm: Bayesian Analysis of Multivariate Receptor Modeling

Description

Generate posterior samples of the source composition matrix P, the source contribution matrix A, and the error variance \(\Sigma\) using 'JAGS', and computes estimates of A,P,\(\Sigma\).

Usage

bmrm(Y, q, muP,errdist="norm", df=4,
            varP.free=100, xi=NULL, Omega=NULL,
              a0=0.01, b0=0.01,
             nAdapt=1000, nBurnIn=5000, nIter=5000, nThin=1,
             P.init=NULL, A.init=NULL, Sigma.init=NULL,...)

Value

in bmrm object

nsource: number of sources
nobs: number of observations in data Y
nvar: number of variables in data Y
Y: observed data matrix
muP: prior mean of the source composition matrix P
errdist: error distribution
df: degrees of freedom when errdist="t"
A.hat: posterior mean of the source contribution matrix A
P.hat: posterior mean of the source composition matrix P
Sigma.hat: posterior mean of the error variance Sigma
A.sd: posterior standard deviation of the source contribution matrix A
P.sd: posterior standard deviation of the source composition matrix P
Sigma.sd: posterior standard deviation of the error variance Sigma
A.quantiles: posterior quantlies of A for prob=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)
P.quantiles: posterior quantiles of P for prob=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)
Sigma.quantiles: posterior quantiles of Sigma for prob=(0.025, 0.05, 0.25, 0.5, 0.75, 0.95, 0.975)
Y.hat: predicted value of Y computed from A.hat*P.hat
residual: Y-Y.hat
codaSamples: MCMC posterior samples of A, P, and \(\Sigma\) in class "mcmc.list"
nIter: number of MCMC iterations per chain for monitoring samples from MCMC
nBurnIn: number of iterations for the burn-in period in MCMC
nThin: thinning interval for monitoring samples from MCMC

Arguments

Y: data matrix
q: number of sources. It must be a positive integer.
muP: (q,ncol(Y))-dimensional prior mean matrix for the source composition matrix P, where q is the number of sources. Zeros need to be assigned to prespecified elements of muP to satisfy the identifiability condition C1. For the remaining free elements, any nonnegative numbers (between 0 and 1 preferably) can be assigned. If no or an insufficient number of zeros are preassigned in muP, estimation can still be performed but the resulting estimates may be subject to rotational ambiguities. (default=0.5 for nonzero elements ).
errdist: error distribution: either "norm" for normal distribution or "t" for t distribution (default="norm")
df: degrees of freedom of a t-distribution when errdist="t" (default=4)
varP.free: scalar value of the prior variance of the free (nonzero) elements of the source composition matrix P (default=100)
xi: prior mean vector of the q-dimensional source contribution vector at time t (default=vector of 1's)
Omega: diagonal matrix of the prior variance of the q-dimensional source contribution vector at time t (default=identity matrix)
a0: shape parameter of the Inverse Gamma prior of the error variance (default=0.01)
b0: scale parameter of the Inverse Gamma prior of the error variance (default=0.01)
nAdapt: number of iterations for adaptation in 'JAGS' (default=1000)
nBurnIn: number of iterations for the burn-in period in MCMC (default=5000)
nIter: number of iterations for monitoring samples from MCMC (default=5000). nIter samples are saved in each chain of MCMC.
nThin: thinning interval for monitoring samples from MCMC (default=1)
P.init: initial value of the source composition matrix P. If omitted, zeros are assigned to the elements corresponding to zero elements in muP and the nonzero elements of P.init will be randomly generated from a uniform distrbution.
A.init: initial value of the source contribution matrix A. If omitted, it will be calculated from Y and P.init.
Sigma.init: initial value of the error variance. If omitted, it will be calculated from Y, A.init and P.init.
...: arguments to be passed to methods

Details

Model

The basic model for Bayesian multivariate receptor model is as follows:

\(Y_t=A_t P+E_t, t=1,\cdots,T\),

where

\(Y_t\) is a vector of observations of \(J\) variables at time \(t\), \(t = 1,\cdots,T\).
\(P\) is a \(q \times J\) source composition matrix in which the \(k\)-th row represents the \(k\)-th source composition profiles, \(k=1,\cdots,q\), \(q\) is the number of sources.
\(A_t\) is a \(q\) dimensional source contribution vector at time \(t\), \(t=1,\cdots,T\).
\(E_t =(E_{t1}, \cdots, E_{tJ})\) is an error term for the \(t\)-th observations, following \(E_{t} \sim N(0, \Sigma)\) or \(E_{t} \sim t_{df}(0, \Sigma)\), independently for \(j = 1,\cdots,J\), where \(\Sigma = diag(\sigma_{1}^2,..., \sigma_{J}^2)\).

Priors

Prior distribution of \(A_t\) is given as a truncated multivariate normal distribution,
- \( A_t \sim N(\xi,\Omega) I(A_t \ge 0)\), independently for \(t = 1,\cdots,T\).
Prior distribution of \(P_{kj}\) (the \((k,j)\)-th element of the source composition matrix \(P\)) is given as
- \( P_{kj} \sim N(\code{muP}_{kj} , \code{varP.free} )I(P_{kj} \ge 0)\), for free (nonzero) \(P_{kj}\),
- \( P_{kj} \sim N(0, 1e-10 )I(P_{kj} \ge 0)\), for zero \(P_{kj}\),
  
  independently for \(k = 1,\cdots,q; j = 1,\cdots,J \).
Prior distribution of \(\sigma_j^2\) is \(IG(a0, b0)\), i.e.,
- \(1/\sigma_j^2 \sim Gamma(a0, b0)\), having mean \(a0/b0\), independently for \(j=1,...,J\).

Notes

We use the prior \( P_{kj} \sim N(0, 1e-10 )I(P_{kj} \ge 0)\) that is practically equal to the point mass at 0 to simplify the model building in 'JAGS'.
The MCMC samples of A and P are post-processed (rescaled) before saving so that \( \sum_{j=1}^J P_{kj} =1\) for each \(k=1,...,q\) (the identifiablity condition C3 of Park and Oh (2015).

References

Park, E.S. and Oh, M-S. (2015), Robust Bayesian Multivariate Receptor Modeling, Chemometrics and intelligent laboratory systems, 149, 215-226.

Plummer, M. 2003. JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling. Proceedings of the 3rd international workshop on distributed statistical computing, pp. 125. Technische Universit at Wien, Wien, Austria.

Plummer, M. 2015. 'JAGS' Version 4.0.0 user manual.

Examples

Run this code

# \donttest{
data(Elpaso); Y=Elpaso$Y ; muP=Elpaso$muP ; q=nrow(muP)
out.Elpaso <- bmrm(Y,q,muP)
summary(out.Elpaso)
plot(out.Elpaso)
# }

Run the code above in your browser using DataLab