
scalablebayesm (version 0.2)

rhierLinearMixtureParallel: MCMC Algorithm for Hierarchical Linear Model with Mixture-of-Normals Heterogeneity

Description

rhierLinearMixtureParallel implements an MCMC algorithm for a hierarchical linear model with a mixture-of-normals heterogeneity distribution.

Usage

rhierLinearMixtureParallel(Data, Prior, Mcmc, verbose = FALSE)

Value

A list of sharded partitions where each partition contains the following:

compdraw

A list (length: `R/keep`) where each element contains `mu` (vector, length: `ncomp`) and `rooti` (matrix, shape: `ncomp` \(\times\) `ncomp`)

probdraw

A \((R/keep) \times (ncomp)\) matrix that reports the probability that each draw came from a particular component

Deltadraw

A \((R/keep) \times (nz \times nvar)\) matrix of draws of `Delta`; the first row is the initial value. `Deltadraw` is NULL if `Z` is NULL
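
For orientation, here is a minimal sketch of how these pieces can be inspected, assuming `out` is the list returned by running rhierLinearMixtureParallel over the partitions (e.g., via parallel::mclapply as in the Examples); the name `out` and the indexing shown are illustrative only:

# Inspect the draws stored in the first partition
str(out[[1]]$compdraw[[1]])  # component draw(s) at the first kept iteration
dim(out[[1]]$probdraw)       # (R/keep) x ncomp
dim(out[[1]]$Deltadraw)      # (R/keep) x (nz*nvar); NULL when Z is NULL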

Arguments

Data

A list containing: `regdata` - a list of length \(nreg\) of regression data, and optional `Z` - an \(nreg \times nz\) matrix of unit characteristics.

Prior

A list with one required parameter: `ncomp`, and optional parameters: `deltabar`, `Ad`, `mubar`, `Amu`, `nu`, `V`, `nu.e`, and `ssq`.

Mcmc

A list with one required parameter: `R` - the number of MCMC iterations, and optional parameters: `keep` and `nprint`.

verbose

If TRUE, enumerates model parameters and timing information.

Author

Federico Bumbaca, Leeds School of Business, University of Colorado Boulder, federico.bumbaca@colorado.edu

Details

Model and Priors

There are \(nreg\) regression equations, each with \(nvar\) \(X\) variables:
\(y_i = X_i\beta_i + e_i\) with \(e_i\) \(\sim\) \(N(0, \tau_i)\)

\(\tau_i\) \(\sim\) \(nu.e*ssq_i/\chi^2_{nu.e}\) where \(\tau_i\) is the variance of \(e_i\)
\(B = Z\Delta + U\) or \(\beta_i = \Delta' Z[i,]' + u_i\)
\(\Delta\) is an \(nz \times nvar\) matrix

\(Z\) should not include an intercept and should be centered for ease of interpretation. The mean of each of the \(nreg\) \(\beta_i\)s is the mean of the normal mixture. Use summary() to compute this mean from the compdraw output (a sketch follows below).

\(u_i\) \(\sim\) \(N(\mu_{ind}, \Sigma_{ind})\)
\(ind\) \(\sim\) \(multinomial(pvec)\)

\(pvec\) \(\sim\) \(dirichlet(a)\)
\(\delta = vec(\Delta)\) \(\sim\) \(N(deltabar, A_d^{-1})\)
\(\mu_j\) \(\sim\) \(N(mubar, \Sigma_j \otimes Amu^{-1})\)
\(\Sigma_j\) \(\sim\) \(IW(nu, V)\)

Be careful in assessing the prior parameter Amu: 0.01 can be too small for some applications. See Chapter 5 of Rossi et al. for a full discussion.
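
As a rough illustration of the point above about the mixture mean, the sketch below averages the component means over the kept draws for the case where \(Z\) is NULL. It assumes a bayesm-style nesting in which each element of compdraw is itself a list of ncomp components, each carrying $mu, and uses a hypothetical draws object `out` holding the output for one partition:

# Approximate the mean of the heterogeneity distribution from the draws
compdraw <- out[[1]]$compdraw
probdraw <- out[[1]]$probdraw
mixmean <- sapply(seq_along(compdraw), function(r) {
  mus <- sapply(compdraw[[r]], function(comp) comp$mu)  # nvar x ncomp
  as.vector(mus %*% probdraw[r, ])                      # mixture mean at draw r
})
rowMeans(mixmean)  # posterior mean of the normal-mixture mean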

Argument Details

Data = list(regdata, Z) [Z optional]

regdata: A list of length \(nreg\) of regression data
regdata[[i]]$X: \(n_i \times nvar\) design matrix for equation \(i\)
regdata[[i]]$y: \(n_i \times 1\) vector of observations for equation \(i\)
Z: An \((nreg) \times nz\) matrix of unit characteristics

Prior = list(deltabar, Ad, mubar, Amu, nu, V, nu.e, ssq, ncomp) [all but ncomp are optional]

deltabar: \((nz \times nvar) \times 1\) vector of prior means (def: 0)
Ad: prior precision matrix for vec(D) (def: 0.01*I)
mubar: \(nvar \times 1\) prior mean vector for normal component mean (def: 0)
Amu: prior precision for normal component mean (def: 0.01)
nu.e: d.f. parameter for regression error variance prior (def: 3)
nu: d.f. parameter for IW prior on normal component Sigma
V: PDS location parameter for IW prior on normal component Sigma (def: nu*I)
ssq: scale parameter for regression error variance prior (def: var(y_i))
ncomp: number of components used in normal mixture

Mcmc = list(R, keep, nprint) [only R required]

R: number of MCMC draws
keep: MCMC thinning parameter -- keep every keepth draw (def: 1)
nprint: print the estimated time remaining for every nprint'th draw (def: 100, set to 0 for no print)
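
A minimal call skeleton using only the required entries (here `regdata` would be built as in the Examples below; `Z` and all optional Prior/Mcmc entries are omitted, so their defaults apply):

Data  = list(regdata = regdata)   # regdata: list of length nreg with $X and $y
Prior = list(ncomp = 1)           # only ncomp is required
Mcmc  = list(R = 2000, keep = 5)  # keep and nprint are optional
draws = rhierLinearMixtureParallel(Data = Data, Prior = Prior, Mcmc = Mcmc)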

References

Bumbaca, Federico (Rico), Sanjog Misra, and Peter E. Rossi (2020), "Scalable Target Marketing: Distributed Markov Chain Monte Carlo for Bayesian Hierarchical Models", Journal of Marketing Research, 57(6), 999-1018.

Chapter 5, Bayesian Statistics and Marketing by Rossi, Allenby, and McCulloch.

See Also

partition_data, drawPosteriorParallel, combine_draws, rheteroLinearIndepMetrop

Examples



######### Single Component with rhierLinearMixtureParallel #########
R = 500

nreg=1000 #number of cross-sectional units
nobs=5    #number of observations per unit
nvar=3    #number of X variables (including intercept)
nz=2      #number of unit characteristics in Z

Z=matrix(runif(nreg*nz),ncol=nz)
Z=t(t(Z)-apply(Z,2,mean)) #center the columns of Z

Delta=matrix(c(1,-1,2,0,1,0),ncol=nz) #nvar x nz coefficient matrix
tau0=.1 #scale of the regression error variance
iota=c(rep(1,nobs)) #intercept column

# True heterogeneity distribution: a single normal component
tcomps=NULL
a=matrix(c(1,0,0,0.5773503,1.1547005,0,-0.4082483,0.4082483,1.2247449),ncol=3) #inverse Cholesky root of the component covariance
tcomps[[1]]=list(mu=c(0,-1,-2),rooti=a)
tpvec=c(1) #mixture probabilities

regdata=NULL
betas=matrix(double(nreg*nvar),ncol=nvar)
tind=double(nreg)

for (reg in 1:nreg) {
 tempout=bayesm::rmixture(1,tpvec,tcomps) 
 if (is.null(Z)){
   betas[reg,]= as.vector(tempout$x)  
 }else{
   betas[reg,]=Delta %*% Z[reg,]+as.vector(tempout$x)} 
 tind[reg]=tempout$z
 X=cbind(iota,matrix(runif(nobs*(nvar-1)),ncol=(nvar-1))) 
 tau=tau0*runif(1,min=0.5,max=1) 
 y=X %*% betas[reg,]+sqrt(tau)*rnorm(nobs)
 regdata[[reg]]=list(y=y,X=X,beta=betas[reg,],tau=tau)
}


Data1=list(list(regdata=regdata,Z=Z))
s = 1
Data2=scalablebayesm::partition_data(Data1,s=s)

Prior1=list(ncomp=1)
Mcmc1=list(R=R,keep=1)

out2 = parallel::mclapply(Data2, FUN = rhierLinearMixtureParallel, Prior = Prior1,
Mcmc = Mcmc1, mc.cores = s, mc.set.seed = FALSE)
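
# Illustrative check of the returned draws (out2 has one element per partition;
# s = 1 here, so there is a single partition):
length(out2)
dim(out2[[1]]$probdraw)   # (R/keep) x ncomp
dim(out2[[1]]$Deltadraw)  # (R/keep) x (nz*nvar)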

######### Multiple Components with rhierLinearMixtureParallel #########
R = 500

set.seed(66)
nreg=1000 #number of cross-sectional units
nobs=5    #number of observations per unit
nvar=3    #number of X variables (including intercept)
nz=2      #number of unit characteristics in Z

Z=matrix(runif(nreg*nz),ncol=nz)
Z=t(t(Z)-apply(Z,2,mean)) #center the columns of Z

Delta=matrix(c(1,-1,2,0,1,0),ncol=nz) #nvar x nz coefficient matrix
tau0=.1 #scale of the regression error variance
iota=c(rep(1,nobs)) #intercept column

# True heterogeneity distribution: a mixture of three normal components
tcomps=NULL
a=matrix(c(1,0,0,0.5773503,1.1547005,0,-0.4082483,0.4082483,1.2247449),ncol=3) #inverse Cholesky root of the component covariances
tcomps[[1]]=list(mu=c(0,-1,-2),rooti=a)
tcomps[[2]]=list(mu=c(0,-1,-2)*2,rooti=a)
tcomps[[3]]=list(mu=c(0,-1,-2)*4,rooti=a)
tpvec=c(.4,.2,.4) #mixture probabilities

regdata=NULL
betas=matrix(double(nreg*nvar),ncol=nvar)
tind=double(nreg)

for (reg in 1:nreg) {
 tempout=bayesm::rmixture(1,tpvec,tcomps) 
 if (is.null(Z)){
   betas[reg,]= as.vector(tempout$x)  
 }else{
   betas[reg,]=Delta %*% Z[reg,]+as.vector(tempout$x)} 
 tind[reg]=tempout$z
 X=cbind(iota,matrix(runif(nobs*(nvar-1)),ncol=(nvar-1))) 
 tau=tau0*runif(1,min=0.5,max=1) 
 y=X %*% betas[reg,]+sqrt(tau)*rnorm(nobs)
 regdata[[reg]]=list(y=y,X=X,beta=betas[reg,],tau=tau)
}


Data1=list(list(regdata=regdata,Z=Z))
s = 1
Data2=scalablebayesm::partition_data(Data1, s=s)

Prior1=list(ncomp=3)
Mcmc1=list(R=R,keep=1)

set.seed(1)
out2 = parallel::mclapply(Data2, FUN = rhierLinearMixtureParallel, Prior = Prior1,
Mcmc = Mcmc1, mc.cores = s, mc.set.seed = FALSE)
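
# Illustrative check: posterior means of the mixture probabilities from the first
# (only) partition; these should roughly recover tpvec above, up to label switching.
colMeans(out2[[1]]$probdraw)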
