
bayesm (version 3.0-2)

rbayesBLP: Bayesian Analysis of Random Coefficient Logit Models Using Aggregate Data

Description

rbayesBLP implements a hybrid MCMC algorithm for aggregate-level sales data in a market with differentiated products. Version 3.0-1 contains an error in the use of instruments with this function; this is fixed in version 3.0-2.

Usage

rbayesBLP(Data, Prior, Mcmc)

Arguments

Data

list(X,share,J,Z) (X, share, and J: required).

Prior

list(sigmasqR,theta_hat,A,deltabar,Ad,nu0,s0_sq,VOmega) (optional).

Mcmc

list(R,H,initial_theta_bar,initial_r,initial_tau_sq,initial_Omega,initial_delta,s,cand_cov,tol,keep,nprint) (R and H: required).

Value

a list containing

thetabardraw

K by R/keep matrix of random coefficient mean draws

Sigmadraw

K*K by R/keep matrix of random coefficient variance draws

rdraw

K*K by R/keep matrix of \(r\) draws (same information as in Sigmadraw)

tausqdraw

R/keep vector of aggregate demand shock variance draws

Omegadraw

2*2 by R/keep matrix of correlated endogenous shock variance draws

deltadraw

I by R/keep matrix of endogenous structural equation coefficient draws

acceptrate

scalar acceptance rate of the Metropolis-Hastings step

s

scale parameter used for Metropolis-Hastings

cand_cov

var-cov matrix used for Metropolis-Hastings

Details

Model: \(u_{ijt} = X_{jt}\theta_i + \eta_{jt} + e_{ijt}\) with \(e_{ijt} \sim\) type I Extreme Value (logit), \(\theta_i \sim N(\bar{\theta}, \Sigma)\), and \(\eta_{jt} \sim N(0, \tau^2)\). This structure implies a logit model for each consumer (\(\theta\)). Aggregate shares (share) are produced by integrating this consumer-level logit model over the assumed normal distribution of \(\theta\).
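In displayed form, the implied share of product \(j\) in market \(t\) is the logit choice probability averaged over the heterogeneity distribution:

$$ s_{jt} = \int \frac{\exp(X_{jt}\theta + \eta_{jt})}{1 + \sum_{k=1}^{J} \exp(X_{kt}\theta + \eta_{kt})} \, N(\theta \mid \bar{\theta}, \Sigma) \, d\theta $$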

Priors: \(r \sim N(0, \mathrm{diag}(sigmasqR))\); \(\bar{\theta} \sim N(\hat{\theta}, A^{-1})\); \(\tau^2 \sim \nu_0 s_0^2 / \chi^2_{\nu_0}\).

Note: we observe the aggregate-level market shares, not individual-level choices.

Note: \(r\) is the vector of nonzero elements of the Cholesky root of \(\Sigma\). Instead of \(\Sigma\) we draw \(r\), which is in one-to-one correspondence with the positive definite \(\Sigma\).
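As an illustration of this reparameterization, a minimal sketch of rebuilding \(\Sigma\) from \(r\) is shown below; it follows the ordering used for r_true in the Examples section (log-scale diagonal first, then the upper off-diagonal elements of chol(\(\Sigma\)) row by row), which may differ from the package's internal storage:

r_to_Sigma <- function(r, K) {
  # off-diagonal elements of the upper-triangular Cholesky root, in row-wise order
  Lmat <- matrix(0, K, K)
  Lmat[lower.tri(Lmat)] <- r[-(1:K)]
  U <- t(Lmat)
  # diagonal elements are stored on the log scale, so exponentiate
  diag(U) <- exp(r[1:K])
  t(U) %*% U   # Sigma = U'U
}
# e.g., r_to_Sigma(r_true, 4) recovers Sigma_true from the Examples below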

Model (with IV): \(u_{ijt} = X_{jt}\theta_i + \eta_{jt} + e_{ijt}\) with \(e_{ijt} \sim\) type I Extreme Value (logit) and \(\theta_i \sim N(\bar{\theta}, \Sigma)\).

\(X_{jt} = [X_{exo,jt}, X_{endo,jt}]\), \(X_{endo,jt} = Z_{jt}\delta + \zeta_{jt}\), and \((\zeta_{jt}, \eta_{jt})' \sim N(0, \Omega)\).

Priors (with IV): \(r \sim N(0, \mathrm{diag}(sigmasqR))\); \(\bar{\theta} \sim N(\hat{\theta}, A^{-1})\); \(\delta \sim N(deltabar, Ad^{-1})\); \(\Omega \sim IW(\nu_0, VOmega)\).

Step 1 (\(\Sigma\)): Given \(\bar{\theta}\) and \(\tau^2\), draw \(r\) via Metropolis-Hastings, then convert the drawn \(r\) to \(\Sigma\).

Note: if the user does not specify the Metropolis-Hastings increment parameters (s and cand_cov), rbayesBLP automatically tunes them.

Step 2 (\(\bar{\theta}\), \(\tau^2\)): Given \(\Sigma\), draw \(\bar{\theta}\) and \(\tau^2\) via a Gibbs sampler.

Step 2 (with IV: \(\bar{\theta}\), \(\delta\), \(\Omega\)): Given \(\Sigma\), draw \(\bar{\theta}\), \(\delta\), and \(\Omega\) via an IV Gibbs sampler.

List arguments contain:

Data

  • J number of alternatives without outside option

  • X J*T by K matrix (no outside option, which is normalized to 0). If IV is used, the last column is the endogenous variable.

  • share J*T vector (no outside option)

  • Z J*T by I matrix of instrumental variables (optional)

Note: both the share vector and the X matrix are organized by the jt index: j varies faster than t, i.e., (j=1,t=1), (j=2,t=1), ..., (j=J,t=1), ..., (j=J,t=T), as in the sketch below.
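A minimal sketch of assembling share and X in this order (the sizes and values below are placeholders, not part of the function):

J <- 3; T <- 2; K <- 4
idx <- expand.grid(j = 1:J, t = 1:T)              # rows ordered (j=1,t=1), (j=2,t=1), ..., (j=J,t=T)
X <- matrix(rnorm(J*T*K), nrow = J*T, ncol = K)   # row m describes product idx$j[m] in market idx$t[m]
share <- runif(J*T, 0, 1/(J+1))                   # placeholder inside-good shares; the outside good gets the rest
Data_toy <- list(X = X, share = share, J = J)     # ready to pass as the Data argument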

Prior

  • sigmasqR K*(K+1)/2 vector for \(r\) prior variance (def: diffuse prior for \(\Sigma\))

  • theta_hat K vector for \(\bar{\theta}\) prior mean (def: 0 vector)

  • A K by K matrix for \(\bar{\theta}\) prior precision (def: 0.01*diag(K))

  • deltabar I vector for \(\delta\) prior mean (def: 0 vector)

  • Ad I by I matrix for \(\delta\) prior precision (def: 0.01*diag(I))

  • nu0 d.f. parameter for \(\tau^2\) and \(\Omega\) prior (def: K+1)

  • s0_sq scale parameter for \(\tau^2\) prior (def: 1)

  • VOmega 2 by 2 matrix parameter for \(\Omega\) prior (def: matrix(c(1,0.5,0.5,1),2,2))

Mcmc

  • R number of MCMC draws

  • H number of random draws used for Monte-Carlo integration

  • initial_theta_bar initial value of \(\bar{\theta}\) (def: 0 vector)

  • initial_r initial value of \(r\) (def: 0 vector)

  • initial_tau_sq initial value of \(\tau^2\) (def: 0.1)

  • initial_Omega initial value of \(\Omega\) (def: diag(2))

  • initial_delta initial value of \(\delta\) (def: 0 vector)

  • s scale parameter of Metropolis-Hasting increment (def: automatically tuned)

  • cand_cov var-cov matrix of Metropolis-Hasting increment (def: automatically tuned)

  • tol convergence tolerance for the contraction mapping (def: 1e-6)

  • keep MCMC thinning parameter: keep every keep-th draw (def: 1)

  • nprint print the estimated time remaining every nprint'th draw (def: 100)
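For reference, a hypothetical specification that sets several of the optional Prior and Mcmc elements explicitly (K = 4 here is a placeholder for the number of random coefficients; the values are illustrative, not recommendations):

K <- 4
Prior2 <- list(theta_hat = rep(0, K),         # prior mean for theta_bar
               A = 0.01*diag(K),              # prior precision for theta_bar
               nu0 = K + 1, s0_sq = 1)        # tau_sq prior parameters
Mcmc2 <- list(R = 2000, H = 50,
              s = 0.5,                        # fix the M-H scale instead of auto-tuning
              cand_cov = diag(K*(K+1)/2),     # fix the M-H candidate covariance; dim(r) = K*(K+1)/2
              keep = 5, nprint = 500)
## then: rbayesBLP(Data = Data1, Prior = Prior2, Mcmc = Mcmc2), with Data1 as in the Examples below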

Tuning the Metropolis-Hastings algorithm:

r_cand = r_old + s*N(0, cand_cov). Fix the candidate covariance matrix as cand_cov0 = diag(c(rep(0.1, K), rep(1, K*(K-1)/2))). Start from s0 = 2.38/sqrt(dim(r)).

Repeat {
  Run a 500-draw MCMC chain.
  If the acceptance rate < 30%, update s1 = s0/5.
  If the acceptance rate > 50%, update s1 = s0*3.
  (Store the r draws if the acceptance rate is 20~80%.)
  s0 = s1
} until the acceptance rate is 30~50%.

Scale matrix C = s1*sqrt(cand_cov0). Correlation matrix R = Corr(r draws). Use C*R*C as s^2*cand_cov.
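A schematic sketch of this tuning loop follows; it mirrors the recipe above rather than the package's internal code, and run_chain is a hypothetical helper that runs 500 Metropolis-Hastings draws with scale s and candidate covariance cand_cov0 and returns the acceptance rate:

tune_s <- function(run_chain, K) {
  cand_cov0 <- diag(c(rep(0.1, K), rep(1, K*(K-1)/2)))  # fixed candidate covariance
  s <- 2.38 / sqrt(K*(K+1)/2)                           # starting scale; dim(r) = K*(K+1)/2
  repeat {
    acc <- run_chain(s, cand_cov0)        # acceptance rate of a 500-draw chain
    if (acc >= 0.3 && acc <= 0.5) break   # stop once acceptance is 30~50%
    if (acc < 0.3) s <- s/5               # too low: shrink the step
    if (acc > 0.5) s <- s*3               # too high: enlarge the step
  }
  s
}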

References

For further discussion, see "Bayesian Analysis of Random Coefficient Logit Models Using Aggregate Data" by Jiang, Manchanda, and Rossi, Journal of Econometrics, 2009. http://www.sciencedirect.com/science/article/pii/S0304407608002297

Examples


if(nchar(Sys.getenv("LONG_TEST")) != 0) {
###
### Simulate aggregate level data
###
simulData <- function(para, others, Hbatch){
  #
  # Keunwoo Kim, UCLA Anderson
  #
  ### parameters
  theta_bar <- para$theta_bar
  Sigma <- para$Sigma
  tau_sq <- para$tau_sq
	
  T <- others$T	
  J <- others$J	
  p <- others$p	
  H <- others$H	
  K <- J + p	
  
  # Hbatch does the integration for computing market shares in batches of
  #        size Hbatch

  ### build X	
  X <- matrix(runif(T*J*p), T*J, p)
  inter <- NULL
  for (t in 1:T){
    inter <- rbind(inter, diag(J))
  }
  X <- cbind(inter, X)

  ### draw eta ~ N(0, tau_sq)	
  eta <- rnorm(T*J)*sqrt(tau_sq)
  X <- cbind(X, eta)
	
  share <- rep(0, J*T)
  for (HH in 1:(H/Hbatch)){
    ### draw theta ~ N(theta_bar, Sigma)
    cho <- chol(Sigma)
    theta <- matrix(rnorm(K*Hbatch), nrow=K, ncol=Hbatch)
    theta <- t(cho)%*%theta + theta_bar

    ### utility
    V <- X%*%rbind(theta, 1)
    expV <- exp(V)
    expSum <- matrix(colSums(matrix(expV, J, T*Hbatch)), T, Hbatch)
    expSum <- expSum %x% matrix(1, J, 1)
    choiceProb <- expV / (1 + expSum)
    share <- share +  rowSums(choiceProb) / H
  }	
	
  ### drop the last ((K+1)-th) column, eta, which is unobservable
  X	<- X[,c(1:K)]	
  return (list(X=X, share=share))
}

### true parameter
theta_bar_true <- c(-2, -3, -4, -5)
Sigma_true <- rbind(c(3,2,1.5,1),c(2,4,-1,1.5),c(1.5,-1,4,-0.5),c(1,1.5,-0.5,3))
cho <- chol(Sigma_true)
r_true <- c(log(diag(cho)),cho[1,2:4],cho[2,3:4],cho[3,4]) 
tau_sq_true <- 1

### simulate data
set.seed(66)
T <- 300;J <- 3;p <- 1;K <- 4;H <- 1000000;Hbatch <- 5000
dat <- simulData(para=list(theta_bar=theta_bar_true, Sigma=Sigma_true, tau_sq=tau_sq_true),
        others=list(T=T, J=J, p=p, H=H), Hbatch)
X <- dat$X
share <- dat$share

### MCMC run
### (here H is the number of random draws for Monte Carlo integration inside rbayesBLP,
###  not the much larger H used to simulate the data above)
R <- 2000;H <- 50
Data1 <- list(X=X, share=share, J=J)
Mcmc1 <- list(R=R, H=H, nprint=0)
set.seed(66)
out <- rbayesBLP(Data=Data1, Mcmc=Mcmc1)

### acceptance rate
out$acceptrate

### summary of draws
summary(out$thetabardraw)
summary(out$Sigmadraw)
summary(out$tausqdraw)

### plotting draws
plot(out$thetabardraw)
plot(out$Sigmadraw)
plot(out$tausqdraw)
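
### posterior means computed by hand, discarding the first half of the draws
### as burn-in (the burn-in choice here is illustrative only)
burnin <- R/2
rowMeans(out$thetabardraw[, -(1:burnin)])             # compare to theta_bar_true
matrix(rowMeans(out$Sigmadraw[, -(1:burnin)]), K, K)  # compare to Sigma_true
mean(out$tausqdraw[-(1:burnin)])                      # compare to tau_sq_true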
}
