ChoiceModelR (version 1.2)

choicemodelr: Choice Modeling in R

Description

Estimates coefficients of a Hierarchical Bayes Multinomial Logit Model

Usage

choicemodelr(data, xcoding, demos, prior, mcmc, constraints, options)

Arguments

data
Required. A data frame. The column variables of the data frame are as follows, where natts is the number of attributes; i.e., independent variables: UnitID Set Alt X_1 ... X_natts y. The first column contains the ID of the unit (e.g. customer or respondent), the second column the choice set number, the third column the alternative number within the choice set, the next natts columns the independent variables X_1 ... X_natts, and the last column (y) the choice outcome. (A small layout sketch follows this argument list.)
xcoding
Required. A vector that specifies the way in which each attribute will be coded: 0 = categorical (effects coded); 1 = continuous (the program mean-centers the variable across the levels appearing in the data). The order of attributes in xcoding must match the order of the attribute columns in data.
demos
An ni by nz matrix of demographic variables or unit characteristics, where ni is the number of units and nz is the number of unit-level demographic or descriptor variables.
prior
list(mubar, Amu, df, v, deltabar, Ad). mubar = prior mean of the distribution of mu; must be a vector of length equal to the number of attributes (default is a vector of zeros). Amu = precision parameter (default is 0.01). df = degrees of freedom parameter for the Inverted Wishart prior on Sigma. v = location parameter for the Inverted Wishart prior on Sigma. deltabar = prior mean for delta (default is a vector of zeros). Ad = prior precision for delta (default is 0.01).
mcmc
Required. A list with 3 arguments: list(R, use, s). R = total number of iterations of the Markov chain Monte Carlo (MCMC) chain to be performed (R is required). use = the number of iterations to be used in parameter estimation (use is required). s = step size for the Metropolis-Hastings random walk (optional).
constraints
A list of matrices containing the values 0, 1, and -1. If specifying constraints, a constraints matrix must be specified for EVERY attribute; simply declare a matrix of 0s for an unconstrained attribute. Each matrix must be square, with dimensions equal to the number of levels of the corresponding attribute (see the Examples for constructing order constraints via the upper triangle).
options
A list with 5 possible arguments: list(none, save, keep, wgt, restart). none: set to TRUE to estimate a none parameter when the data does not include a row for none (i.e., no choice); default is FALSE. save: set to TRUE to save the MCMC draws of the unit-level betas to disk. keep: the thinning parameter; every keep-th of the use iterations is retained. wgt: the weight applied to share (allocation) data in the fractional multinomial logit model. restart: set to TRUE to restart estimation from the draws saved in restart.txt by a previous run.
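A minimal, hypothetical sketch of how these inputs fit together for a two-attribute design (5 and 3 levels). The data values, the coding of X_1, X_2, and y, and the illustrative call are assumptions for illustration only, not package output; consult the datar example data for a complete, valid dataset.

# Hypothetical first choice set in the required column order:
# one row per alternative; UnitID, Set, Alt, then the attributes, then y.
mydata = data.frame(
  UnitID = c(1, 1, 1),    # unit (respondent) ID
  Set    = c(1, 1, 1),    # choice set number within unit
  Alt    = c(1, 2, 3),    # alternative number within choice set
  X_1    = c(1, 3, 5),    # attribute 1 (categorical level index; effects coded by the program)
  X_2    = c(2, 1, 3),    # attribute 2 (categorical level index)
  y      = c(2, 0, 0)     # choice outcome (exact coding convention is an assumption; see data above)
)

# Illustrative argument lists; prior is omitted so the defaults described above apply.
xcoding = c(0, 0)                         # both attributes effects coded
mcmc    = list(R = 4000, use = 2000)      # total draws and draws used for estimation
options = list(none = FALSE, save = TRUE, keep = 5)
# out = choicemodelr(mydata, xcoding, mcmc = mcmc, options = options)  # requires a full dataset to run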

Value

  • betadraw: An ni by natt by floor(use/keep) array of MCMC random draws of unit-level multinomial logit model parameter estimates (a short post-processing sketch follows this list).
  • betadraw.c: An ni by natt by floor(use/keep) array of constrained MCMC random draws of unit-level multinomial logit model parameter estimates.
  • deltadraw: A floor(use/keep) by nz*natt array of MCMC random draws of parameter estimates on covariates to the distribution of heterogeneity.
  • compdraw: A list of floor(use/keep) MCMC random draws of estimates of the means and roots of the multivariate normal distribution of heterogeneity.
  • loglike: A floor(use/keep) vector of likelihoods for the MCMC draws of multinomial logit parameters.
  • Written to Console During Model Estimation: During model estimation, the following statistics are written to the screen after every 100 iterations. The selection of these particular statistics was suggested by Sawtooth Software's technical paper, The CBC/HB System for Hierarchical Bayes Estimation, Version 5.0 Technical Paper (2009). Following Sawtooth Software's approach for certain statistics, we use a weighted average with a weight of 0.01 for the last 100 iterations and 0.99 for the previous iterations.
  • Acceptance: Percent of MCMC draws accepted in the Metropolis-Hastings step.
  • RLH: nth root of the likelihood, where n is the average number of choice tasks (weighted average).
  • Percent Certainty: Percent difference between the log likelihood and the log likelihood of a chance model (weighted average).
  • Average Variance: Average variance of the latest estimates of the model coefficients across all units (weighted average).
  • RMS: Root mean square of the latest estimates of the model coefficients across all units (weighted average).
  • Graphic Output: During model estimation, estimates of mu (the mean of the model coefficients from the distribution of heterogeneity) are plotted in the graphics window.
  • Written to Disk: At the end of model estimation, the averages of the MCMC draws of the unit-level model coefficients are written to Xbetas.csv. A log file documenting run-time output is written to Rlog.txt. The latest MCMC draws are written to restart.txt.
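A short post-processing sketch, assuming out is the list returned by choicemodelr() and using only the components documented above (betadraw and loglike):

postmeans = apply(out$betadraw, c(1, 2), mean)   # ni by natt matrix of posterior means per unit
beta1     = apply(out$betadraw[1, , ], 1, mean)  # posterior mean coefficients for unit 1
plot(out$loglike, type = "l",                    # log-likelihood trace across retained draws,
     xlab = "Retained draw",                     # a quick visual check of convergence
     ylab = "Log likelihood")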

Details

Model:

  Y_ij ~ MNL(beta_i * X_ij)  for all units i and choice sets j
         (X_ij is nvar by 1, where nvar is the number of independent variables)
  beta_i = Z_i' delta + u_i  (beta_i is 1 by nvar)
  Z_i   = a column vector (nz by 1) of unit characteristics variables
  delta = a matrix (nz by nvar) of parameters, where each column corresponds to a column of beta_i
  u_i ~ N(mu, Sigma), a multivariate normal distribution
  mu    = a vector of means of the distribution of heterogeneity, of length nvar
  Sigma = covariance matrix of the distribution of heterogeneity

Priors:

  delta ~ N(deltabar, inverse(Ad))
  mu ~ N(mubar, Sigma * inverse(Amu))
  Sigma ~ IW(nu, V)
  deltabar = nz by nvar matrix of prior means (default 0)
  Ad    = prior precision matrix for delta (default .01I)
  mubar = nvar by 1 prior mean vector for mu (default is a vector of zeros)
  nu    = degrees of freedom parameter for the IW prior on Sigma
  V     = location parameter for the IW prior on Sigma
  Amu   = prior precision for the normal mean (default .01)
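As a small, self-contained illustration of the MNL likelihood above (the design matrix X and coefficient vector beta are made-up values, not package objects):

# Choice probabilities for one choice set under Y_ij ~ MNL(beta_i * X_ij):
# rows of X are alternatives, columns are the nvar independent variables.
X    = matrix(c(1, 0, 0.5,
                0, 1, 1.0,
                0, 0, 1.5), nrow = 3, byrow = TRUE)
beta = c(0.8, -0.4, 0.3)
u    = as.vector(X %*% beta)   # deterministic utility of each alternative
p    = exp(u) / sum(exp(u))    # MNL probabilities; sum(p) equals 1
p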

References

Rossi, Peter; Allenby, Greg M.; and McCulloch, Robert (2005), Bayesian Statistics and Marketing, John Wiley and Sons.

Sawtooth Software (2009), The CBC/HB System for Hierarchical Bayes Estimation, Version 5.0 Technical Paper, www.sawtoothsoftware.com.

Examples

# EXAMPLE 1: MULTINOMIAL LOGIT

# LOAD ARTIFICIAL (SIMULATED) DATA THAT WAS CREATED 
# BY R CODE FOUND IN datar SECTION OF THE HELP FILES.

data(datar)
data(truebetas)

# USE choicemodelr TO ESTIMATE THE PARAMETERS OF THE CHOICE MODEL.
# FOR CONVERGENCE OF THE MCMC CHAIN, SET R = 4000 AND use = 2000.
# (SMALL VALUES ARE USED BELOW ONLY SO THE EXAMPLE RUNS QUICKLY.)

xcoding = c(0, 0)
mcmc = list(R = 10, use = 10)

options = list(none=FALSE, save=TRUE, keep=1)

attlevels = c(5, 3)
constype =  c(0, 1)
constraints = vector("list", 2)

for (i in 1:length(attlevels)) {
	constraints[[i]] = diag(0, attlevels[i])
	if (constype[i] == 1) {
		constraints[[i]][upper.tri(constraints[[i]])] = -1
	}
	else if (constype[i] == 2) {
		constraints[[i]][upper.tri(constraints[[i]])] = 1
	}
}

out = choicemodelr(datar, xcoding, mcmc = mcmc, options = options, constraints = constraints)

# CALCULATE MEAN ABSOLUTE ERROR BETWEEN ESTIMATED AND TRUE BETAS.
estbetas = apply(out$betadraw.c,c(1,2),mean)
estbetas = cbind(estbetas[,1:4],0-apply(estbetas[,1:4],1,sum),estbetas[,5:6],0-apply(estbetas[,5:6],1,sum))
colnames(estbetas) = c("A1B1", "A1B2", "A1B3", "A1B4", "A1B5", "A2B1", "A2B2", "A2B3")

MAE = mean(abs(estbetas - truebetas))
print(MAE)

# CALCULATE MEAN ABSOLUTE ERROR BETWEEN PROBABILITY
# DIFFERENCES USING ESTIMATED AND TRUE BETAS. 

TrueProb = cbind(exp(truebetas[,1:5]) / apply(exp(truebetas[,1:5]),1,sum),
                 exp(truebetas[,6:8]) / apply(exp(truebetas[,6:8]),1,sum))
EstProb = cbind(exp(estbetas[,1:5]) / apply(exp(estbetas[,1:5]),1,sum),
                exp(estbetas[,6:8]) / apply(exp(estbetas[,6:8]),1,sum))
MAEProb = mean(abs(TrueProb - EstProb))

print(MAEProb)
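
# OPTIONAL ADDITION (NOT PART OF THE ORIGINAL EXAMPLE): INSPECT THE
# HETEROGENEITY IN ONE COEFFICIENT ACROSS UNITS, USING THE POSTERIOR
# MEANS COMPUTED ABOVE.
hist(estbetas[, "A1B1"], main = "Unit-level estimates of A1B1", xlab = "Estimate")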


# EXAMPLE 2: FRACTIONAL MULTINOMIAL LOGIT

# LOAD ARTIFICIAL (SIMULATED) FRACTIONAL MULTINOMIAL LOGIT DATA CREATED 
# BY R CODE FOUND IN sharedatar SECTION OF THE HELP FILES.

data(sharedatar)
data(truebetas)

# USE choicemodelr TO ESTIMATE THE PARAMETERS OF THE CHOICE MODEL.
# FOR CONVERGENCE OF THE MCMC CHAIN, SET R = 2000 AND use = 1000.
# (SMALL VALUES ARE USED BELOW ONLY SO THE EXAMPLE RUNS QUICKLY.)

xcoding = c(0, 0)
mcmc = list(R = 10, use = 10)

options = list(none=FALSE, save=TRUE, keep=1)

attlevels = c(5, 3)
constype =  c(0, 1)
constraints = vector("list", 2)

for (i in 1:length(attlevels)) {
	constraints[[i]] = diag(0, attlevels[i])
	if (constype[i] == 1) {
		constraints[[i]][upper.tri(constraints[[i]])] = -1
	}
	else if (constype[i] == 2) {
		constraints[[i]][upper.tri(constraints[[i]])] = 1
	}
}

out = choicemodelr(sharedatar, xcoding, mcmc = mcmc, options = options, constraints = constraints)

# CALCULATE MEAN ABSOLUTE ERROR BETWEEN ESTIMATED AND TRUE BETAS.
estbetas = apply(out$betadraw.c,c(1,2),mean)
estbetas = cbind(estbetas[,1:4],0-apply(estbetas[,1:4],1,sum),estbetas[,5:6],0-apply(estbetas[,5:6],1,sum))
colnames(estbetas) = c("A1B1", "A1B2", "A1B3", "A1B4", "A1B5", "A2B1", "A2B2", "A2B3")

MAE = mean(abs(estbetas - truebetas))
print(MAE)

# CALCULATE MEAN ABSOLUTE ERROR BETWEEN PROBABILITY
# DIFFERENCES USING ESTIMATED AND TRUE BETAS. 

TrueProb = cbind(exp(truebetas[,1:5]) / apply(exp(truebetas[,1:5]),1,sum),
                 exp(truebetas[,6:8]) / apply(exp(truebetas[,6:8]),1,sum))
EstProb = cbind(exp(estbetas[,1:5]) / apply(exp(estbetas[,1:5]),1,sum),
                exp(estbetas[,6:8]) / apply(exp(estbetas[,6:8]),1,sum))
MAEProb = mean(abs(TrueProb - EstProb))

print(MAEProb)
