This function generates a sample from the posterior distribution of a mixed data (both continuous and ordinal) factor analysis model. Normal priors are assumed on the factor loadings and factor scores, improper uniform priors are assumed on the cutpoints, and inverse gamma priors are assumed for the error variances (uniquenesses). The user supplies data and parameters for the prior distributions, and a sample from the posterior distribution is returned as an mcmc object, which can be subsequently analyzed with functions provided in the coda package.
MCMCmixfactanal(x, factors, lambda.constraints = list(),
  data = parent.frame(), burnin = 1000, mcmc = 20000, thin = 1,
  tune = NA, verbose = 0, seed = NA, lambda.start = NA,
  psi.start = NA, l0 = 0, L0 = 0, a0 = 0.001, b0 = 0.001,
  store.lambda = TRUE, store.scores = FALSE, std.mean = TRUE,
  std.var = TRUE, ...)
A one-sided formula containing the manifest variables. Ordinal
(including dichotomous) variables must be coded as ordered factors. Each
level of these ordered factors must be present in the data passed to the
function. NOTE: data input is different in MCMCmixfactanal than in either MCMCfactanal or MCMCordfactanal.
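As a minimal sketch of the input requirements above, the following base R code prepares one ordinal and one continuous variable. The data frame `survey` and its columns `trust` and `income` are hypothetical names chosen only for illustration.

```r
# Hypothetical survey data: one ordinal item and one continuous measure.
survey <- data.frame(
  trust  = c("low", "high", "medium", "low", "medium"),
  income = c(31.5, 58.2, 44.0, 27.9, 49.3)
)

# Code the ordinal item as an ordered factor with an explicit level order.
survey$trust <- factor(survey$trust,
                       levels = c("low", "medium", "high"),
                       ordered = TRUE)

# Every declared level must actually occur in the data.
level_counts <- table(survey$trust)
all_levels_present <- all(level_counts > 0)

# The one-sided formula that would be passed as x.
model_formula <- ~ trust + income
```

If `all_levels_present` were FALSE, one option would be to call `droplevels()` on the factor before fitting.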
The number of factors to be fitted.
List of lists specifying possible equality or simple inequality constraints on the factor loadings. A typical entry in the list has one of three forms: varname=list(d,c), which constrains the dth loading for the variable named varname to be equal to c; varname=list(d,"+"), which constrains the dth loading for the variable named varname to be positive; and varname=list(d,"-"), which constrains the dth loading for the variable named varname to be negative. If x is a matrix without column names, default names of "V1", "V2", ..., etc. will be used. Note that, unlike MCMCfactanal, the Lambda matrix used here has factors+1 columns. The first column of Lambda corresponds to negative item difficulty parameters for ordinal manifest variables and mean parameters for continuous manifest variables, and should generally not be constrained directly by the user.
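The three constraint forms can be sketched as a plain R list. The variable names `item1`, `item2`, and `item3` are hypothetical; the sketch assumes a one-factor model, so that the single substantive loading sits in column 2 of Lambda (column 1 being the extra column this function adds).

```r
# Hypothetical manifest variable names, used only for illustration.
constraints <- list(
  item1 = list(2, 1),    # equality: 2nd loading of item1 fixed at 1
  item2 = list(2, "+"),  # inequality: 2nd loading of item2 positive
  item3 = list(2, "-")   # inequality: 2nd loading of item3 negative
)

# Extract which column of Lambda each entry constrains.
constrained_cols <- vapply(constraints, function(z) z[[1]], numeric(1))
```

A list of this shape is what would be supplied as the lambda.constraints argument.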
A data frame.
The number of burn-in iterations for the sampler.
The number of iterations for the sampler.
The thinning interval used in the simulation. The number of iterations must be divisible by this value.
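A small arithmetic check for the iteration settings, using the values from the first example on this page. It assumes, as is conventional for thinned chains, that every thin-th post-burn-in draw is kept.

```r
burnin <- 5000
mcmc   <- 1000000
thin   <- 50

# The number of iterations must be divisible by the thinning interval.
stopifnot(mcmc %% thin == 0)

# Number of draws retained under the assumption stated above.
n_stored <- mcmc / thin
```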
The tuning parameter for the Metropolis-Hastings sampling. Can be either a scalar or a k-vector (where k is the number of manifest variables). tune must be strictly positive.
A switch which determines whether or not the progress of the sampler is printed to the screen. If verbose is greater than 0, the iteration number and the Metropolis-Hastings acceptance rate are printed to the screen every verbose-th iteration.
The seed for the random number generator. If NA, the Mersenne Twister generator is used with default seed 12345; if an integer is passed, it is used to seed the Mersenne Twister. The user can also pass a list of length two to use the L'Ecuyer random number generator, which is suitable for parallel computation. The first element of the list is the L'Ecuyer seed, which is a vector of length six or NA (if NA, a default seed of rep(12345,6) is used). The second element of the list is a positive substream number. See the MCMCpack specification for more details.
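The accepted forms of the seed argument can be written out as follows. The helper `is_lecuyer_seed` is hypothetical (it is not part of MCMCpack) and only checks the shape of the list form described above.

```r
seed_default  <- NA                      # Mersenne Twister, default seed 12345
seed_integer  <- 8675309                 # Mersenne Twister, arbitrary user-chosen seed
seed_lecuyer  <- list(rep(12345, 6), 1)  # L'Ecuyer, explicit seed, substream 1
seed_lecuyer2 <- list(NA, 2)             # L'Ecuyer, default seed, substream 2

# Hypothetical helper: TRUE if `seed` has the two-element list form.
is_lecuyer_seed <- function(seed) {
  if (!is.list(seed) || length(seed) != 2) return(FALSE)
  s <- seed[[1]]
  k <- seed[[2]]
  seed_ok   <- (length(s) == 1 && is.na(s)) || (is.numeric(s) && length(s) == 6)
  stream_ok <- is.numeric(k) && length(k) == 1 && k > 0
  seed_ok && stream_ok
}
```

Giving each parallel worker the same L'Ecuyer seed but a different substream number is the usual way to use the list form.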
Starting values for the factor loading matrix Lambda. If lambda.start is set to a scalar, the starting value for all unconstrained loadings will be set to that scalar. If lambda.start is a matrix of the same dimensions as Lambda, then the lambda.start matrix is used as the starting values (except for equality-constrained elements). If lambda.start is set to NA (the default), then starting values for unconstrained elements in the first column of Lambda are based on the observed response pattern, the remaining unconstrained elements of Lambda are set to 0, and starting values for inequality-constrained elements are set to either 1.0 or -1.0 depending on the nature of the constraints.
Starting values for the error variance (uniqueness) matrix. If psi.start is set to a scalar, then the starting values for all diagonal elements of Psi that represent error variances for continuous variables are set to this value. If psi.start is a k-vector (where k is the number of manifest variables), then the starting value of Psi has psi.start on the main diagonal, with the exception that entries corresponding to error variances for ordinal variables are set to 1. If psi.start is set to NA (the default), the starting values of all the continuous variable uniquenesses are set to 0.5. Error variances for ordinal response variables are always constrained (regardless of the value of psi.start) to be 1 in order to achieve identification.
The means of the independent Normal prior on the factor loadings. Can be either a scalar or a matrix with the same dimensions as Lambda.
The precisions (inverse variances) of the independent Normal prior on the factor loadings. Can be either a scalar or a matrix with the same dimensions as Lambda.
Controls the shape of the inverse Gamma prior on the uniquenesses. The actual shape parameter is set to a0/2. Can be either a scalar or a k-vector.
Controls the scale of the inverse Gamma prior on the uniquenesses. The actual scale parameter is set to b0/2. Can be either a scalar or a k-vector.
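To see what a given pair (a0, b0) implies, the inverse Gamma density with shape a0/2 and scale b0/2 can be sketched in base R. The function `dinvgamma_sketch` is a hypothetical helper written for this illustration, and the values a0 = 6 and b0 = 4 are arbitrary; they are not the defaults.

```r
# Inverse Gamma density, computed on the log scale for stability:
# f(x) = scale^shape / Gamma(shape) * x^(-shape - 1) * exp(-scale / x)
dinvgamma_sketch <- function(x, shape, scale) {
  exp(shape * log(scale) - lgamma(shape) - (shape + 1) * log(x) - scale / x)
}

a0 <- 6
b0 <- 4
shape <- a0 / 2  # actual shape parameter
scale <- b0 / 2  # actual scale parameter

# The prior mean exists only when shape > 1.
prior_mean <- scale / (shape - 1)

# Sanity check: the density should integrate to approximately 1.
total_mass <- integrate(dinvgamma_sketch, lower = 0, upper = Inf,
                        shape = shape, scale = scale)$value
```

With the defaults a0 = b0 = 0.001 the shape is 0.0005, so the prior mean does not exist and the prior is very diffuse.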
A switch that determines whether or not to store the factor loadings for posterior analysis. By default, the factor loadings are all stored.
A switch that determines whether or not to store the factor scores for posterior analysis. NOTE: This takes an enormous amount of memory, so should only be used if the chain is thinned heavily, or for applications with a small number of observations. By default, the factor scores are not stored.
If TRUE (the default), the continuous manifest variables are rescaled to have zero mean.
If TRUE (the default), the continuous manifest variables are rescaled to have unit variance.
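A sketch of what zero-mean, unit-variance rescaling looks like in base R. Whether the package uses exactly the sample standard deviation (n - 1 denominator) computed by `scale()` is an assumption here; the vector `x_cont` is hypothetical.

```r
# Hypothetical continuous manifest variable.
x_cont <- c(2, 4, 6, 8)

# Center to zero mean and scale to unit (sample) variance.
z <- as.numeric(scale(x_cont, center = TRUE, scale = TRUE))

z_mean <- mean(z)
z_var  <- var(z)
```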
Further arguments to be passed.
An mcmc object that contains the posterior sample. This object can be summarized by functions provided by the coda package.
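The model takes the following form:
Let i = 1, ..., n index observations and j = 1, ..., k index response variables within an observation. An observed variable x_ij can be either ordinal with a total of C_j categories or continuous. The distribution of X is governed by an n by k matrix of latent variables Xstar and a series of cutpoints gamma. Xstar is assumed to be generated according to:
    xstar_i = Lambda phi_i + epsilon_i
    epsilon_i ~ N(0, Psi)
where xstar_i is the k-vector of latent variables specific to observation i, Lambda is the k by d matrix of factor loadings (with d = factors + 1), and phi_i is the d-vector of latent factor scores. It is assumed that the first element of phi_i is equal to 1 for all i.
If the jth variable is ordinal, the probability that it takes the value c in observation i is:
    pi_ijc = Phi(gamma_jc - Lambda_j' phi_i) - Phi(gamma_j(c-1) - Lambda_j' phi_i)
If the jth variable is continuous, it is assumed that xstar_ij = x_ij for all i.
The implementation used here assumes independent conjugate priors for each element of Lambda and each phi_i. More specifically, we assume:
    Lambda_ij ~ N(l0_ij, L0_ij^-1),  i = 1, ..., k,  j = 1, ..., d
    phi_i(2:d) ~ N(0, I),  i = 1, ..., n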
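MCMCmixfactanal simulates from the posterior distribution using a Metropolis-Hastings within Gibbs sampling algorithm. The algorithm employed is based on work by Cowles (1996). Note that the first element of each factor score vector phi_i is a 1. As a result, the first column of Lambda can be interpreted as negative item difficulty parameters. Further, the first cutpoint gamma_1 is normalized to zero, and thus not returned in the mcmc object.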
As is the case with all measurement models, make sure that you have plenty of free memory, especially when storing the scores.
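To make the data-generating structure concrete, the sketch below simulates a small mixed data set in base R. It assumes the latent-variable form xstar_i = Lambda phi_i + epsilon_i with the first element of phi_i fixed at 1, a unit error variance for the ordinal variable, and a first finite cutpoint of zero. All numeric values, and the names `y1`, `y2`, and `y3`, are hypothetical.

```r
set.seed(1)
n <- 200  # number of observations

# k = 3 manifest variables, d = 2 columns (constant column + 1 factor).
Lambda <- rbind(c( 0.0, 1.0),
                c( 0.5, 0.8),
                c(-0.3, 1.2))
psi <- c(0.5, 0.5, 1)  # third variable is ordinal: variance fixed at 1

# Factor scores: first element is 1, the rest are standard Normal.
phi <- cbind(1, rnorm(n))

# Latent responses: each row is Lambda %*% phi_i plus N(0, Psi) noise.
eps   <- matrix(rnorm(n * 3), n, 3) %*% diag(sqrt(psi))
xstar <- phi %*% t(Lambda) + eps

# Cutpoints for a 3-category ordinal variable (first finite cutpoint is 0).
gamma3 <- c(-Inf, 0, 1, Inf)
y3 <- ordered(cut(xstar[, 3], breaks = gamma3, labels = FALSE))

sim <- data.frame(y1 = xstar[, 1], y2 = xstar[, 2], y3 = y3)

# Category probabilities for observation 1 on the ordinal variable:
# Phi(gamma_c - Lambda_j' phi_i) - Phi(gamma_(c-1) - Lambda_j' phi_i)
eta   <- sum(Lambda[3, ] * phi[1, ])
probs <- diff(pnorm(gamma3 - eta))
```

A data frame like `sim` has the shape this function expects: continuous columns left numeric and the ordinal column coded as an ordered factor.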
Kevin M. Quinn. 2004. "Bayesian Factor Analysis for Mixed Ordinal and Continuous Responses." Political Analysis. 12: 338-353.
Andrew D. Martin, Kevin M. Quinn, and Jong Hee Park. 2011. "MCMCpack: Markov Chain Monte Carlo in R." Journal of Statistical Software. 42(9): 1-21. http://www.jstatsoft.org/v42/i09/.
M. K. Cowles. 1996. "Accelerating Monte Carlo Markov Chain Convergence for Cumulative-link Generalized Linear Models." Statistics and Computing. 6: 101-110.
Valen E. Johnson and James H. Albert. 1999. Ordinal Data Modeling. Springer: New York.
Daniel Pemstein, Kevin M. Quinn, and Andrew D. Martin. 2007. Scythe Statistical Library 1.0. http://scythe.wustl.edu.
Martyn Plummer, Nicky Best, Kate Cowles, and Karen Vines. 2006. "Output Analysis and Diagnostics for MCMC (CODA)." R News. 6(1): 7-11. https://CRAN.R-project.org/doc/Rnews/Rnews_2006-1.pdf.
plot.mcmc, summary.mcmc, factanal, MCMCfactanal, MCMCordfactanal, MCMCirt1d, MCMCirtKd
# NOT RUN {
data(PErisk)

post <- MCMCmixfactanal(~courts+barb2+prsexp2+prscorr2+gdpw2,
                        factors=1, data=PErisk,
                        lambda.constraints = list(courts=list(2,"-")),
                        burnin=5000, mcmc=1000000, thin=50,
                        verbose=500, L0=.25, store.lambda=TRUE,
                        store.scores=TRUE, tune=1.2)
plot(post)
summary(post)

library(MASS)
data(Cars93)

# select the manifest variables without attaching the data frame
new.cars <- Cars93[, c("Price", "MPG.city", "MPG.highway",
                       "Cylinders", "EngineSize", "Horsepower",
                       "RPM", "Length", "Wheelbase", "Width",
                       "Weight", "Origin")]
rownames(new.cars) <- paste(Cars93$Manufacturer, Cars93$Model)

# drop obs 57 (Mazda RX 7) b/c it has a rotary engine
new.cars <- new.cars[-57,]
# drop 3 cylinder cars
new.cars <- new.cars[new.cars$Cylinders!=3,]
# drop 5 cylinder cars
new.cars <- new.cars[new.cars$Cylinders!=5,]

# log-transform the skewed continuous variables
new.cars$log.Price <- log(new.cars$Price)
new.cars$log.MPG.city <- log(new.cars$MPG.city)
new.cars$log.MPG.highway <- log(new.cars$MPG.highway)
new.cars$log.EngineSize <- log(new.cars$EngineSize)
new.cars$log.Horsepower <- log(new.cars$Horsepower)

# code the ordinal variables as ordered factors; ordered() also drops
# the levels emptied by the row deletions above
new.cars$Cylinders <- ordered(new.cars$Cylinders)
new.cars$Origin <- ordered(new.cars$Origin)

# constraint names must match the variable names exactly (Weight, not weight)
post <- MCMCmixfactanal(~log.Price+log.MPG.city+
                        log.MPG.highway+Cylinders+log.EngineSize+
                        log.Horsepower+RPM+Length+
                        Wheelbase+Width+Weight+Origin, data=new.cars,
                        lambda.constraints=list(log.Horsepower=list(2,"+"),
                          log.Horsepower=list(3,0), Weight=list(3,"+")),
                        factors=2,
                        burnin=5000, mcmc=500000, thin=100, verbose=500,
                        L0=.25, tune=3.0)
plot(post)
summary(post)
# }