bfa (version 0.4)

bfa_copula: Initialize and fit a Gaussian copula factor model

Description

This function performs a specified number of MCMC iterations and returns an object containing summary statistics from the MCMC samples as well as the actual samples of factor scores if keep.scores is TRUE. Default behavior is to save only samples of the loadings.

Usage

bfa_copula(x, data = NULL, num.factor = 1, restrict = NA,
  normal.dist = NA, nsim = 10000, nburn = 1000, thin = 1,
  print.status = 500, keep.scores = FALSE,
  loading.prior = c("gdp", "pointmass", "normal"),
  factor.scales = FALSE, px = TRUE, coda = "loadings",
  coda.scale = TRUE, imh = FALSE, imh.iter = 500,
  imh.burn = 500, ...)

Arguments

x
A formula or bfa object.
data
The data, if x is a formula
num.factor
Number of factors
restrict
A matrix or list giving restrictions on factor loadings. A matrix should be the same size as the loadings matrix. Acceptable values are 0 (identically 0), 1 (unrestricted), or 2 (strictly positive). List elements should be character vectors of the form c('variable',1, ">0") where 'variable' is the manifest variable, 1 is the factor, and ">0" is the restriction. Acceptable restrictions are ">0" or "0".
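As a sketch (not from the package documentation), the two forms described above could be constructed like this; the variable names and dimensions are illustrative only:

```r
# Matrix form: one row per manifest variable, one column per factor.
# 1 = unrestricted, 0 = identically zero, 2 = strictly positive.
# Here we restrict the loading of "sugars" on factor 2 to be zero,
# assuming (hypothetically) nine manifest variables and two factors.
R <- matrix(1, nrow = 9, ncol = 2,
            dimnames = list(c("calories", "protein", "fat", "sodium",
                              "fibre", "carbo", "sugars", "shelf",
                              "potassium"), NULL))
R["sugars", 2] <- 0

# List form: each element is c(variable, factor, restriction),
# with restriction "0" or ">0". This expresses the same constraint.
restr <- list(c("sugars", 2, "0"))
```

Either `R` or `restr` would then be supplied as the `restrict` argument.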
normal.dist
A character vector specifying which variables should be treated as observed Gaussian. Defaults to none (a completely semiparametric copula model)
nsim
Number of iterations past burn-in
nburn
Number of initial (burn-in) iterations to discard
thin
Keep every thin-th MCMC sample (i.e., nsim/thin samples are saved)
print.status
How often to print status messages to console
keep.scores
Save samples of factor scores
loading.prior
Specify GDP ("gdp", the default), point mass ("pointmass"), or normal ("normal") priors
factor.scales
Include a separate scale parameter for each factor
px
Use parameter expansion (strongly recommended!)
coda
Create mcmc objects to allow use of functions from the coda package: "all" for loadings and scores, "loadings" or "scores" for one or the other, or "none" for neither
coda.scale
Put the loadings on the correlation scale when creating mcmc objects
imh
Use an Independence Metropolis-Hastings step for discrete margins. If FALSE, the semiparametric extended rank likelihood is used; if TRUE, a uniform prior is placed on the cutpoints
imh.iter
Iterations used to build IMH proposal
imh.burn
Burn-in before collecting samples used to build IMH proposal (total burn-in is nburn+imh.iter+imh.burn)
...
Prior parameters and other (experimental) arguments (see details)

Value

A bfa object with posterior samples.

Details

Additional parameters:
  • loadings.var: Factor loading prior variance
  • tau.a, tau.b: Gamma hyperparameters (scale = 1/b) for the factor precisions (used when factor.scales=TRUE). The default a=b=1 corresponds to a multivariate t prior with 2 degrees of freedom
  • rho.a, rho.b: Beta hyperparameters for point mass prior
  • gdp.alpha, gdp.beta: GDP prior parameters
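A hedged sketch of how these additional parameters might be passed through `...` (the hyperparameter values below are arbitrary, and `mydata` is a hypothetical data frame of manifest variables, not part of the package):

```r
# Illustrative only: prior hyperparameters from the Details list are
# supplied as extra named arguments and forwarded via '...'.
fit <- bfa_copula(~ ., data = mydata, num.factor = 2,
                  loading.prior = "gdp",
                  gdp.alpha = 3, gdp.beta = 1,  # GDP prior parameters
                  factor.scales = TRUE,
                  tau.a = 1, tau.b = 1)         # Gamma(a, scale = 1/b) on factor precisions
```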

Examples

## Not run: 
require(MASS)
require(ggplot2)
data(UScereal)
UScereal$shelf = factor(UScereal$shelf, ordered=TRUE)
UScereal$vitamins = factor(UScereal$vitamins, ordered=TRUE,
                           levels=c("none", "enriched", "100%"))
obj = bfa_copula(~., data=UScereal[,-1], num.factor=2, nsim=10000, nburn=1000, thin=4,
                 restrict=list(c("sugars", 2, "0")), loading.prior="gdp", keep.scores=TRUE,
                 print.status=2500)
biplot(obj, cex=c(0.8, 0.8))
plot(get_coda(obj))
plot(get_coda(obj, loadings=FALSE, scores=TRUE))
hpd.int = HPDinterval(obj, scores=TRUE)

# Sample from the posterior predictive
ps = predict(obj)

m = ggplot(UScereal, aes(x=calories, y=sugars)) + geom_point(position='jitter', alpha=0.5)
m = m + stat_density2d(data=ps, aes(x=calories, y=sugars, color = ..level..), geom='contour')
print(m)

m = ggplot(UScereal, aes(x=calories)) + geom_histogram()
m = m + stat_density(data=ps, aes(x=calories, y=..count..), color='red', fill=NA, adjust=1.3)
m = m + facet_grid(shelf~.)
print(m)

# Conditional distribution estimates can also be computed directly
# by supplying cond.vars
cond.vars = list(shelf=1)
out = predict(obj, resp.var="calories", cond.vars=cond.vars)
plot(sort(unique(UScereal$calories)), apply(out, 2, mean), type='s')
lines(sort(unique(UScereal$calories)), apply(out, 2, quantile, 0.05), type='s', lty=2)
lines(sort(unique(UScereal$calories)), apply(out, 2, quantile, 0.95), type='s', lty=2)
lines(ecdf(UScereal$calories[UScereal$shelf==1]), col='blue')
text(400, 0.1, paste("n =", sum(UScereal$shelf==1)))

out2 = predict(obj, resp.var="calories", cond.vars=list(shelf=2))
out3 = predict(obj, resp.var="calories", cond.vars=list(shelf=3))
plot(sort(unique(UScereal$calories)), apply(out, 2, mean), type='s')
lines(sort(unique(UScereal$calories)), apply(out2, 2, mean), type='s', lty=2)
lines(sort(unique(UScereal$calories)), apply(out3, 2, mean), type='s', lty=3)
## End(Not run)