Learn R Programming

conting (version 1.4)

bayespval: Compute Bayesian p-value

Description

This function will compute the Bayesian (or posterior predictive) p-value. This can be used as a diagnostic tool to check model adequacy. Additionally this function outputs predictions from the model which can also be used in other assessments of model adequacy.

Usage

bayespval(object, n.burnin = 0, thin = 1, statistic = "X2")

Arguments

object
An object of class "bcct" or "bict" object.
n.burnin
An optional argument giving the number of iterations to use as burn-in. The default value is 0.
thin
An optional argument giving the amount of thinning to use, i.e. the computations are based on every thin-th value in the MCMC sample. The default value is 1, i.e. no thinning.
statistic
An optional argument giving the discrepancy statistic to use for calculating the Bayesian p-value. It can be one of c("X2","FreemanTukey","deviance") which correspond to the different statistics: "X2" = Chi-squared statistic,

Value

  • The function will produce an object of class "pval" which is a list with the following components.
  • PREDAn (n.sample-n.burnin) by n* matrix where (n* is the number of observed cell counts) containing the predictions of the observed cell counts.
  • TpredA vector of length (n.sample-n.burnin) containing the discrepancies between the predicted cell counts and their means.
  • TobsA vector of length (n.sample-n.burnin) containing the discrepancies between the observed cell counts and their means.
  • pvalA scalar giving the Bayesian p-value, i.e. the proportion of Tpred>Tobs.
  • statnumA numeric scalar identifying which statistic is used.
  • statisticA character string identifying which statistic is used.
  • thinThe value of the argument thin.

Details

See Gelman et al (2004, Chapter 6) for more details on Bayesian p-values and see Overstall & King (2014), and references therein, for details of their application to contingency tables. The use of thinning is recommended when the number of MCMC iterations and/or the number of log-linear parameters in the maximal model are/is large, which may cause problems with comuter memory storage.

References

Gelman, A., Carlin, J.B., Stern, H.S. & Rubin, D.B. (2004) Bayesian Data Analysis, 2nd edition, Chapman & Hall. Overstall, A.M. & King, R. (2014) conting: An R package for Bayesian analysis of complete and incomplete contingency tables. Journal of Statistical Software, 58 (7), 1--27. http://www.jstatsoft.org/v58/i07/

See Also

bict, bcct, print.pval.

Examples

Run this code
set.seed(1)
## Set seed for reproducibility
data(spina)
## Load spina data

test1<-bict(formula=y~(S1+S2+S3+eth)^2,data=spina,n.sample=50,prior="UIP")
## Do 50 iterations starting at maximal model containing all two-way interactions.

test1p<-bayespval(object=test1,statistic="FreemanTukey",n.burnin=5)
## Use the Freeman-Tukey statistic and a burn-in phase of 5 iterations.
test1p
## Will get following output

#Under the Freeman-Tukey statistic 
#
#Summary statistics for T_pred 
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#  2.812   4.695   5.190   5.777   6.405  14.490 
#
#Summary statistics for T_obs 
#   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#  4.566   4.861   5.197   5.430   6.108   6.460 
#
#Bayesian p-value =  0.4667

## Can do a plot

plot(test1p)

Run the code above in your browser using DataLab