
bayesm (version 2.2-1)

rDPGibbs: Density Estimation with Dirichlet Process Prior and Normal Base

Description

rDPGibbs implements a Gibbs Sampler to draw from the posterior for a normal mixture problem with a Dirichlet Process prior. A natural conjugate base prior is used along with priors on the hyper parameters of this distribution. One interpretation of this model is as a normal mixture with a random number of components that can grow with the sample size.
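To make the "random number of components" interpretation concrete, here is an illustrative base-R sketch that generates data from a truncated stick-breaking representation of a DP normal mixture. This shows the kind of model rDPGibbs fits; it is not the sampler itself, and the base measure below (means drawn N(0,9), unit variances) is a simplified stand-in for the conjugate G_0 used by rDPGibbs.

```r
## Illustrative sketch only (base R): a truncated stick-breaking draw from a
## DP normal mixture.  Not rDPGibbs's algorithm; simplified base measure.
set.seed(1)
alpha <- 1    # DP tightness: larger alpha => more components a priori
K     <- 50   # truncation level for the stick-breaking weights
v     <- rbeta(K, 1, alpha)
w     <- v * cumprod(c(1, 1 - v[-K]))     # stick-breaking weights
mu    <- rnorm(K, mean = 0, sd = 3)       # component means from base measure
z     <- sample(K, size = 200, replace = TRUE, prob = w)  # component labels
y     <- rnorm(200, mean = mu[z], sd = 1) # observations
```

With a small alpha, a few components carry almost all of the weight, so the simulated data resemble a finite normal mixture whose effective number of components grows with the sample size.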

Usage

rDPGibbs(Prior, Data, Mcmc)

Arguments

Prior
list(Prioralpha,lambda_hyper)
Data
list(y)
Mcmc
list(R,keep,maxuniq,SCALE,gridsize)

Value

  • nmix: a list containing probdraw, zdraw, and compdraw
  • alphadraw: vector of draws of the DP process tightness parameter alpha
  • nudraw: vector of draws of the base prior hyperparameter nu
  • adraw: vector of draws of the base prior hyperparameter a
  • vdraw: vector of draws of the base prior hyperparameter v

concept

  • bayes
  • MCMC
  • normal mixtures
  • Dirichlet Process
  • Gibbs Sampling

Argument details

  • Prioralpha:
    • Istarmin: expected number of components at the lower bound of the support of alpha
    • Istarmax: expected number of components at the upper bound of the support of alpha
    • power: power parameter for the alpha prior
  • lambda_hyper:
    • alim: defines the support of the a distribution, def: c(.01,10)
    • nulim: defines the support of the nu distribution, def: c(.01,3)
    • vlim: defines the support of the v distribution, def: c(.1,4)
  • Mcmc:
    • R: number of MCMC draws
    • keep: thinning parameter -- keep every keepth draw
    • maxuniq: storage constraint on the number of unique components
    • SCALE: should the data be scaled by mean and standard deviation before posterior draws, def: TRUE
    • gridsize: number of discrete points for hyperparameter priors, def: 20

The basic output is a set of draws from the predictive distribution of the data, stored in the object nmix. The average of these draws is the Bayesian analogue of a density estimate. nmix contains:
  • probdraw: R/keep x 1 matrix of 1s
  • zdraw: R/keep x N matrix of draws of the indicators of which component each observation is assigned to
  • compdraw: list of R/keep draws of normal components

compdraw[[i]] is the ith draw -- a list of lists. compdraw[[i]][[1]] is the list of parameters for a draw from the predictive distribution: compdraw[[i]][[1]][[1]] is the mean vector and compdraw[[i]][[1]][[2]] is the inverse of the Cholesky root R, where $Sigma$ = t(R)%*%R and $R^{-1}$ = compdraw[[i]][[1]][[2]].
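Given the storage convention above (Sigma = t(R)%*%R, with the inverse root R^{-1} stored), a component's covariance matrix can be recovered in base R. The toy `rooti` below is a stand-in for `compdraw[[i]][[1]][[2]]` from a real run:

```r
## Sketch: recover Sigma from a stored inverse Cholesky root.
## 'rooti' stands in for out$nmix$compdraw[[i]][[1]][[2]].
Sigma0 <- matrix(c(2, 0.5, 0.5, 1), nrow = 2)  # example covariance
R0     <- chol(Sigma0)                         # upper Cholesky root: Sigma0 = t(R0) %*% R0
rooti  <- backsolve(R0, diag(2))               # R0^{-1}, the stored quantity
Sigma  <- crossprod(solve(rooti))              # t(R0) %*% R0 recovers Sigma0
```

Inverting `rooti` back to the root and forming `crossprod` reverses the storage convention exactly.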

Details

Model:
  $y_i$ $\sim$ $N(mu_i, Sigma_i)$

Priors:
  $theta_i = (mu_i, Sigma_i)$ $\sim$ $DP(G_0(lambda), alpha)$

  $G_0(lambda)$:
    $mu_i | Sigma_i$ $\sim$ $N(0, Sigma_i (x) a^{-1})$
    $Sigma_i$ $\sim$ $IW(nu, nu*v*I)$

  $lambda(a, nu, v)$:
    $a$ $\sim$ uniform on grid[alim[1], alim[2]]
    $nu$ $\sim$ uniform on grid[dim(data)-1 + exp(nulim[1]), dim(data)-1 + exp(nulim[2])]
    $v$ $\sim$ uniform on grid[vlim[1], vlim[2]]

  $alpha$ $\sim$ $(1-(alpha-alphamin)/(alphamax-alphamin))^{power}$
  alphamin and alphamax are chosen so that the expected number of components is Istarmin when alpha = alphamin and Istarmax when alpha = alphamax.

List arguments: Data contains
  • y: N x k matrix of observations on k-dimensional data
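The link between Istarmin/Istarmax and the support of alpha can be sketched in base R: under a DP(alpha) prior with N observations, the expected number of unique components is sum over i of alpha/(alpha+i-1), and inverting that relation (up to bayesm's internal details, which are an assumption here) pins down alphamin and alphamax. Since exactly 1 expected component corresponds to the alpha -> 0 limit, the sketch uses Istar values of 2 and 10 for illustration:

```r
## Expected number of unique components under DP(alpha) with N observations.
expectedComp <- function(alpha, N) sum(alpha / (alpha + 0:(N - 1)))

## Invert for alpha given a target Istar (illustrative helper, not bayesm's).
alphaFor <- function(Istar, N)
  uniroot(function(a) expectedComp(a, N) - Istar,
          interval = c(1e-6, 100), tol = 1e-9)$root

N <- 200
alphamin <- alphaFor(2, N)   # alpha giving 2 expected components
alphamax <- alphaFor(10, N)  # alpha giving 10 expected components
```

Larger Istar targets map monotonically to larger alpha values, which is why the prior on alpha can be parameterized through expected component counts.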

See Also

rnmixGibbs, rmixture, rmixGibbs, eMixMargDen, momMix, mixDen, mixDenBi

Examples

if(nchar(Sys.getenv("LONG_TEST")) != 0) {R=2000} else {R=10}

## simulate univariate data from Chi-Sq

set.seed(66)
N=200
chisqdf=8; y1=as.matrix(rchisq(N,df=chisqdf))

## set arguments for rDPGibbs

Data1=list(y=y1)
Prioralpha=list(Istarmin=1,Istarmax=10,power=.8)
Prior1=list(Prioralpha=Prioralpha)

Mcmc=list(R=R,keep=1,maxuniq=200)

out1=rDPGibbs(Prior=Prior1,Data=Data1,Mcmc)

if(0){
## plotting examples
rgi=c(0,20); grid=matrix(seq(from=rgi[1],to=rgi[2],length.out=50),ncol=1)
deltax=(rgi[2]-rgi[1])/nrow(grid)
plot(out1$nmix,Grid=grid,Data=y1)
## plot true density with histogram
plot(range(grid[,1]),1.5*range(dchisq(grid[,1],df=chisqdf)),type="n",xlab=paste("Chisq ; ",N,"obs",sep=""), ylab="")
hist(y1,xlim=rgi,freq=FALSE,col="yellow",breaks=20,add=TRUE)
lines(grid[,1],dchisq(grid[,1],df=chisqdf)/(sum(dchisq(grid[,1],df=chisqdf))*deltax),col="blue",lwd=2)
}
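The density estimate produced by plot(out1$nmix, ...) is, per the Value section, the average over MCMC draws of the normal density implied by each compdraw. The base-R sketch below mimics that averaging with a mock draw list; `mockdraws` is a stand-in for out1$nmix$compdraw from a real univariate run, where the mean is compdraw[[i]][[1]][[1]] and the inverse Cholesky root is compdraw[[i]][[1]][[2]]:

```r
## Sketch (base R, univariate): average normal densities over mock MCMC draws.
## 'mockdraws' stands in for out1$nmix$compdraw; not output from a real run.
set.seed(99)
grid <- seq(from = 0, to = 20, length.out = 50)
mockdraws <- lapply(1:100, function(i) {
  sigma <- runif(1, 1.5, 2.5)
  list(list(rnorm(1, mean = 8, sd = 1),  # [[1]]: mean vector (scalar here)
            1 / sigma))                  # [[2]]: inverse Cholesky root
})
dens <- rowMeans(sapply(mockdraws, function(d) {
  mu    <- d[[1]][[1]]
  rooti <- d[[1]][[2]]
  dnorm(grid, mean = mu, sd = 1 / rooti)  # univariate case: sd = 1/rooti
}))
```

Averaging the per-draw densities pointwise on the grid is exactly the "Bayesian analogue of a density estimate" described above.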


## simulate bivariate data from the  "Banana" distribution (Meng and Barnard) 
banana=function(A,B,C1,C2,N,keep=10,init=10)
{
  R=init*keep+N*keep
  x1=x2=0
  bimat=matrix(double(2*N),ncol=2)
  for (r in 1:R) {
    x1=rnorm(1,mean=(B*x2+C1)/(A*(x2^2)+1),sd=sqrt(1/(A*(x2^2)+1)))
    x2=rnorm(1,mean=(B*x1+C2)/(A*(x1^2)+1),sd=sqrt(1/(A*(x1^2)+1)))
    if (r>init*keep && r%%keep==0) {mkeep=r/keep; bimat[mkeep-init,]=c(x1,x2)}
  }
  return(bimat)
}


set.seed(66)
nvar2=2
A=0.5; B=0; C1=C2=3
y2=banana(A=A,B=B,C1=C1,C2=C2,1000)

Data2=list(y=y2)
Prioralpha=list(Istarmin=1,Istarmax=10,power=.8)
Prior2=list(Prioralpha=Prioralpha)
Mcmc=list(R=R,keep=1,maxuniq=200)

out2=rDPGibbs(Prior=Prior2,Data=Data2,Mcmc)


if(0){
## plotting examples

rx1=range(y2[,1]); rx2=range(y2[,2])
x1=seq(from=rx1[1],to=rx1[2],length.out=50)
x2=seq(from=rx2[1],to=rx2[2],length.out=50)
grid=cbind(x1,x2)

plot(out2$nmix,Grid=grid,Data=y2)

## plot true bivariate density
tden=matrix(double(50*50),ncol=50)
for (i in 1:50){ for (j in 1:50) 
      {tden[i,j]=exp(-0.5*(A*(x1[i]^2)*(x2[j]^2)+(x1[i]^2)+(x2[j]^2)-2*B*x1[i]*x2[j]-2*C1*x1[i]-2*C2*x2[j]))}
}
tden=tden/sum(tden)
image(x1,x2,tden,col=terrain.colors(100),xlab="",ylab="")
contour(x1,x2,tden,add=TRUE,drawlabels=FALSE)
title("True Density")
}
