DP: Create objects of type "DP".

Description

Create an object of type "DP", which represents the Dirichlet-Process model structure: pi|alpha ~ DP(alpha,U) z|pi ~ Categorical(pi) theta_z|psi ~ H0(psi) x|theta_z,z ~ F(theta_z) where DP(alpha,U) is a Dirichlet Process on positive integers, alpha is the "concentration parameter" of the Dirichlet Process, U is the "base measure" of this Dirichlet process. The choice of F() and H0() can be arbitrary, they are distributions of x and theta_z correspondingly. In essence, An "DP" object is a combination of a "CatDP" object and an object of any "BasicBayesian" types(such as "GaussianNIG", "GaussianNIW", "CatDirichlet", and event "CatDP"). H0() and F() will be different based on different specification of the "BasicBayesian" object. The "DP" object will be used as a place for recording and accumulating information in the related inference/sampling functions such as posterior(), posteriorDiscard(), MAP() and so on.

Usage

DP(
  objCopy = NULL,
  ENV = parent.frame(),
  gamma = list(alpha = 1, H0aF = "GaussianNIW", parH0 = list(m = 0, k = 1, v = 2, S =
    1))
)

Arguments

objCopy

an object of type "DP". If "objCopy" is not NULL, the function create a new "DP" object by copying the content from objCopy, otherwise this new object will be created by using "ENV" and "gamma". Default NULL.

ENV

environment, specify where the object will be created.

gamma

list, a named list of parameters, gamma=list(alpha,H0aF,parH0). Where gamma$alpha is a numeric value specifying the concentration parameter of the Dirichlet Process. gamma$H0aF is the name of the "BasicBayesian" object which specifies the structure of H0() and F(). gamma$parH0 is the parameters passed to the selected H0aF. For example, if gamma$H0aF="GaussianNIW", then gamma$parH0 should be a named list of NIW parameters: gamma$parH0=list(m,k,v,S), where gamma$parH0$m is a numeric "location" parameter; gamma$parH0$S is a symmetric positive definite matrix representing the "scale" parameters; gamma$parH0$k and gamma$parH0$v are numeric values.

Value

An object of class "DP".

References

Teh, Yee W., et al. "Sharing clusters among related groups: Hierarchical Dirichlet processes." Advances in neural information processing systems. 2005.

Li, Yuelin, Elizabeth Schofield, and Mithat G<U+00F6>nen. "A tutorial on Dirichlet process mixture modeling." Journal of Mathematical Psychology 91 (2019): 128-144.

Examples

Run this code

# NOT RUN {
## This is an example of Gibbs sampling on Gaussian mixture models, using Dirichlet Process.

## generate some Gaussian mixture samples for demo purpose
x <- rbind(
    rGaussian(50,mu = c(-1.5,1.5),Sigma = matrix(c(0.1,0.03,0.03,0.1),2,2)),
    rGaussian(60,mu = c(-1.5,-1.5),Sigma = matrix(c(0.8,0.5,0.5,0.8),2,2)),
    rGaussian(70,mu = c(1.5,1.5),Sigma = matrix(c(0.3,0.05,0.05,0.3),2,2)),
    rGaussian(50,mu = c(1.5,-1.5),Sigma = matrix(c(0.5,-0.08,-0.08,0.5),2,2))
)

## Step1: Initialize----------------------------------------------
maxit <- 100                            #iterative for maxit times
burnin <- 50                            #number of burnin samples
z <- matrix(1L,nrow(x),maxit-burnin)    #labels
## create an "GaussianNIW" object to track all the changes.
obj <- DP(gamma = list(alpha=1,H0aF="GaussianNIW",parH0=list(m=c(0,0),k=0.001,v=2,S=diag(2))))
ss <- sufficientStatistics(obj,x=x,foreach = TRUE) #sufficient statistics of each x
N <- nrow(x)
for(i in 1L:N){ # initialize labels before Gibbs sampling
    z[i,1] <- rPosteriorPredictive(obj = obj,n=1,x=x[i,,drop=FALSE])
    posterior(obj = obj,ss = ss[[i]], z = z[i,1])
}

## Step2: Main Gibbs sampling loop--------------------------------
it <- 0                                 #iteration tracker
pb <- txtProgressBar(min = 0,max = maxit,style = 3)
while(TRUE){
    if(it>burnin) colIdx <- it-burnin
    else colIdx <- 1
    for(i in 1L:N){
        ## remove the sample information from the posterior
        posteriorDiscard(obj = obj,ss = ss[[i]],z=z[i,colIdx])
        ## get a new sample
        z[i,colIdx] <- rPosteriorPredictive(obj = obj,n=1,x=x[i,,drop=FALSE])
        ## add the new sample information to the posterior
        posterior(obj = obj,ss = ss[[i]],z=z[i,colIdx])
    }
    if(it>burnin & colIdx<ncol(z)) z[,colIdx+1] <- z[,colIdx] #copy result of previous iteration
    it <- it+1
    setTxtProgressBar(pb,it)
    if(it>=maxit){cat("\n");break}
    plot(x=x[,1],y=x[,2],col=z[,colIdx]) #to see how the labels change
}

## Step3: Estimate group labels of each observation---------------
col <- apply(z,1,function(l){
    tmp <- table(l)
    ## pick the most frequent group label in the samples to be the best estimate
    names(tmp)[which.max(tmp)]
})
plot(x=x[,1],y=x[,2],col=col)
# }

Run the code above in your browser using DataLab