Learn R Programming

EMMIXskew (version 1.0.3)

bootstrap: Bootstrap

Description

The standard error analysis and the bootstrap analysis of -2log(Lambda).

Usage

bootstrap(x,n,p,g,distr,ncov,popPAR,B=99,replace=TRUE,
itmax=1000,epsilon=1e-5)
bootstrap.noc(x,n,p,g1,g2,distr,ncov,B=99,replace=TRUE,
itmax=1000,epsilon=1e-5)

Arguments

n

The number of observations

p

The dimension of data

B

The number of simulated data or replacements to be tried

x

The dataset, an n by p numeric matrix, where n is number of observations and p the dimension of data.

g

The number of components of the mixture model

g1,g2

The range of the number of components of the mixture model

distr

A three letter string indicating the type of distribution to be fit. See Details.

ncov

A small integer indicating the type of covariance structure. See Details.

popPAR

A list with components pro, a numeric vector of the mixing proportion of each component; mu, a p by g matrix with each column as its corresponding mean; sigma, a three dimensional p by p by g array with its jth component matrix (p,p,j) as the covariance matrix for jth component of mixture models; dof, a vector of degrees of freedom for each component; delta, a p by g matrix with its columns corresponding to skew parameter vectors.

replace

A logical value indicating whether replacement to be used

itmax

A big integer specifying the maximum number of iterations to apply

epsilon

A small number used to stop the EM algorithm loop when the relative difference between log-likelihood at each iteration become sufficient small.

Value

bootstrap gives standard errors. bootstrap.noc returns a list with components ret, a B by (g2-g1) matrix of -2log(Lambda), vlk, the loglikehood for each g in the range of g1 to g2, and pvalue, the p-values of g vs g+1. The results of fitting mixture models are stored in curent working directory, which can be used via command in R: obj <- dget("ReturnOf_g_???.ret").

Details

The distribution type, distr, is one of the following values: "mvn" for a multivariate normal, "mvt" for a multivariate t-distribution, "msn" for a multivariate skew normal distribution and "mst" for a multivariate skew t-distribution.

The covariance matrix type, represented by the ncov parameter, may be any one of the following: ncov=1 for a common variance, ncov=2 for a common diagonal variance, ncov=3 for a general variance, ncov =4 for a diagonal variance, ncov=5 for sigma(h)*I(p)(diagonal covariance with same identical diagonal element values).

When replace is FALSE, parametric bootstrap is used; otherwise replacement method is used.

References

McLachlan G.J. and Krishnan T. (2008). The EM Algorithm and Extensions (2nd). New Jersay: Wiley.

McLachlan G.J. and Peel D. (2000). Finite Mixture Models. New York: Wiley.

See Also

EmSkew,rdemmix

Examples

Run this code
# NOT RUN {
n1=300;n2=300;n3=400;
nn <-c(n1,n2,n3)
n <- sum(nn)
p <- 2
g <- 3


sigma<-array(0,c(p,p,g))
for(h in 1:3) sigma[,,h]<-diag(p)

mu  <- cbind(c(4,-4),c(3.5,4),c( 0, 0))

# for other distributions, 
#delta <- cbind(c(3,3),c(1,5),c(-3,1))
#dof    <- c(3,5,5)

distr="mvn"
ncov=3

#first we generate a data set
set.seed(111) #random seed is set 
dat <- rdemmix(nn,p,g,distr,mu,sigma,dof=NULL,delta=NULL)

#start from initial partition
clust<- rep(1:g,nn)
obj <- EmSkewfit1(dat,g,clust,distr,ncov,itmax=1000,epsilon=1e-5)

# do bootstrap (stadard error analysis)

# }
# NOT RUN {
std <- bootstrap(dat,n,p,g,distr,ncov,obj,B=19,
replace=TRUE,itmax=1000,epsilon=1e-5)
print(std)

# do booststrap analysis of -2log(Lambda).

# alternatively data can be input as follow,
# dat <- read.table("mydata.txt",header=TRUE)
# p   <- ncol(dat)
# n   <- nrow(dat)

lad <- bootstrap.noc(dat,n,p,2,4,distr,ncov,B=19,
replace=FALSE,itmax=1000,epsilon=1e-5)
print(lad)

# return of g=2
obj2 <- dget("ReturnOf_g_2.ret")

# return of g=3
obj3 <- dget("ReturnOf_g_3.ret")

# return of g=4
obj4 <- dget("ReturnOf_g_4.ret")

#The posterior probability matrix for (g=3) is obtained by

tau <- obj3$tau


# }
# NOT RUN {
# }

Run the code above in your browser using DataLab