This function runs the CLERE Model. It returns an object of class
'>Clere
. For more details please refer to
clere
.
fitClere(
y,
x,
g = 1,
nItMC = 50,
nItEM = 1000,
nBurn = 200,
dp = 5,
nsamp = 200,
maxit = 500,
tol = 0.001,
nstart = 2,
parallel = FALSE,
seed = NULL,
plotit = FALSE,
sparse = FALSE,
analysis = "fit",
algorithm = "SEM",
theta0 = NULL,
Z0 = NULL
)
[numeric]: The vector of observed responses - size n
.
[matrix]: The matrix of predictors - size n
rows and
p
columns.
[integer]: Either the number or the maximum of groups for fitting CLERE. Maximum number of groups is considered when model selection is required.
[numeric]: Number of Gibbs iterations to generate the
partitions. After the nBurn
iterations, this number is automatically
set to 1
.
[numeric]: Number of SEM iterations.
[numeric]: Number of SEM iterations discarded before calculating the MLE which is averaged over SEM draws.
[numeric]: Number of iterations between sampled partitions when calculating the likelihood at the end of the run.
[numeric]: Number of sampled partitions for calculating the likelihood at the end of the run.
[numeric]: An EM algorithm is used inside the SEM to maximize
the complete log-likelihood p(y, Z|theta). maxit
stands as the
maximum number of EM iterations for the internal EM.
[numeric]: Maximum increased in complete log-likelihood for the internal EM (stopping criterion).
[integer]: Number of random starting points to be used for fitting the model.
[logical]: Should the estimation from nstart
random
starting points run in parallel?
[integer]: An integer given as a seed for random number
generation. If set to NULL
, then a random seed is generated between
1
and 1000
.
[logical]: Should a summary plot (base plot) be drawn after the run?
[logical]: Should a 0
class be imposed to the model?
[character]: Which analysis is to be performed. Values are
"fit"
, "bic"
, "aic"
and "icl"
.
[character]: The algorithm to be chosen to fit the model. Either the SEM-Gibbs algorithm or the MCEM algorithm. The most efficient algorithm being the SEM-Gibbs approach. MCEM is not available for binary response.
[vector(numeric)]: An initial guess of the model parameters.
When considering g components, the length of theta0
must be
2*g+3
and theta0
should be filled as intercept, the b_k's (g
real numbers), the pi_k's (g real numbers summing to 1), sigma^2 and gamma^2
(two positive numbers).
[vector(integer)]: A vector of integers representing an initial
partition for the variables. For 10 variables and 3 groups Z0
can be
defined as \ Z0 = c(rep(0, 2), rep(1, 3), rep(2, 5))
.
Overview : clere-package
Classes : '>Clere
, '>Pacs
Methods : plot
, clusters
, predict
, summary
Functions : fitClere
, fitPacs
Datasets : numExpRealData
, numExpSimData
, algoComp
# NOT RUN {
library(clere)
plotit <- FALSE
sparse <- FALSE
nItEM <- 100
nBurn <- nItEM / 2
nsamp <- 100
analysis <- "fit"
algorithm <- "SEM"
nItMC <- 1
dp <- 2
maxit <- 200
tol <- 1e-3
n <- 50
p <- 50
intercept <- 0
sigma <- 10
gamma <- 10
rho <- 0.5
g <- 5
probs <- c(0.36, 0.28, 0.20, 0.12, 0.04)
Eff <- p * probs
a <- 5
B <- a**(0:(g-1))-1
Z <- matrix(0, nrow = p, ncol = g)
imax <- 0
imin <- 1
for (k in 1:g) {
imin <- imax+1
imax <- imax+Eff[k]
Z[imin:imax, k] <- 1
}
Z <- Z[sample(1:p, p), ]
if (g>1) {
Beta <- rnorm(p, mean = c(Z%*%B), sd = gamma)
} else {
Beta <- rnorm(p, mean = B, sd = gamma)
}
theta0 <- NULL # c(intercept, B, probs, sigma^2, gamma^2)
Z0 <- NULL # apply(Z, 1, which.max)-1
gmax <- 7
## Prediction
eps <- rnorm(n, mean = 0, sd = sigma)
X <- matrix(rnorm(n*p), nrow = n, ncol = p)
Y <- as.numeric(intercept+X%*%Beta+eps)
tt <- system.time(mod <- fitClere(y = Y, x = X, g = gmax,
analysis = analysis,algorithm = algorithm,
plotit = plotit,
sparse = FALSE,nItEM = nItEM,
nBurn = nBurn, nItMC = nItMC,
nsamp = nsamp, theta0 = theta0, Z0 = Z0) )
plot(mod)
Yv <- predict(object = mod, newx = X)
# }
Run the code above in your browser using DataLab