crossval: Cross-Validation of BB Learning

Description

Run multiple fittings of a bbl model, each with a training/validation division of the data.

Usage

crossval(object, lambda = 0.1, lambdah = 0, eps = 0.9, nfold = 5,
  method = "pseudo", naive = FALSE, use.auc = TRUE, verbose = 1,
  useC = TRUE, prior.count = TRUE, progress.bar = FALSE,
  fixL = FALSE, ...)

Arguments

object

Object of class bbl containing data.

lambda

Vector of L2 penalizer values for method = 'pseudo'. Inference will be repeated for each value. Restricted to non-negative values.

lambdah

L2 penalizer in method = 'pseudo' applied to parameter h. In contrast to lambda, only a single value is allowed.

eps

Vector of regularization parameters, \(\epsilon\in[0,1]\), for method = 'mf'. Inference will be repeated for each value.

nfold

Number of folds for training/validation split.

method

Inference method, one of c('pseudo','mf'): 'pseudo' for pseudo-likelihood maximization or 'mf' for mean field.

naive

Use naive Bayes (no interactions). Equivalent to method = 'mf' together with eps = 0.

use.auc

Use AUC as the measure of prediction accuracy. Only works if there are exactly two response groups. If FALSE, the mean prediction accuracy over groups is used as the score.

verbose

Verbosity level. Downgraded when relayed into train.

useC

Use C++ version in predict method of bbl.

prior.count

Use prior count in method = 'mf'.

progress.bar

Display progress bar in predict.

fixL

Do not alter the levels of predictors in training step.

...

Other parameters to mlestimate.

Value

Data frame of regularization parameter values and validation scores.

Details

The data slot of object is split into training and validation subsets in a (nfold-1):1 ratio. For each fold, the model is trained on the training subset and validated on the held-out subset. The results from the individual folds are then combined so that every instance in the data set has a validation prediction, and the prediction score is evaluated against the known response group identities.
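
The splitting and pooled scoring described above can be sketched in base R. This is only an illustration of the general k-fold procedure, not the code used inside crossval: the toy data, the midpoint-threshold classifier, and every variable name below are assumptions made for the example and are not part of bbl.

```r
# Illustrative sketch only: a toy classifier stands in for the bbl model.
set.seed(1)
n <- 100
dat <- data.frame(
  x = c(rnorm(n, mean = 0), rnorm(n, mean = 1)),  # hypothetical numeric predictor
  y = rep(c("control", "case"), each = n)
)

nfold <- 5
# Assign every instance to exactly one fold (random, equal-sized folds)
fold <- sample(rep(seq_len(nfold), length.out = nrow(dat)))

score <- numeric(nrow(dat))
for (k in seq_len(nfold)) {
  val <- fold == k            # validation subset: 1 part
  train <- dat[!val, ]        # training subset: nfold - 1 parts
  # "Train": estimate group means from the training subset only
  mu_case <- mean(train$x[train$y == "case"])
  mu_ctrl <- mean(train$x[train$y == "control"])
  mid <- (mu_case + mu_ctrl) / 2
  # "Predict": positive score favors 'case' for the held-out instances
  score[val] <- sign(mu_case - mu_ctrl) * (dat$x[val] - mid)
}

# Pooled validation results for all instances, scored against known groups
pred <- ifelse(score > 0, "case", "control")
accuracy <- mean(pred == dat$y)

# AUC from the rank-sum (Mann-Whitney) formula; requires binary groups
r <- rank(score)
n1 <- sum(dat$y == "case")
n0 <- sum(dat$y == "control")
auc <- (sum(r[dat$y == "case"]) - n1 * (n1 + 1) / 2) / (n1 * n0)
```

With nfold = 5 and 200 instances, each fold trains on 160 instances and validates on 40, which is the (nfold-1):1 ratio. Because every instance is held out exactly once, accuracy and auc are computed over the whole data set rather than averaged fold by fold.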

Examples

# NOT RUN {
set.seed(513)
m <- 5
n <- 100
predictors <- list()
for(i in 1:m) predictors[[i]] <- c('a','c','g','t')

par0 <- randompar(predictors)
xi0 <- sample_xi(nsample=n, predictors=predictors, h=par0$h, J=par0$J)
par1 <- randompar(predictors, h0=0.1, J0=0.1)
xi1 <- sample_xi(nsample=n, predictors=predictors, h=par1$h, J=par1$J)
xi <- rbind(xi0, xi1)
dat <- cbind(xi, data.frame(y=c(rep('control',n),rep('case',n))))
model <- bbl(data=dat, groups=c('control','case'))

cv <- crossval(object=model, method='mf', eps=seq(0.1,0.9,0.1))
plot(cv, type='b')
# }