bapred (version 0.2)

fabatch: Batch effect adjustment using FAbatch

Description

Performs batch effect adjustment using the FAbatch method described in Hornung et al. (2015) and additionally returns the information needed for addon batch effect adjustment with FAbatch.

Usage

fabatch(x, y, batch, nbf = NULL, minerr = 1e-06, 
  probcrossbatch = TRUE, maxiter = 100, maxnbf = 12)

Arguments

x
matrix. The covariate matrix. Observations in rows, variables in columns.
y
factor. Binary target variable. Currently has to have levels '1' and '2'.
batch
factor. Batch variable. Currently has to have levels: '1', '2', '3' and so on.
nbf
integer. Number of factors to estimate in all batches. If not given, the number of factors is estimated automatically for each batch. It is recommended to leave this unspecified.
minerr
numeric. Convergence threshold for the mean squared deviation between the estimated residual variances from two consecutive iterations. The iteration stops as soon as the deviation falls below this value.
probcrossbatch
logical. Default is TRUE. If TRUE, the preliminary probabilities are estimated through leave-one-batch-out cross-validation. If FALSE, ordinary cross-validation is used for estimating the preliminary probabilities.
maxiter
integer. Maximal number of iterations in the estimation of the latent factors by Maximum Likelihood.
maxnbf
integer. Maximal number of factors if nbf is not given. Default is the largest integer smaller than half the number of observations in the corresponding batch.
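
The arguments above place fairly strict requirements on the inputs: x must be a matrix with observations in rows, y a factor with levels '1' and '2', and batch a factor with levels '1', '2', '3' and so on. The following base-R sketch shows one way to bring raw labels into that form. All object names and values (expr_df, y_raw, batch_raw) are hypothetical and only serve as an illustration; they are not part of bapred.

```r
# Hypothetical raw inputs (all names and values are made up for illustration).
expr_df   <- data.frame(gene1 = c(1.2, 0.7, 1.9, 1.1),
                        gene2 = c(3.4, 2.8, 3.1, 2.5))
y_raw     <- c("control", "case", "case", "control")
batch_raw <- c("siteA", "siteB", "siteA", "siteC")

# x: numeric matrix with observations in rows and variables in columns.
x <- as.matrix(expr_df)

# y: factor with levels '1' and '2'. factor() sorts the labels alphabetically,
# so "case" becomes '1' and "control" becomes '2' here.
y <- factor(as.integer(factor(y_raw)), levels = 1:2)

# batch: factor with levels '1', '2', '3', numbered in alphabetical
# order of the original labels.
batch <- factor(as.integer(factor(batch_raw)))

# Basic consistency checks before passing the objects on.
stopifnot(is.matrix(x), nrow(x) == length(y), length(y) == length(batch))
```

Which class ends up as '1' and which as '2' depends on the alphabetical order of the labels, so it is worth checking table(y_raw, y) before relying on the coding.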

Value

fabatch returns an object of class "fabatch", which is a list containing the following components:

  • xadj: matrix of adjusted (training) data
  • m1: means of the standardized variables in class '1'
  • m2: means of the standardized variables in class '2'
  • b0: intercept from the L2-penalized logistic regression performed to estimate the class probabilities
  • b: variable coefficients from the L2-penalized logistic regression performed to estimate the class probabilities
  • pooledsds: vector containing the pooled standard deviations of the variables
  • meanoverall: vector containing the variable means
  • minerr: convergence threshold for the mean squared deviation between the estimated residual variances from two consecutive iterations
  • nbfinput: user-specified number of latent factors nbf in all batches. NULL if nbf was not specified.
  • badvariables: indices of those variables which are constant in at least one batch
  • nbatches: number of batches
  • batch: batch variable
  • nbfvec: vector containing the numbers of factors in the individual batches
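
To make the meaning of badvariables more concrete, the following base-R sketch finds the variables that are constant in at least one batch on a small toy data set. It illustrates the definition given above and is not the package's internal code; the data and the helper name is_constant_somewhere are made up, and the package may apply a different check (for example a numerical tolerance).

```r
# Toy data (made up): 6 observations, 3 variables, 2 batches.
x <- cbind(v1 = c(1, 2, 3, 4, 5, 6),
           v2 = c(5, 5, 5, 1, 2, 3),    # constant within batch 1
           v3 = c(2, 4, 6, 8, 9, 9.5))
batch <- factor(c(1, 1, 1, 2, 2, 2))

# For each variable: TRUE if it takes a single value in at least one batch.
is_constant_somewhere <- apply(x, 2, function(v) {
  any(tapply(v, batch, function(vb) length(unique(vb)) == 1))
})

# Column indices of the affected variables.
bad <- unname(which(is_constant_somewhere))
bad
```

In this toy data only the second column is flagged, because it takes the single value 5 throughout batch 1.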

References

Hornung, R., Boulesteix, A.-L., Causeur, D. (2015) Combining location-and-scale batch effect adjustment with data cleaning by latent factor adjustment. Tech. Rep. 184, Department of Statistics, University of Munich.

Examples

# Load the package and the example data (provides X, batch and y):
library(bapred)
data(autism)

# Random subset of 150 variables:
set.seed(1234)
Xsub <- X[,sample(1:ncol(X), size=150)]

# In cases of batches with more than 20 observations
# select 20 observations at random:
subinds <- unlist(lapply(seq_along(levels(batch)), function(x) {
  indbatch <- which(batch==x)
  if(length(indbatch) > 20)
    indbatch <- sort(sample(indbatch, size=20))
  indbatch
}))
Xsub <- Xsub[subinds,]
batchsub <- batch[subinds]
ysub <- y[subinds]



fabatch(x=Xsub, y=ysub, batch=batchsub)
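
The call above only prints the returned object. As a minimal sketch of how the result might be stored and inspected, the following block assigns it to a variable and looks at a few of the components listed under Value. It assumes that the bapred package is installed and that data(autism) provides X, batch and y as in the example; the names Xsmall and fit are hypothetical. The run time on the full set of observations is not documented here and may be noticeable.

```r
# Sketch only: the body runs when the bapred package is installed.
if (requireNamespace("bapred", quietly = TRUE)) {
  library(bapred)
  data(autism)  # provides X, batch and y, as in the example above

  # Random subset of 150 variables to keep the computation small.
  set.seed(1234)
  Xsmall <- X[, sample(ncol(X), size = 150)]

  # Store the fitted object instead of only printing it.
  fit <- fabatch(x = Xsmall, y = y, batch = batch)

  # Dimensions of the adjusted training data.
  print(dim(fit$xadj))

  # Number of batches, estimated factors per batch, and any variables
  # that were constant in at least one batch.
  str(fit[c("nbatches", "nbfvec", "badvariables")])
}
```

Keeping the fitted object around matters because, as the Description notes, it carries the information needed for addon batch effect adjustment.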
