This data set contains copy number variation data with $n=500$ observations and $p=1000$ features. The length-$n$ batch vector (the batch element of caseDat) indicates the batch for each sample.
Usage
data(caseDat)
Format
A list with two objects:
batch
A numeric vector of length $n=500$ giving the batch of each sample.
data
A matrix of $n=500$ samples and $p=1000$ features.
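The structure described above can be checked with a few lines of R. The following is only a sketch: `describe_batch_data` is a hypothetical helper, not a function from the package, and it assumes that samples are stored in the rows of the data matrix, which the format description suggests but does not state outright.

```r
# Hypothetical helper, not part of the package: summarizes a list shaped like
# caseDat (a `batch` vector plus a `data` matrix).
describe_batch_data <- function(x) {
  stopifnot(is.list(x), all(c("batch", "data") %in% names(x)))
  stopifnot(is.matrix(x$data))
  list(
    n_samples   = length(x$batch),   # one batch label per sample
    data_dim    = dim(x$data),       # c(rows, columns) of the data matrix
    # Assumption: samples are stored in rows, so rows should match the labels.
    rows_match  = nrow(x$data) == length(x$batch),
    batch_sizes = table(x$batch)     # number of samples in each batch
  )
}

# Typical use once the data set is loaded:
# data(caseDat)
# str(describe_batch_data(caseDat))
```

If `rows_match` comes back `FALSE`, the matrix is probably stored with features in rows and may need `t()` before the batch labels are matched to samples.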
References
Reese, S. E., Archer, K. J., Therneau, T. M., Atkinson, E. J., Vachon, C. M., de Andrade, M., Kocher, J. A., and Eckel-Passow, J. E. A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal components analysis. Bioinformatics, (in review).