
KODAMA (version 1.6)

lymphoma: Lymphoma Gene Expression Dataset

Description

This dataset contains gene expression profiles of the three most prevalent adult lymphoid malignancies: diffuse large B-cell lymphoma (DLBCL), follicular lymphoma (FL), and B-cell chronic lymphocytic leukemia (B-CLL). It comprises mRNA expression values for 4,682 genes measured in 62 samples (42 DLBCL, 9 FL, and 11 B-CLL). Missing values were imputed and the data were standardized as described in Dudoit et al. (2002).

Usage

data(lymphoma)

Value

A list with the following elements:

data

Gene expression data. A matrix with 62 rows and 4,682 columns.

class

Class index. A vector with 62 elements.
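
The structure described above can be checked directly after loading the data. The following is a minimal sketch, assuming the KODAMA package is installed; the variable names `expr_mat` and `sample_class` are illustrative and not part of the package.

```r
# Load the dataset from the installed KODAMA package without attaching it.
data(lymphoma, package = "KODAMA")

expr_mat     <- lymphoma$data    # documented as a 62 x 4,682 matrix (samples x genes)
sample_class <- lymphoma$class   # documented as a vector with 62 elements

dim(expr_mat)          # number of rows (samples) and columns (genes)
length(sample_class)   # one class index per sample
table(sample_class)    # number of samples in each class

# The class vector should line up one-to-one with the rows of the matrix.
stopifnot(nrow(expr_mat) == length(sample_class))
```

Because samples are stored in rows, `lymphoma$data` can be passed as-is to functions that expect observations in rows, such as `prcomp()`.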

References

Cacciatore S, Luchinat C, Tenori L. Knowledge discovery by accuracy maximization. Proc Natl Acad Sci U S A 2014;111(14):5117-22. doi: 10.1073/pnas.1220873111.

Cacciatore S, Tenori L, Luchinat C, Bennett PR, MacIntyre DA. KODAMA: an updated R package for knowledge discovery and data mining. Bioinformatics 2017;33(4):621-623. doi: 10.1093/bioinformatics/btw705.

Alizadeh AA, Eisen MB, Davis RE, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000;403(6769):503-511.

Dudoit S, Fridlyand J, Speed TP. Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc 2002;97(417):77-87.

Examples

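# NOT RUN {
library(KODAMA)
data(lymphoma)

# Offset the class index by one to obtain the fill colour of each sample
class <- 1 + as.numeric(lymphoma$class)

# Principal component analysis: plot the samples on the first two components
cc <- prcomp(lymphoma$data)$x
plot(cc, pch = 21, bg = class, xlab = "First Component", ylab = "Second Component")

# Run KODAMA on the same data and plot the coordinates stored in kk$pp
kk <- KODAMA(lymphoma$data)
plot(kk$pp, pch = 21, bg = class, xlab = "First Component", ylab = "Second Component")

# }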
