Mercator (version 1.1.6)

mercator-data: CML Cytogenetic Data

Description

These data sets contain binary versions of subsets of cytogenetic karyotype data from patients with chronic myelogenous leukemia (CML).

Usage

data("lgfFeatures")
data("CML500")
data("CML1000")
data("fakedata") # includes "fakeclin"

Format

lgfFeatures

A data matrix with 2748 rows and 6 columns listing the cytogenetic bands produced as output of the CytoGPS algorithm, which converts text-based karyotypes into a binary Loss-Gain-Fusion (LGF) model. The columns are Label (the Type and Band, joined by an underscore), Type (Loss, Gain, or Fusion), Band (standard name of the cytogenetic band), Chr (chromosome), Arm (the chromosome arm, of the form #p or #q), and Index (an integer that can be used for sorting or indexing).
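The column layout described above can be illustrated with a small toy table. This is only a sketch: the three rows and their band names are made-up examples chosen for illustration, not values copied from lgfFeatures, and the storage types of the real columns may differ.

```r
# Toy rows mimicking the six documented columns of lgfFeatures.
# All values here are illustrative assumptions, not real package data.
toy <- data.frame(
  Type = c("Loss", "Gain", "Fusion"),
  Band = c("1p36", "1p36", "9q34"),
  Chr  = c("1", "1", "9"),
  Arm  = c("1p", "1p", "9q"),
  stringsAsFactors = FALSE
)

# Label is the Type and Band joined by an underscore.
toy$Label <- paste(toy$Type, toy$Band, sep = "_")

# Index is an integer usable for sorting or indexing.
toy$Index <- seq_len(nrow(toy))

# Arrange the columns in the documented order.
toy <- toy[, c("Label", "Type", "Band", "Chr", "Arm", "Index")]

# Sorting by Index restores the original feature order after any shuffling.
shuffled <- toy[c(3, 1, 2), ]
restored <- shuffled[order(shuffled$Index), ]
print(restored)
```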

CML500

A BinaryMatrix object with 770 rows (subset of LGF features) and 511 columns (patients). The patients were selected using the downsample function from the full set of more than 3000 CML karyotypes. The rows were selected by removing redundant and non-informative features when considering the full data set.

CML1000

A BinaryMatrix object with 770 rows (subset of LGF features) and 1057 columns (patients). The patients were selected using the downsample function from the full set of more than 3000 CML karyotypes. The rows were selected by removing redundant and non-informative features when considering the full data set.
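One way to check the two CML objects against the dimensions stated above is sketched below. It assumes the Mercator package is installed and that a dim method is available for BinaryMatrix objects; if either assumption fails, the comparison is skipped or prints nothing useful, so treat this as a sanity check rather than a guaranteed recipe.

```r
# Dimensions as stated in this Format section (features x patients).
documented <- list(CML500 = c(770, 511), CML1000 = c(770, 1057))

# Only attempt the comparison when Mercator is available.
if (requireNamespace("Mercator", quietly = TRUE)) {
  library(Mercator)  # attaches the BinaryMatrix class and its methods
  for (name in names(documented)) {
    data(list = name, package = "Mercator")  # loads the object by name
    observed <- dim(get(name))               # assumes dim() works on BinaryMatrix
    cat(name, "observed:", observed,
        "documented:", documented[[name]], "\n")
  }
}
```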

fakedata

A matrix with 776 rows ("features") and 300 columns ("samples") containing synthetic continuous data.

fakeclin

A data frame with 300 rows ("samples") and 4 columns of synthetic clinical data related to the fakedata.
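Because fakedata stores samples as columns and fakeclin stores them as rows, the two usually need to be aligned by sample before a joint analysis. The sketch below shows that pattern on a small stand-in pair of objects. The sizes, the sample names, and the Group and Age columns are hypothetical; the real column names of fakeclin are not listed here, and it is an assumption that its row names match the column names of fakedata.

```r
set.seed(1)

# Stand-in for fakedata: 5 features (rows) by 4 samples (columns).
toy_data <- matrix(
  rnorm(5 * 4), nrow = 5,
  dimnames = list(paste0("F", 1:5), paste0("S", 1:4))
)

# Stand-in for fakeclin: one row per sample; column names are hypothetical.
toy_clin <- data.frame(
  Group = c("A", "B", "A", "B"),
  Age   = c(50, 61, 47, 58),
  row.names = colnames(toy_data)
)

# Align clinical rows to the matrix columns and confirm they match.
toy_clin <- toy_clin[colnames(toy_data), , drop = FALSE]
stopifnot(identical(rownames(toy_clin), colnames(toy_data)))

# Per-feature mean within each clinical group (features x groups).
group_means <- sapply(
  split(seq_len(ncol(toy_data)), toy_clin$Group),
  function(idx) rowMeans(toy_data[, idx, drop = FALSE])
)
print(dim(group_means))
```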

Author

Kevin R. Coombes <krc@silicovore.com>, Caitlin E. Coombes

References

Abrams ZB, Zhang L, Abruzzo LV, Heerema NA, Li S, Dillon T, Rodriguez R, Coombes KR, Payne PRO. CytoGPS: A Web-Enabled Karyotype Analysis Tool for Cytogenetics. Bioinformatics. 2019 Jul 2. pii: btz520. doi: 10.1093/bioinformatics/btz520. [Epub ahead of print]