A data set with 150 DNA sequences. Each string is
a nucleotide sequence that corresponds to
the promoter region of a gene on human chromosome 22
(according to the human genome assembly hg18). The sequences
start 999 bases upstream of the transcription start site (TSS)
and end with the TSS itself.
The names attribute contains the RefSeq IDs of the genes.
In previous versions of the apcluster package, this data set was an
R object that could be loaded via data(ch22Promoters). For
better compatibility with the kebabs package, the data set
has been moved to a plain text file (in FASTA format)
that can be loaded from inst/examples
(see examples below).