The yeast dataset consists of 1484 yeast protein sequences, each classified into one of 10 classes. It has 8 candidate predictor variables and one response variable, the class. The sequences are from the SWISS-PROT database. The class counts are 463 CYT, 429 NUC, 244 MIT, 163 ME3, 51 ME2, 44 ME1, 35 EXC, 30 VAC, 20 POX and 5 ERL.
data("yeast")
A data.frame containing 1484 yeast sequences.
Horton, Paul and Nakai, Kenta (1996) A probabilistic classification system for predicting the cellular localization sites of proteins. ISMB, volume 4, pages 109--115.
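
As a quick sanity check on the class counts given in the description, the following minimal sketch rebuilds them as a named vector and confirms that they sum to 1484. The vector name `class_counts` and the placeholder `class_col` are illustrative assumptions. This page does not state the name of the class column, so the lines that touch the loaded data are left commented out.

```r
# Class counts as stated in the description of this dataset.
# `class_counts` is an illustrative name, not an object shipped with the data.
class_counts <- c(CYT = 463, NUC = 429, MIT = 244, ME3 = 163, ME2 = 51,
                  ME1 = 44, EXC = 35, VAC = 30, POX = 20, ERL = 5)

# The counts should add up to the number of sequences.
stopifnot(sum(class_counts) == 1484)

# Share of each class in percent, which makes the class imbalance visible.
round(100 * class_counts / sum(class_counts), 1)

# With the package that ships this dataset attached, the same counts could be
# compared against the data itself. `class_col` is a placeholder (assumption)
# for the name of the class column, which is not given on this page.
# data("yeast")
# nrow(yeast)                  # 1484 rows according to the format section
# table(yeast[[class_col]])
```

The smallest class (ERL, 5 sequences) is roughly 90 times smaller than the largest (CYT, 463 sequences), so any resampling or cross-validation on this data may need to be stratified by class.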