The well-known wine dataset comprises the results of chemical analyses of 178 wines produced in the same region of Italy from three grape varieties (Barolo, Grignolino, and Barbera). The dataset was originally introduced by Forina et al. (1984) and later described in detail by Forina et al. (1986).
A data frame with 178 observations, 13 numeric covariates and one binary target variable.
For each sample, 13 continuous chemical constituents were measured, which serve as
covariates for distinguishing between the grape varieties. For the analyses in this
package, a version of the dataset with a binary outcome is provided that differentiates
between Grignolino ("G") and the two other varieties ("Other"; Barolo and Barbera).
This version is available on OpenML under data ID 973.
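The binary coding described above can be illustrated with a minimal sketch. It assumes a three-level cultivar factor is the starting point; the vector `cultivar` and its five labels are hypothetical toy values, not data from the package, and the recoding actually used to build the OpenML version may differ.

```r
# Hypothetical three-class labels (toy values; the real data has 178 samples).
cultivar <- factor(c("Barolo", "Grignolino", "Barbera", "Grignolino", "Barolo"))

# Collapse to the binary coding described above: "G" vs "Other".
C <- factor(ifelse(cultivar == "Grignolino", "G", "Other"),
            levels = c("G", "Other"))

# Cross-tabulate the original and the recoded labels.
table(cultivar, C)
```

The cross-tabulation makes it easy to confirm that Barolo and Barbera both fall into "Other".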
The variables are as follows:
Alc. numeric. Alcohol.
Mal. numeric. Malic acid.
Ash. numeric. Ash.
AlcAsh. numeric. Alkalinity of ash.
Mg. numeric. Magnesium.
TP. numeric. Total phenols.
Fla. numeric. Flavonoids.
NFP. numeric. Nonflavonoid phenols.
ProAn. numeric. Proanthocyanins.
Col. numeric. Color intensity.
Hue. numeric. Hue.
WAI. numeric. OD280/OD315 of diluted wines (wine absorbance index).
Prol. numeric. Proline.
C. factor. Cultivar. Binary target variable: "G" vs "Other".
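As a rough sketch, the documented layout can be checked programmatically. The helper name `check_wine_layout` is hypothetical and not part of the package; the column names and factor levels are taken from the variable list above.

```r
# Covariate names as given in the variable list above.
covariates <- c("Alc", "Mal", "Ash", "AlcAsh", "Mg", "TP", "Fla",
                "NFP", "ProAn", "Col", "Hue", "WAI", "Prol")

# Hypothetical helper: returns TRUE if `df` matches the documented layout,
# i.e. 13 numeric covariates plus a factor `C` with levels "G" and "Other".
check_wine_layout <- function(df) {
  all(c(covariates, "C") %in% names(df)) &&
    all(vapply(df[covariates], is.numeric, logical(1))) &&
    is.factor(df$C) &&
    setequal(levels(df$C), c("G", "Other"))
}
```

After loading the data with `data(wine)`, calling `check_wine_layout(wine)` should return `TRUE` if the data frame matches this description.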
Forina, M. (1984). PARVUS, TrAC Trends in Analytical Chemistry, 3(2):38–39, doi:10.1016/0165-9936(84)87050-8.
Forina, M., Armanino, C., Castino, M., Ubigli, M. (1986). Multivariate data analysis as a discriminating method of the origin of wines, Vitis, 25:189–201, doi:10.5073/vitis.1986.25.189-201.
Vanschoren, J., van Rijn, J. N., Bischl, B., Torgo, L. (2013). OpenML: networked science in machine learning, SIGKDD Explorations, 15(2):49–60, doi:10.1145/2641190.2641198.
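data(wine)      # load the dataset (the package must be attached first)
table(wine$C)   # class counts for the binary target: "G" vs "Other"
dim(wine)       # per the description: 178 rows, 14 columns
head(wine)      # first six observations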