"The dataset was collected at 'Hospital Universitario de Caracas' in Caracas, Venezuela. The dataset comprises demographic information, habits, and historic medical records of 858 patients. Several patients decided not to answer some of the questions because of privacy concerns (missing values)." I cleaned the data so that no missing values (NAs) remain.
Age
Number of reported sexual partners
Age at first sexual intercourse
Reported number of pregnancies
Whether the subject smokes
The number of years the subject reported smoking
The number of packs of cigarettes the subject reports smoking each year
If the subject is using hormonal contraceptives
Number of years the subject reports using hormonal contraceptives
Does the subject use an IUD?
Number of years the subject reports using an IUD
Does the patient have STDs?
Number of STDs
Does the patient have condylomatosis?
Does the patient have cervical condylomatosis?
Does the patient have vaginal condylomatosis?
Does the patient have vulvo-perineal condylomatosis?
Does the patient have syphilis?
Does the patient have pelvic inflammatory disease?
Does the patient have genital herpes?
Does the patient have molluscum contagiosum?
Does the patient have AIDS?
Does the patient have hepatitis B?
Number of diagnoses of STDs
Does the patient have a diagnosis of cancer?
Does the patient have a diagnosis of CIN?
Does the patient have a diagnosis of HPV?
What is the patient's diagnosis?
Hinselmann
Schiller
Citology
The target column, 1 = yes, 0 = no
Cervical_cancer
An object of class data.frame with 858 rows and 34 columns.