This is a synthetically generated dataset containing metadata
for healthy individuals and patients diagnosed with colorectal cancer or
adenomas. The dataset is intended for use in matching, where the goal is
to balance the status groups across the covariates and achieve
optimal matching quality.
data(cancer)

A data frame (cancer) with 1,224 rows and 5 columns:
Patient's health status, which can be one of the following: healthy, adenoma, crc_benign (benign colorectal carcinoma), or crc_malignant (malignant colorectal carcinoma).
Patient's biological sex, recorded as either M (male) or F (female).
Patient's age, represented as a continuous numeric variable.
Patient's Body Mass Index (BMI), represented as a continuous numeric variable.
Smoking status of the patient, recorded as yes or no.
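
As a rough illustration of what "balancing the status groups across covariates" means in practice, the sketch below builds a small toy data frame with the same kind of columns as described above and summarizes each covariate per status group. The column names (status, sex, age, bmi, smoker) and all simulated values are assumptions made for this example only; they are not taken from the package, and with the real data you would load it via data(cancer) and substitute its actual column names.

```r
# Toy stand-in for the dataset. Column names and values are hypothetical.
set.seed(1)
n <- 200
toy <- data.frame(
  status = factor(sample(c("healthy", "adenoma", "crc_benign", "crc_malignant"),
                         n, replace = TRUE)),
  sex    = factor(sample(c("M", "F"), n, replace = TRUE)),
  age    = runif(n, min = 30, max = 80),
  bmi    = rnorm(n, mean = 26, sd = 4),
  smoker = factor(sample(c("yes", "no"), n, replace = TRUE))
)

# Number of individuals in each status group
print(table(toy$status))

# Mean of the continuous covariates within each status group;
# large differences between rows suggest imbalance
balance_means <- aggregate(cbind(age, bmi) ~ status, data = toy, FUN = mean)
print(balance_means)

# Row-wise proportions of the categorical covariates per status group
sex_props    <- prop.table(table(toy$status, toy$sex), margin = 1)
smoker_props <- prop.table(table(toy$status, toy$smoker), margin = 1)
print(sex_props)
print(smoker_props)
```

Comparing these per-group summaries before and after matching is one simple way to judge whether the groups have become more similar; it uses base R only and is not a substitute for whatever balance diagnostics the package itself provides.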