A synthetically generated dataset containing metadata for healthy
individuals and patients diagnosed with colorectal cancer or adenomas. In the
context of matching, it is intended as example data in which the status groups
are balanced across the covariates to achieve good matching quality.
data(cancer)
A data frame (cancer) with 1,224 rows and 5 columns:
Patient's health status, which can be one of the following: healthy, adenoma,
crc_benign (benign colorectal carcinoma), or crc_malignant (malignant
colorectal carcinoma).
Patient's biological sex, recorded as either M (male) or F (female).
Patient's age, represented as a continuous numeric variable.
Patient's Body Mass Index (BMI), represented as a continuous numeric variable.
Smoking status of the patient, recorded as yes or no.
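As a rough illustration of what checking balance across the status groups can look like before matching, the sketch below summarizes covariates per group using base R only. It runs on a small simulated stand-in rather than on cancer itself, so it works without the package installed. Only the status column name comes from this page; the names sex, age, bmi and smoker are assumptions made for the example, and the simulated values carry no meaning.

```r
# Toy stand-in for the documented data frame. Only `status` is a documented
# column name; `sex`, `age`, `bmi` and `smoker` are hypothetical names, and
# all values are simulated for illustration.
set.seed(1)
n <- 200
status_levels <- c("healthy", "adenoma", "crc_benign", "crc_malignant")
toy <- data.frame(
  status = factor(sample(status_levels, n, replace = TRUE),
                  levels = status_levels),
  sex    = factor(sample(c("M", "F"), n, replace = TRUE)),
  age    = round(runif(n, min = 30, max = 80)),
  bmi    = round(rnorm(n, mean = 26, sd = 4), 1),
  smoker = factor(sample(c("yes", "no"), n, replace = TRUE))
)

# Number of observations in each status group
group_sizes <- table(toy$status)

# Mean of each numeric covariate within each status group
numeric_means <- aggregate(cbind(age, bmi) ~ status, data = toy, FUN = mean)

# Share of each sex within each status group (each row sums to 1)
sex_props <- prop.table(table(toy$status, toy$sex), margin = 1)

print(group_sizes)
print(numeric_means)
print(round(sex_props, 2))
```

Large differences between groups in these summaries would suggest imbalance that matching is meant to reduce. With the package loaded, the same calls could be applied to cancer after substituting its actual column names.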