EpidigiR (version 0.1.2)

ml_data: Machine Learning Data for Disease Risk Prediction

Description

A simulated dataset of patient records for disease risk prediction. It is suitable for demonstrating logistic regression, clustering, Random Forest, and SVM models.

Usage

ml_data

Format

A data frame with 100 rows and 5 columns:

outcome

Numeric, binary disease status (0 = healthy, 1 = diseased).

age

Numeric, patient age (years).

exposure

Numeric, exposure level (0 to 1, e.g., environmental risk).

genetic_risk

Numeric, genetic risk score (0 to 1).

region

Character, region name (e.g., North, South, East, West).

Examples

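library(EpidigiR)
data("ml_data")
# Treat the binary outcome as a factor so the models run as classifiers
ml_data$outcome <- as.factor(ml_data$outcome)
# Logistic regression on the three numeric predictors
epi_model(ml_data, formula = outcome ~ age + exposure + genetic_risk, type = "logistic")
# Random Forest with the same formula
epi_model(ml_data, formula = outcome ~ age + exposure + genetic_risk, type = "rf")
# Scatter plot of outcome against age
epi_visualize(ml_data, x = "age", y = "outcome", type = "scatter")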
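
For readers who want to see what a logistic fit on data of this shape looks like without the package, the following is a minimal base-R sketch. It builds a hypothetical stand-in data frame, `toy`, with the same five columns as `ml_data`. The values and the coefficients used to generate `outcome` are assumptions made up for illustration, not the package's data. It then fits the same formula with `glm()`. This is not a description of how `epi_model()` works internally.

```r
# Hypothetical stand-in with the same columns as ml_data.
# Values are simulated here; they are NOT the package's data.
set.seed(42)
n <- 100
toy <- data.frame(
  age          = round(runif(n, 20, 80)),   # patient age in years
  exposure     = runif(n),                  # exposure level, 0 to 1
  genetic_risk = runif(n),                  # genetic risk score, 0 to 1
  region       = sample(c("North", "South", "East", "West"), n, replace = TRUE),
  stringsAsFactors = FALSE
)

# Assumed risk model, used only to generate a plausible binary outcome.
p <- plogis(-4 + 0.04 * toy$age + 1.5 * toy$exposure + 1.5 * toy$genetic_risk)
toy$outcome <- rbinom(n, size = 1, prob = p)

# Base-R logistic regression with the same formula as the example above.
fit <- glm(outcome ~ age + exposure + genetic_risk, data = toy, family = binomial)
print(summary(fit)$coefficients)

# Predicted disease probability for each simulated patient.
toy$risk <- predict(fit, type = "response")
print(head(toy[, c("age", "exposure", "genetic_risk", "outcome", "risk")]))
```

Replacing `toy` with `ml_data` should work as long as `outcome` is numeric 0/1 or a two-level factor, both of which `glm()` accepts with `family = binomial`.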
