The mortgage dataset contains 850 records and 8 variables.
The target variable is risk, a factor with two levels, "low" and "high".
The remaining seven variables serve as predictors.
The dataset was simulated to represent a realistic mortgage application setting.
data(mortgage)
A data frame with 850 rows (applicants) and 8 variables:
age: Age in years.
income: Annual income.
savings: Total savings.
employment_status: A factor with levels "permanent", "temporary", "self_employed", and "unemployed".
credit_history: A factor with levels "poor", "average", and "good".
debt_level: A factor with levels "low", "medium", and "high".
loan_amount: Requested loan amount.
risk: A factor with levels "low" and "high".
The dataset was generated using a hybrid latent simulation approach. Continuous variables were simulated with dependence, and categorical variables were derived from latent scores to create realistic relationships among applicant characteristics, financial indicators, and mortgage risk.
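The general idea of a latent simulation can be illustrated with a minimal base R sketch. This is not the code used to generate mortgage: the correlation matrix, coefficients, cut points, and the object name sim are all assumptions chosen for illustration, and only a subset of the variables is simulated.

```r
# Illustrative sketch only; every numeric setting below is an assumption,
# not the generator used for the mortgage dataset.
set.seed(42)
n <- 850

# Step 1: dependent continuous variables from correlated standard normals.
# An assumed correlation matrix is imposed through its Cholesky factor.
R <- matrix(c(1.0, 0.5, 0.4,
              0.5, 1.0, 0.6,
              0.4, 0.6, 1.0), nrow = 3, byrow = TRUE)
z <- matrix(rnorm(n * 3), ncol = 3) %*% chol(R)

age     <- round(pmin(pmax(40 + 10 * z[, 1], 21), 75))  # clipped to 21..75
income  <- exp(10.8 + 0.4 * z[, 2])                     # right-skewed, positive
savings <- exp(9.5 + 0.8 * z[, 3])                      # right-skewed, positive

# Step 2: a categorical variable derived from a latent score.
credit_score <- 0.6 * z[, 2] + 0.4 * z[, 3] + rnorm(n, sd = 0.7)
credit_history <- cut(credit_score,
                      breaks = quantile(credit_score, c(0, 0.25, 0.70, 1)),
                      labels = c("poor", "average", "good"),
                      include.lowest = TRUE)

# Step 3: the target derived from a latent risk score that depends on
# both the continuous and the categorical variables.
risk_score <- -0.8 * z[, 2] - 0.5 * z[, 3] +
  ifelse(credit_history == "poor", 1, 0) + rnorm(n)
risk <- factor(ifelse(risk_score > quantile(risk_score, 0.70), "high", "low"),
               levels = c("low", "high"))

sim <- data.frame(age, income, savings, credit_history, risk)
str(sim)
```

Thresholding latent scores at quantiles fixes the class proportions in advance, while the shared normal draws in z carry the dependence between the financial variables and the derived factors.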
Reza Mohammadi (2025). Data Science Foundations and Machine Learning with R: From Data to Decisions. https://book-data-science-r.netlify.app.
bank, churn_mlc, churn, churn_tel, adult, cereal, advertising, marketing, drug, house, house_price, red_wines, white_wines, insurance, caravan, loan