cheem (version 0.4.2)

amesHousing2018: Ames housing data 2018

Description

House sales prices from Ames, Iowa, USA between 2006 and 2010. Only complete numeric observations remain.

Usage

amesHousing2018

amesHousing2018_raw

amesHousing2018_NorthAmes

Format

amesHousing2018: a complete data.frame with 2291 rows and 18 numeric variables, SalePrice (the response variable), and 3 class variables.

amesHousing2018_raw: an object of class data.frame with 2930 rows and 82 columns.

amesHousing2018_NorthAmes: an object of class data.frame with 338 rows and 11 columns.

Details

amesHousing2018

Complete data.frame, n = 2291, 18 numeric variables (including 2 temporal: MoSold, YrSold), response variable SalePrice, 3 class factors.

amesHousing2018_NorthAmes

A simplified subsample, just North Ames (largest neighborhood). Complete data.frame, n = 338, 9 numeric variables, response variable SalePrice, 1 class factor SubclassMS, a zoning subclass.

amesHousing2018_raw

Original data from Kaggle, 2930 rows of 82 variables. To produce amesHousing2018, sparse rows (639) and sparse/defaulted columns (64) are removed.
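
The row counts are consistent with each other: 2930 - 639 = 2291. The exact cleaning rules are not stated here, so the following is only a minimal sketch of one way to reduce a raw table to complete numeric observations. The toy data frame, its values, and the 50% missingness threshold are all assumptions for illustration, and unlike amesHousing2018 the sketch keeps no class factors.

```r
# Toy stand-in for the raw data; names and values are hypothetical.
raw <- data.frame(
  LotArea   = c(9600, 11250, NA, 9550),
  SalePrice = c(208500, 181500, 223500, 140000),
  PoolQC    = c(NA, NA, NA, "Gd"),            # mostly missing: a "sparse" column
  Street    = c("Pave", "Pave", "Pave", "Pave"),
  stringsAsFactors = FALSE
)

# 1. Drop columns where more than half of the values are missing
#    (the 50% threshold is an assumption, not the package's rule).
frac_missing <- colMeans(is.na(raw))
kept_cols    <- raw[, frac_missing <= 0.5, drop = FALSE]

# 2. Keep numeric columns only.
is_num       <- vapply(kept_cols, is.numeric, logical(1))
numeric_cols <- kept_cols[, is_num, drop = FALSE]

# 3. Keep rows with no missing values.
clean <- numeric_cols[complete.cases(numeric_cols), , drop = FALSE]

dim(clean)
```

Comparing `dim(raw)` with `dim(clean)` shows how many rows and columns each step removed.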

No data dictionary is provided on Kaggle, but amesHousing2018 variables are inferred to be:

  • LotFrontage, Length of the front (street facing) side of the lot in yards (0.914m)

  • LotArea, Area of the lot in square yards (0.836m^2)

  • OverallQual, Overall quality (of the house?), integer from 1 to 10

  • OverallCond, Overall condition (of the lot?), integer from 1 to 10

  • YearBuild, The year the house was originally built

  • BsmtUnfArea, Unfinished basement area, in square yards (0.836m^2)

  • TotBsmtArea, Total basement area, in square yards (0.836m^2)

  • 1stFlrArea, First (ground) floor living area in square yards (0.836m^2)

  • LivingArea, Total living area in square yards (0.836m^2)

  • Bathrms, The number of bathrooms

  • Bedrms, The number of bedrooms

  • TotRms, The total number of rooms

  • GarageYrBlt, The year the garage was built

  • GarageCars, The number of car spaces in the garage

  • GarageArea, The area of the garage in square yards (0.836m^2)

  • MoSold, The month of the house sale, as a number

  • YrSold, The year of the house sale

  • SalePrice, The sale price of the house in USD (as of the year of sale?)

  • SubclassMS, Factor subclass of construction zone, 16 levels

  • ZoneMS, Factor major class of construction zone, 7 levels

  • Neighborhd, Factor neighborhood of Ames, IA, 28 levels
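
Taking the inferred units above at face value, lengths and areas can be converted to metric with the factors given in the list. This is a small sketch only; the two example houses and the new column names are hypothetical and not taken from the data.

```r
# Conversion factors as stated in the variable list above.
M_PER_YD   <- 0.914   # metres per yard
M2_PER_YD2 <- 0.836   # square metres per square yard

# Hypothetical values for two houses, not taken from the data.
houses <- data.frame(LotFrontage = c(60, 80), LotArea = c(8000, 10000))

# Add metric versions alongside the original columns.
houses$LotFrontage_m <- houses$LotFrontage * M_PER_YD
houses$LotArea_m2    <- houses$LotArea * M2_PER_YD2

houses
```

The same two multiplications would apply to the basement, floor, living, and garage area columns if the inferred units hold.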

Examples

library(cheem)

## Regression setup:
dat  <- amesHousing2018_NorthAmes
X    <- dat[, 1:9]
Y    <- dat$SalePrice
clas <- dat$SubclassMS

## Cheem list
ames_rf_chm <- cheem_ls(X, Y, ames_rf_shap, ames_rf_pred, clas,
                        label = "North Ames, RF, SHAP")
## Cheem visuals
if(interactive()){
  prim <- 1
  comp <- 2
  global_view(ames_rf_chm, primary_obs = prim, comparison_obs = comp)
  bas <- sug_basis(ames_rf_chm, prim, comp)
  mv  <- sug_manip_var(ames_rf_chm, primary_obs = prim, comp)
  ggt <- radial_cheem_tour(ames_rf_chm, basis = bas, manip_var = mv)
  animate_plotly(ggt)
}

## Save for use with shiny app (expects an rds file)
if(FALSE){ ## Don't accidentally save.
  saveRDS(ames_rf_chm, "./chm_NAmes_rf_tshap.rds")
  run_app() ## Select the saved rds file from the data drop down.
}
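
The save step above relies on base R's saveRDS(). As a hedged illustration of the round trip, the sketch below writes a small stand-in object to a temporary file and reads it back; the list contents are hypothetical and do not reflect the structure of a real cheem list.

```r
# A small list standing in for a cheem list such as `ames_rf_chm`
# (hypothetical contents, for illustration only).
obj <- list(label = "North Ames, RF, SHAP", n = 338L)

# Write to a temporary file so nothing lands in the working directory.
path <- tempfile(fileext = ".rds")
saveRDS(obj, path)

# Read the file back and check that the object is unchanged.
restored <- readRDS(path)
identical(obj, restored)

# Remove the temporary file.
unlink(path)
```

Writing to tempfile() first is a cheap way to confirm an object serializes cleanly before saving it to a permanent location.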
