rpart.plot (version 2.0.1)

ptitanic: Titanic data with passenger names and other details removed.

Description

Titanic data with passenger names and other details removed.

Format

A data frame with 1309 observations on 6 variables.
pclass
passenger class, unordered factor: 1st 2nd 3rd
survived
factor: died or survived
sex
unordered factor: male female
age
age in years, min 0.167 max 80.0
sibsp
number of siblings or spouses aboard, integer: 0...8
parch
number of parents or children aboard, integer: 0...6

Source

The dataset was compiled by Frank Harrell and Robert Dawson: http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/titanic.html. See also: http://biostat.mc.vanderbilt.edu/twiki/pub/Main/DataSets/titanic3info.txt.

For this version of the Titanic data, passenger details were deleted, survived was cast as a factor, and the name was changed to ptitanic to minimize confusion with other versions. In this data the crew are conspicuous by their absence.

Contents of ptitanic:
         pclass survived    sex    age sibsp parch
    1       1st survived female 29.000     0     0
    2       1st survived   male  0.917     1     2
    3       1st     died female  2.000     1     2
    4       1st     died   male 30.000     1     2
    5       1st     died female 25.000     1     2
    ...
    1309    3rd     died   male 29.000     0     0
    
How ptitanic was built:
    load("titanic3.sav") # from Dr. Harrell's web site
    # discard name, ticket, fare, cabin, embarked, boat, body, home.dest
    ptitanic <- titanic3[,c(1,2,4,5,6,7)]
    # change survived from integer to factor
    ptitanic$survived <- factor(ptitanic$survived, labels=c("died", "survived"))
    save(ptitanic, file="ptitanic.rda")
This version of the data differs from etitanic in the earth package in that here survived is a factor (not an integer) and age has some NAs.
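
As a rough illustration of the two differences named above, the sketch below drops rows with missing values and maps survived back to an integer. It runs on a small stand-in data frame (the name toy is hypothetical) so that it does not depend on any package: the first five rows are copied from the listing above, and the sixth row is invented only to supply a missing age. It is not how etitanic was actually built, and etitanic may differ from ptitanic in other ways.

```r
# Toy stand-in with the same columns as ptitanic.
# Rows 1-5 are copied from the listing above; row 6 is hypothetical
# and exists only so that there is an NA to remove.
toy <- data.frame(
    pclass   = factor(c("1st", "1st", "1st", "1st", "1st", "3rd"),
                      levels = c("1st", "2nd", "3rd")),
    survived = factor(c("survived", "survived", "died", "died", "died", "died"),
                      levels = c("died", "survived")),
    sex      = factor(c("female", "male", "female", "male", "female", "male")),
    age      = c(29, 0.917, 2, 30, 25, NA),
    sibsp    = c(0L, 1L, 1L, 1L, 1L, 0L),
    parch    = c(0L, 2L, 2L, 2L, 2L, 0L)
)

# Keep only the rows without any missing values.
complete <- toy[complete.cases(toy), ]

# Map the factor back to integers: died -> 0, survived -> 1.
complete$survived <- as.integer(complete$survived == "survived")
```

With the real data, the same two steps would be applied to ptitanic in place of toy.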

Examples

library(rpart.plot) # also attaches rpart, which provides rpart()
data(ptitanic)
summary(ptitanic)
# main indicator of missing data is 3rd class esp. with many children
obs.with.nas <- rowSums(is.na(ptitanic)) > 0
prp(rpart(obs.with.nas~., data=ptitanic, method="class"),
    main="observations with missing data", extra=7)