party (version 0.9-9999)

readingSkills: Reading Skills

Description

A toy data set illustrating the spurious correlation between reading skills and shoe size in school-children.

Usage

data("readingSkills")

Details

In this artificial data set, which was generated by means of a linear model, age and nativeSpeaker are the actual predictors of score, while the spurious correlation between score and shoeSize arises only because both depend on age.

The true predictors can be identified, e.g., by means of partial correlations, standardized beta coefficients in linear models, or the conditional random forest variable importance, but not by means of the standard random forest variable importance (see Examples).
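
The first two approaches can be sketched in base R. The following is only an illustration under stated assumptions: `toy` is a hypothetical stand-in simulated from a linear model, its coefficients and sample size are made up rather than taken from readingSkills, and `partial_cor` is a helper defined here, not a function of party.

```r
# Hypothetical stand-in data. The coefficients and sample size are
# assumptions, not the values used to generate readingSkills.
set.seed(1)
n <- 200
age <- sample(5:11, n, replace = TRUE)
nativeSpeaker <- rbinom(n, 1, 0.5)   # coded 0/1 here for simplicity
shoeSize <- 15 + 2 * age + rnorm(n, sd = 1.5)                     # depends on age only
score <- 20 + 3 * age + 4 * nativeSpeaker + rnorm(n, sd = 2)
toy <- data.frame(nativeSpeaker, age, shoeSize, score)

# Partial correlation of x and y given z: correlate what is left of
# x and y after regressing each on z (illustrative helper).
partial_cor <- function(x, y, z) {
  cor(resid(lm(x ~ z)), resid(lm(y ~ z)))
}

# Marginal correlation is large because both variables depend on age ...
marginal <- cor(toy$score, toy$shoeSize)
# ... but it should shrink towards zero once age is held fixed.
partial <- partial_cor(toy$score, toy$shoeSize, toy$age)

# Standardized beta coefficients: fit the linear model on z-scored
# variables and drop the intercept.
std_beta <- coef(lm(scale(score) ~ scale(age) + scale(nativeSpeaker) +
                      scale(shoeSize), data = toy))[-1]

print(c(marginal = marginal, partial = partial))
print(std_beta)
```

To try the same steps on readingSkills itself, one would presumably first convert the nativeSpeaker factor to a numeric indicator, since `scale()` requires numeric input.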

Examples

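library(party)
data("readingSkills")

set.seed(290875)
readingSkills.cf <- cforest(score ~ ., data = readingSkills,
    control = cforest_unbiased(mtry = 2, ntree = 50))

# standard (unconditional) permutation importance
varimp(readingSkills.cf)

# conditional permutation importance
varimp(readingSkills.cf, conditional = TRUE)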
