regclass (version 1.5)

visualize_relationship: Visualizing the relationship between y and x in a partition model

Description

Attempts to show how the relationship between y and x is being modeled by a partition or random forest model.

Usage

visualize_relationship(TREE,interest,on,smooth=TRUE,marginal=TRUE,nplots=5, seed=NA,pos="topright",...)

Arguments

TREE
A partition or random forest model (though it works with many regression models as well)
interest
The name of the predictor variable for which the plot of y vs. x is to be made.
on
A dataframe giving the values of the other predictor variables for which the relationship is to be visualized. Typically this is the dataframe on which the partition model was built.
smooth
If TRUE, the modeled relationship is drawn as a loess-smoothed curve.
marginal
If TRUE, the modeled value of y at a particular value of x is the average of the predicted values of y over all rows which have that common value of x. If FALSE, then nplots rows from on will be selected and all other predictors will be fixed, showing the relationship between y and x for that particular set of characteristics.
nplots
The number of rows of on for which the relationship is plotted (if marginal is set to FALSE)
seed
The seed for the random number generator, if reproducibility is required.
pos
The location of the legend.
...
Additional arguments passed to plot, namely xlim and ylim.
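
The averaging described for marginal=TRUE can be sketched in base R. This is only an illustration of the idea, not the regclass implementation: the lm fit on the built-in mtcars data is an assumed stand-in for TREE, and the names fit, pred, and marginal_fit are hypothetical.

```r
# Sketch only: illustrates the averaging described for marginal=TRUE.
# The lm model and the mtcars data are stand-ins chosen for illustration.
fit <- lm(mpg ~ cyl + wt + hp, data = mtcars)

# One predicted value of y for every row of the data
pred <- predict(fit, newdata = mtcars)

# Average the predictions over all rows sharing a common value of x (cyl)
marginal_fit <- tapply(pred, mtcars$cyl, mean)
x_values <- as.numeric(names(marginal_fit))

# Scatterplot of y vs. x with the averaged predictions overlaid
plot(mtcars$cyl, mtcars$mpg, xlab = "cyl", ylab = "mpg")
points(x_values, marginal_fit, pch = 19, col = "red")
lines(x_values, marginal_fit, col = "red")
```

With a predictor that takes many distinct values, each average would be based on few rows, which is presumably why a smoothed curve (smooth=TRUE) is offered.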

Details

The function shows a scatterplot of y vs. x in the on dataframe, then shows how TREE is modeling the relationship between y and x with predicted values of y for each row in the data and also a curve illustrating the relationship. It is useful for seeing what the relationship between y and x as modeled by TREE "looks like", both as a whole and for particular combinations of other variables. If marginal is FALSE, then differences in the curves indicate the presence of some interaction between x and another variable.
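
The marginal=FALSE view can be approximated by hand: hold every other predictor at the values of one row, vary x over a grid, and plot the predictions. The sketch below assumes an lm model with an interaction term on the built-in mtcars data as a stand-in for TREE, and it uses three hand-picked rows where the function itself selects rows at random; all object names are hypothetical.

```r
# Sketch only: one curve per selected row, other predictors held fixed.
# The model, data, and chosen rows are assumptions for illustration.
fit <- lm(mpg ~ wt * hp, data = mtcars)   # wt:hp is a deliberate interaction

# Rows whose characteristics are held fixed (hand-picked here)
rows <- mtcars[c("Mazda RX4", "Fiat 128", "Maserati Bora"), ]

# Grid of values for the predictor of interest
wt_grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 50)

# For each row, copy it along the grid, overwrite wt, and predict
curves <- sapply(seq_len(nrow(rows)), function(i) {
  newdata <- rows[rep(i, length(wt_grid)), ]
  newdata$wt <- wt_grid
  predict(fit, newdata = newdata)
})

# Slope of each curve; differing slopes reflect the wt:hp interaction
slopes <- apply(curves, 2, function(v) (v[50] - v[1]) / (wt_grid[50] - wt_grid[1]))

matplot(wt_grid, curves, type = "l", lty = 1, col = 1:3,
        xlab = "wt", ylab = "predicted mpg")
legend("topright", legend = rownames(rows), col = 1:3, lty = 1)
```

In a purely additive model the curves would be parallel, differing only in level; curves that differ in shape or slope are what point to an interaction.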

References

Introduction to Regression and Modeling

See Also

loess, lm, glm

Examples

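  library(regclass)      # visualize_relationship, SALARY, WINE
  library(randomForest)  # randomForest
  library(rpart)         # rpart

  # Random forest: marginal relationship between Salary and a predictor
  data(SALARY)
  FOREST <- randomForest(Salary~.,data=SALARY)
  visualize_relationship(FOREST,interest="Experience",on=SALARY)
  visualize_relationship(FOREST,interest="Months",on=SALARY,xlim=c(1,15),ylim=c(2500,4500))

  # Partition model: unsmoothed marginal relationship
  data(WINE)
  TREE <- rpart(Quality~.,data=WINE)
  visualize_relationship(TREE,interest="alcohol",on=WINE,smooth=FALSE)
  # One curve for each of 7 selected rows, other predictors held fixed
  visualize_relationship(TREE,interest="alcohol",on=WINE,marginal=FALSE,nplots=7,smooth=FALSE)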