treeheatr: an introduction

Your decision tree may be cool, but what if I tell you you can make it hot?

Changes in treeheatr 0.2.0

The first argument of heat_tree(), data, is now replaced with x, which can be a dataframe (or tibble), a party (or constparty) object specifying a precomputed tree, or a partynode object specifying a customized tree. The custom_tree argument is no longer needed.
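As a minimal sketch of the new interface, the snippet below fits a tree separately and passes it to heat_tree() as x. It assumes that partykit is installed, that the bundled penguins data has a factor column named species, and that heat_tree() can recover the plotting data from the fitted object; the name fit is illustrative.

```r
library(treeheatr)
library(partykit)

# Fit the tree yourself; ctree() returns a constparty object,
# which also inherits from the party class.
fit <- ctree(species ~ ., data = penguins)

# Pass the precomputed tree as x instead of a dataframe.
heat_tree(x = fit, target_lab = 'species')
```

Passing a dataframe as x remains the shortest route; a precomputed tree is mainly useful when you want control over how the tree is fitted.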

Install

Please make sure your version of R >= 3.5.0 before installation.
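One way to check this requirement from the R console is sketched below. It uses only base R; the variable name r_ok is illustrative.

```r
# Compare the running R version against the minimum stated above.
r_ok <- getRversion() >= '3.5.0'

if (!r_ok) {
  message('R ', as.character(getRversion()),
          ' is older than 3.5.0; please upgrade before installing treeheatr.')
}
```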

You can install the released version of treeheatr from CRAN with:

install.packages('treeheatr')

Or the development version from GitHub with remotes:

# install.packages('remotes') # uncomment to install remotes
remotes::install_github('trang1618/treeheatr')

Examples

Penguin dataset

Classification of penguin species.

library(treeheatr)

heat_tree(penguins, target_lab = 'species')

Wine recognition dataset

Classification of different cultivars of wine.

heat_tree(wine, target_lab = 'Type', target_lab_disp = 'Cultivar')

Citing treeheatr

If you use treeheatr in a scientific publication, please consider citing the following paper:

Le TT, Moore JH. treeheatr: an R package for interpretable decision tree visualizations. Bioinformatics. 2020 Jan 1.

BibTeX entry:

@article{le2020treeheatr,
  title={treeheatr: an R package for interpretable decision tree visualizations},
  author={Le, Trang T and Moore, Jason H},
  journal={Bioinformatics},
  year={2020},
  doi={10.1093/bioinformatics/btaa662}
}

How to Use

treeheatr incorporates a heatmap at the terminal node of your decision tree. The basic building blocks to a treeheatr plot are (yes, you guessed it!) a decision tree and a heatmap.

  • The decision tree is computed with partykit::ctree() and plotted with the well-documented and flexible ggparty package. The tree parameters can be passed to ggparty functions via the heat_tree() and draw_tree() functions of treeheatr. More details on the different ggparty geoms can be found in the ggparty documentation.

  • The heatmap is shown with ggplot2::geom_tile(). The user may choose to cluster the samples within each leaf node or the features across all samples.
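To make the two building blocks concrete, here is a rough sketch that combines them by hand. It is not treeheatr's own implementation: it uses the built-in iris data, assumes partykit and ggplot2 are installed, skips the clustering step, and all object names (fit, leaf, long, heat) are illustrative.

```r
library(partykit)
library(ggplot2)

# Building block 1: a conditional inference tree.
fit <- ctree(Species ~ ., data = iris)

# Assign every sample to its terminal (leaf) node.
leaf <- predict(fit, type = 'node')

# Order samples by leaf and standardize the four numeric features.
ord <- order(leaf)
feats <- scale(iris[ord, 1:4])

# Reshape to long format: one row per (sample, feature) tile.
long <- data.frame(
  sample  = rep(seq_len(nrow(feats)), times = ncol(feats)),
  feature = rep(colnames(feats), each = nrow(feats)),
  value   = as.vector(feats),
  leaf    = factor(rep(leaf[ord], times = ncol(feats)))
)

# Building block 2: a heatmap drawn with geom_tile(),
# with one panel per leaf node.
heat <- ggplot(long, aes(x = sample, y = feature, fill = value)) +
  geom_tile() +
  facet_grid(. ~ leaf, scales = 'free_x', space = 'free_x')
```

Printing heat shows the samples grouped by the leaf they fall into; heat_tree() goes further by drawing the tree above the heatmap and aligning the two.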

Make sure to check out the vignette for detailed information on the usage of treeheatr.

Please open an issue for questions related to treeheatr usage, bug reports or general inquiries.

Thank you very much for your support!

Version: 0.2.1

License: MIT + file LICENSE

Maintainer: Trang Le

Last Published: November 19th, 2020

Functions in treeheatr (0.2.1)

draw_heat

Draws the heatmap.
draw_tree

Draws the conditional decision tree.
prepare_feats

Prepares the feature dataframes for tiles.
prep_data

Prepare dataset.
clust_samp_func

Performs clustering of samples.
compute_tree

Compute decision tree from data set
heat_tree

Draws and aligns decision tree and heatmap.
eval_tree

Print decision tree performance according to different metrics.
prediction_df

Apply the predicted tree on either new test data or training data.
position_nodes

Creates smart node layout.
get_fit

Get the fitted tree depending on the input `x`.
get_disp_feats

Select the important features to be displayed.
penguins

Data of three different species of penguins.
term_node_pos

Determines terminal node position.
wine_quality_red

Red variant of the Portuguese "Vinho Verde" wine.
test_covid

External test dataset. Medical information of Wuhan patients collected between 2020-01-10 and 2020-02-18.
align_plots

Align decision tree and heatmap.
clust_feat_func

Performs clustering of features.
wine

Results of a chemical analysis of wines grown in a specific area of Italy.
train_covid

Training dataset. Medical information of Wuhan patients collected between 2020-01-10 and 2020-02-18. Containing NAs.
galaxy

Galaxy dataset for regression.
get_cols

Get color functions from character vectors
scale_norm

Performs transformation on continuous variables.
print.ggHeatTree

Print a ggHeatTree object. Adopted from https://github.com/daattali/ggExtra/blob/master/R/ggMarginal.R#L207-L244.
diabetes

Diabetes patient records.