oblique.tree (version 1.1.1)

oblique.tree: Fit an Oblique Tree to Classification Data

Description

An oblique tree is grown by binary recursive partitioning using the response in the specified formula with oblique splits composed of linear combinations of terms from the right-hand-side.

Usage

oblique.tree( formula, data, subset, control = tree.control(nobs, ...), method = "recursive.partition", split.impurity = c("deviance", "gini"), model = FALSE, oblique.splits = c("only", "on", "off"), variable.selection = c( "none", "model.selection.aic", "model.selection.bic", "lasso.aic", "lasso.bic"), ...)

Arguments

formula
A formula expression. The left-hand-side (response) should be a factor. The right-hand-side should be a series of numeric or factor variables separated by + (there should be no interaction terms). Both . and - are allowed.
data
A data frame in which to interpret formula and subset.
subset
An expression specifying the subset of cases to be used.
control
A list as returned by tree.control.
method
A character string specifying the method to use. The only other useful value is "model.frame".
split.impurity
Splitting criterion to use, either "deviance" or "gini".
model
A model frame containing a response and predictors that can be used in place of formula, data and subset to allow direct specification of the problem.
oblique.splits
Whether and how oblique splits should be used during tree-growth. "only" grows trees that consider only oblique splits, "on" grows trees that consider oblique and axis-parallel splits simultaneously, and "off" grows trees that consider only axis-parallel splits.
variable.selection
Whether and how concise oblique splits should be found during tree-growth. "none" grows oblique trees using full oblique splits. "model.selection.aic" performs variable selection with AIC, starting from the full model, upon the best ideal split; "model.selection.bic" does the same with BIC. "lasso.aic" applies L1 regularization to the full model and chooses the penalization parameter with AIC; "lasso.bic" does the same with BIC.
...
Additional arguments that are passed to tree.control. Normally used for mincut, minsize or mindev.

Value

An object of class c("oblique.tree","tree") is returned with components
frame
A data frame with a row for each node and row.names giving the node numbers. The columns include var, the variable used to perform each split (where "" denotes an oblique split and "<leaf>" a terminal node), n, the number of cases reaching that node, dev, the deviance of the node, yval, the class assigned to that node, split, a two-column matrix of the labels for the left and right splits at the node, and yprob, a matrix of fitted probabilities for each response level.
where
A vector indicating the row number of the frame detailing the node to which each case is assigned.
terms
The terms of the formula.
call
The matched call to oblique.tree.
y
Predicted classes of each observation by the tree (this differs from the implementation in the tree package where y is instead used to denote the actual classes).
A tree with no splits is of class "singlenode" which inherits from class "tree".
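
The components listed above can be inspected directly on a fitted object. The following is a minimal sketch, assuming the oblique.tree and MASS packages are installed; the helper training.accuracy is hypothetical (it is not part of the package) and the data are the augmented crabs data from the Examples section.

```r
# Hypothetical helper (not part of oblique.tree): fraction of cases whose
# predicted class equals the observed class.
training.accuracy <- function(actual, predicted) {
  mean(as.character(actual) == as.character(predicted))
}

# Run only when both packages are available.
if (requireNamespace("oblique.tree", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  library(oblique.tree)
  data(crabs, package = "MASS")
  aug.crabs.data <- data.frame(g = factor(rep(1:4, each = 50)),
                               predict(princomp(crabs[, 4:8]))[, 2:3])
  ob.tree <- oblique.tree(formula = g ~ ., data = aug.crabs.data,
                          oblique.splits = "only")

  print(class(ob.tree))                                 # class vector of the fit
  print(ob.tree$frame[, c("var", "n", "dev", "yval")])  # one row per node
  print(table(node = ob.tree$where))                    # cases per terminal node
  # y holds the predicted classes, so compare it with the observed response
  print(training.accuracy(aug.crabs.data$g, ob.tree$y))
}
```

Because y stores predicted rather than actual classes, the observed response has to be taken from the original data frame, as done here.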

Details

An oblique tree is grown by binary recursive partitioning using the response in the specified formula and by choosing splits composed of terms from the right-hand-side. Three kinds of split are considered. For categorical attributes, the levels of an unordered factor are divided into two non-empty groups. For axis-parallel splits, a numeric variable is divided into $X < a$ and $X \geq a$. For oblique splits, a linear combination of numeric variables is divided into $\sum aX < c$ and $\sum aX \geq c$. The split that maximizes the reduction in impurity is chosen and the process is repeated. Splitting continues until the terminal nodes are either pure, sufficiently pure or too small to be split.
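
To illustrate the difference between an axis-parallel split $X < a$ and an oblique split $\sum aX < c$, the sketch below compares the reduction in Gini impurity achieved by each on simulated data. It uses base R only. The functions gini and gini.reduction, the simulated data and the hand-picked oblique coefficients are all assumptions made for illustration; the package's own search and impurity calculations may differ.

```r
# Toy two-class data whose class boundary is the diagonal x1 + x2 = 0.
set.seed(1)
x1 <- rnorm(200)
x2 <- rnorm(200)
y  <- factor(ifelse(x1 + x2 >= 0, "a", "b"))

# Gini impurity of a vector of class labels.
gini <- function(cls) {
  p <- table(cls) / length(cls)
  1 - sum(p^2)
}

# Reduction in size-weighted Gini impurity when `goes.left` partitions a node.
gini.reduction <- function(cls, goes.left) {
  n <- length(cls)
  n.left <- sum(goes.left)
  gini(cls) -
    (n.left / n) * gini(cls[goes.left]) -
    ((n - n.left) / n) * gini(cls[!goes.left])
}

# Axis-parallel split X < a: search every threshold a on x1.
cuts <- sort(unique(x1))[-1]   # drop the minimum so neither side is empty
axis.gain <- max(sapply(cuts, function(a) gini.reduction(y, x1 < a)))

# Oblique split sum(a * X) < c with hand-picked a = (1, 1) and c = 0.
oblique.gain <- gini.reduction(y, x1 + x2 < 0)

c(axis.parallel = axis.gain, oblique = oblique.gain)
```

On these data the oblique split separates the classes exactly, so its impurity reduction equals the impurity of the parent node, while no single threshold on x1 can do as well.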

When growing oblique trees, $2^{R-1}-1$ logistic regression problems need to be solved (where $R$ is the number of residual classes at a node). If observations come from more than 20 classes, this approach will be slow.
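
The growth of $2^{R-1}-1$ with $R$ can be tabulated directly. The short sketch below uses a hypothetical helper, n.superclass.splits, which is not part of the package; the count coincides with the number of ways to divide $R$ classes into two non-empty groups.

```r
# Hypothetical helper: number of logistic regression problems at a node
# with R residual classes, following the formula 2^(R - 1) - 1.
n.superclass.splits <- function(R) 2^(R - 1) - 1

R <- c(2, 3, 4, 10, 20)
data.frame(R = R, problems = n.superclass.splits(R))
# problems: 1, 3, 7, 511, 524287
```

At 20 classes a single node already requires more than half a million fits, which is why the approach becomes slow beyond that point.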

References

Truong, A. (2009). Fast Growing and Interpretable Oblique Trees via Probabilistic Models.

Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge. Chapter 7.

See Also

tree.control in the tree package, predict.oblique.tree, prune.oblique.tree, trim.oblique.tree

Examples

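# load the package that provides oblique.tree()
library(oblique.tree)

# create the augmented crabs dataset: four groups of 50 crabs and the
# second and third principal components of the five measurements
data(crabs, package = "MASS")
aug.crabs.data <- data.frame(g = factor(rep(1:4, each = 50)),
                             predict(princomp(crabs[, 4:8]))[, 2:3])

# plot the two components, labelling each point by its group
plot(aug.crabs.data[, -1], type = "n")
text(aug.crabs.data[, -1],
     col    = as.numeric(aug.crabs.data[, 1]),
     labels = as.numeric(aug.crabs.data[, 1]))

# grow a full oblique tree
ob.tree <- oblique.tree(formula        = g ~ .,
                        data           = aug.crabs.data,
                        oblique.splits = "only")
plot(ob.tree)
text(ob.tree)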
