PrInDT (version 2.0.1)

PrInDTregAll: Regression tree based on all observations

Description

Regression tree based on the full sample; interpretability is checked (see 'ctestv').
The relationship between the target variable 'regname' and all other factor and numerical variables in the data frame 'datain' is modeled based on all observations.
The parameters 'conf.level', 'minsplit', and 'minbucket' can be used to control the size of the trees.
Besides the maximal R2, the minimal MAE (Mean Absolute Error) is reported.
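As a rough illustration of the two reported quality measures, the following base R sketch computes MAE and R2 from observed and predicted values. The numbers are made up, and the R2 definition used here (1 - SSE / SST) is an assumption; the package may compute its goodness of fit differently.

```r
# Hypothetical observed and predicted values, for illustration only
observed  <- c(3.0, 5.0, 7.5, 9.0)
predicted <- c(2.5, 5.5, 7.0, 9.5)

# Mean Absolute Error: average absolute deviation of the predictions
mae <- mean(abs(observed - predicted))

# Coefficient of determination (assumed definition: 1 - SSE / SST)
sse <- sum((observed - predicted)^2)        # residual sum of squares
sst <- sum((observed - mean(observed))^2)   # total sum of squares
r2  <- 1 - sse / sst

cat("MAE:", mae, " R2:", round(r2, 3), "\n")
```

A smaller MAE and a larger R2 both indicate a closer fit of the tree to the observations.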

Usage

PrInDTregAll(datain,regname,ctestv=NA,conf.level=0.95,minsplit=NA,minbucket=NA)

Value

treeall

tree based on all observations

R2All

goodness of fit of 'treeall' based on all observations

MAEAll

MAE of 'treeall' based on all observations

interpAll

criterion of interpretability of 'treeall' (TRUE / FALSE)

Arguments

datain

Input data frame with the regressand variable 'regname' and the
influential variables, which must be factors or numerical variables (convert logical and character variables to factors)

regname

name of regressand variable (character)

ctestv

Vector of character strings of forbidden split results
(see function PrInDT for details).
If no restrictions exist, the default (NA) is used.

conf.level

(1 - significance level) in function ctree (numerical, > 0 and <= 1);
default = 0.95

minsplit

Minimum number of elements in a node to be split;
default = 20

minbucket

Minimum number of elements in a node;
default = 7
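The requirement on 'datain' (only factors and numerical variables) can be met with a short preprocessing step. The following base R sketch converts logical and character columns to factors; the data frame and its column names are hypothetical and serve only as an illustration.

```r
# Hypothetical data frame; column names are illustrative only
df <- data.frame(
  target = c(1.2, 3.4, 2.2),
  group  = c("a", "b", "a"),
  flag   = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Find every logical or character column
to_factor <- vapply(df, function(x) is.logical(x) || is.character(x), logical(1))

# Convert those columns to factors; numerical columns stay untouched
df[to_factor] <- lapply(df[to_factor], factor)

str(df)
```

After this step, 'df' contains only factors and numerical variables and could be passed as 'datain'.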

Details

Standard output can be produced with print(name) or by typing just name; a plot is produced with plot(name). Here 'name' is the object returned by the function.

Examples

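library(PrInDT)
# Load the vowel data shipped with PrInDT and drop incomplete rows
data <- PrInDT::data_vowel
data <- na.omit(data)
# Forbidden split result used for the interpretability check
ctestv <- 'vowel_maximum_pitch <= 320'
# Fit the regression tree for the regressand "target" on all observations
outreg <- PrInDTregAll(data,"target",ctestv)
# Print the standard output and plot the tree
outreg
plot(outreg)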
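A possible variation of the example sets 'conf.level', 'minsplit', and 'minbucket' explicitly to control the size of the tree. The parameter values below are arbitrary illustrations, and accessing the result with `$` assumes that the returned object is list-like with the component names given under "Value"; this sketch only runs when the PrInDT package is installed.

```r
# Illustrative (not recommended) settings: stricter values tend to give smaller trees
conf_level <- 0.99
min_split  <- 40
min_bucket <- 15

if (requireNamespace("PrInDT", quietly = TRUE)) {
  data <- na.omit(PrInDT::data_vowel)
  ctestv <- 'vowel_maximum_pitch <= 320'
  outreg_small <- PrInDT::PrInDTregAll(
    data, "target", ctestv,
    conf.level = conf_level, minsplit = min_split, minbucket = min_bucket
  )
  # Assumption: components are accessible by the names listed under "Value"
  print(outreg_small$R2All)      # goodness of fit on all observations
  print(outreg_small$MAEAll)     # mean absolute error on all observations
  print(outreg_small$interpAll)  # TRUE / FALSE interpretability criterion
}
```

Comparing these values with those of the default call shows how much fit is traded for a smaller tree.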