# xgb.importance


##### Show importance of features in a model

Reads an xgboost model text dump and shows the importance of each feature. Works with both tree and linear models (text dumps of linear models are only supported in the dev version of xgboost for now).

##### Usage
xgb.importance(feature_names = NULL, filename_dump = NULL, model = NULL,
data = NULL, label = NULL, target = function(x) ((x + label) == 2))
##### Arguments
feature_names
names of each feature as a character vector. Can be extracted from a sparse matrix (see example). If the model dump already contains feature names, this argument should be NULL.
filename_dump
the path to the text file storing the model. The model dump must include the gain per feature and per tree (use with.stats = TRUE in the xgb.dump function).
model
a model generated by the xgb.train function. Providing it avoids the creation of a dump file.
data
the dataset used for the training step. Will be used with the label parameter for co-occurrence computation. See the Details section for more information. This parameter is optional.
label
the label vector used for the training step. Will be used with the data parameter for co-occurrence computation. See the Details section for more information. This parameter is optional.
target
a function which returns TRUE or 1 when an observation should be counted as a co-occurrence and FALSE or 0 otherwise. A default function is provided for computing co-occurrences in binary classification.
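The default target shown in Usage captures label from the calling environment; a self-contained sketch of the same logic, using hypothetical toy vectors rather than the mushroom data:

```r
# Sketch of the default target logic: an observation counts as a
# co-occurrence when the feature is present (x == 1) AND label == 1,
# i.e. x + label == 2. The vectors below are made up for illustration.
target_default <- function(x, label) (x + label) == 2

x     <- c(1, 0, 1, 1, 0)  # feature presence per observation
label <- c(1, 1, 0, 1, 0)  # outcome per observation
sum(target_default(x, label))  # number of co-occurrences: 2
```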
##### Details

This function helps you understand the trained model (and, through the model, your data).

Results are returned for both linear and tree models.

The function returns a data.table with the following columns:

• Feature: name of the feature as provided in feature_names or already present in the model dump.
• Gain: contribution of each feature to the model. For boosted tree models, the gain of each feature in each tree is taken into account, then averaged per feature to give a view of the entire model. A higher percentage means the feature is more important for predicting the label used for training;
• Cover: a metric of the number of observations related to this feature (only available for tree models);
• Weight: percentage representing the relative number of times a feature has been used in the trees. Gain should be preferred when searching for the most important features. For boosted linear models, this column has no meaning.
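The per-feature Gain aggregation described above can be sketched in base R; the gain values below are made up for illustration, not taken from a real model dump:

```r
# Each row stands for one (tree, feature) gain entry from a dump;
# the feature names and numbers are hypothetical.
gains <- data.frame(
  Feature = c("odor=none", "odor=none", "stalk-root=club"),
  Gain    = c(0.60, 0.30, 0.10)
)
avg <- tapply(gains$Gain, gains$Feature, mean)  # average gain per feature
round(avg / sum(avg), 3)  # normalize so contributions sum to 1
```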

##### Co-occurrence count

The gain gives you an indication of how important a feature is for making a branch of a decision tree purer. However, with this information alone, you cannot know whether the feature has to be present or absent to get a specific classification. In the example code, you may wonder whether odor=none should be TRUE to avoid eating a mushroom.

Co-occurrence computation is here to help in understanding the relation between a predictor and a specific class. It counts how many observations are returned as TRUE by the target function (see Arguments). When you execute the example below, there are only 92 of the 3140 observations in the train dataset where a mushroom has no odor and can be eaten safely.
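The same count can be done for every column of a one-hot matrix at once; a base R sketch with toy data standing in for the mushroom set (column names and values are hypothetical):

```r
# Toy one-hot matrix: rows are observations, columns are features
X <- matrix(c(1, 0, 1, 0,
              0, 1, 1, 0), ncol = 2,
            dimnames = list(NULL, c("odor=none", "odor=foul")))
label <- c(1, 0, 1, 1)  # 1 = the class of interest

# For each feature, count observations where the feature is present
# and the label is 1 (the default target's x + label == 2 rule)
colSums(X == 1 & label == 1)
```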

If you need to remember one thing only: unless you want to leave us early, don't eat a mushroom which has no odor :-)

##### Value

• A data.table of the features used in the model with their average gain (and their weight for boosted tree models).

##### Aliases
• xgb.importance
##### Examples
data(agaricus.train, package='xgboost')

# Both datasets are lists with two items, a sparse matrix and labels
# (labels = outcome column which will be learned).
# Each column of the sparse matrix is a feature in one-hot encoding format.
train <- agaricus.train

bst <- xgboost(data = train$data, label = train$label, max.depth = 2,
eta = 1, nthread = 2, nround = 2, objective = "binary:logistic")

# train$data@Dimnames[[2]] represents the column names of the sparse matrix.
xgb.importance(train$data@Dimnames[[2]], model = bst)

# Same thing with co-occurence computation this time
xgb.importance(train$data@Dimnames[[2]], model = bst, data = train$data, label = train$label)