This function calculates the average value (or percentage of each level) of y for each level of x. It then builds a partition model taking y to be this average value (or percentage) with x being the predictor variable. The first split yields the "best" scheme for combining levels of x into 2 values. The second split yields the "best" scheme for combining levels of x into 3 values, etc.
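The two steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the function's actual implementation: the toy data, the helper `sse`, and the unweighted treatment of the level means are all hypothetical, and the real function may fit its partition model differently.

```r
# Toy data (hypothetical): a factor x with five levels and a numeric y
x <- factor(c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e"))
y <- c(1, 3, 2, 2, 9, 11, 10, 12, 5, 7)

# Step 1: average value of y for each level of x
level_means <- tapply(y, x, mean)

# Step 2: the first split. With one numeric response per level, the best
# two-group scheme is a cut point in the sorted level means; here "best"
# is assumed to mean the smallest within-group sum of squares.
sorted <- sort(level_means)
n <- length(sorted)
sse <- function(v) sum((v - mean(v))^2)
cut_cost <- sapply(1:(n - 1), function(k) {
  sse(sorted[1:k]) + sse(sorted[(k + 1):n])
})
best_k <- which.min(cut_cost)

# Levels of x that would be combined into each of the 2 new values
groups <- list(low  = names(sorted)[1:best_k],
               high = names(sorted)[(best_k + 1):n])
print(groups)
```

Further splits would repeat the same search inside each group, giving schemes with 3, 4, ... combined levels.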
The argument maxlevels specifies the maximum number of levels in the combination scheme. By default, it is the number of levels of x (i.e., no combination). Setting it to a lower number saves time, since a small number of combined levels is usually what is wanted. This is useful for seeing how different combination schemes compare.
The argument target forces the algorithm to produce exactly this number of combined levels. This is useful once you have determined how many levels of x you want.
If recode is FALSE, a table is produced showing the combined levels along with the "BIC" of each combination scheme (lower is better, but a difference of around 4 or less is negligible). The suggested combination is the scheme with the fewest levels whose BIC is no more than 4 above the lowest BIC.
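The suggestion rule can be written out in a few lines of base R. The BIC values below are made up for illustration, and the data frame `schemes` with its column names is a hypothetical stand-in for the table the function produces.

```r
# Hypothetical BIC values for schemes with 2 to 6 combined levels
schemes <- data.frame(levels = 2:6,
                      BIC = c(120.3, 104.1, 101.8, 100.9, 103.5))

# Schemes whose BIC is no more than 4 above the lowest BIC
best_bic <- min(schemes$BIC)
eligible <- schemes[schemes$BIC <= best_bic + 4, ]

# Among those, suggest the scheme with the fewest levels
suggested <- min(eligible$levels)
print(suggested)
```

Here the lowest BIC belongs to the 5-level scheme, but the 3-level scheme is within 4 of it, so 3 levels would be suggested.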
If recode is TRUE, a list of three elements is produced. $Conversion1 gives a table of the Old and New levels alphabetized by Old, while $Conversion2 gives a table of the Old and New levels alphabetized by New. $newlevels gives a factor of each case's level under the new combination scheme. If target is not set, the suggested number of levels is used.
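To make the shape of these three elements concrete, here is a rough base R mock-up. It does not call the function itself: the mapping `map`, the new level names "Low" and "High", and the object names are hypothetical, and only the column names Old and New are taken from the description above.

```r
# Hypothetical original factor and an assumed old-to-new level mapping
x <- factor(c("a", "c", "b", "e", "d", "a"))
map <- c(a = "Low", b = "Low", e = "Low", c = "High", d = "High")

# A conversion table with one row per original level
conversion <- data.frame(Old = names(map), New = unname(map),
                         stringsAsFactors = FALSE)

# Analogous to $Conversion1: alphabetized by Old
by_old <- conversion[order(conversion$Old), ]

# Analogous to $Conversion2: alphabetized by New (ties broken by Old)
by_new <- conversion[order(conversion$New, conversion$Old), ]

# Analogous to $newlevels: each case's level under the new scheme
new_x <- factor(unname(map[as.character(x)]))
print(table(Old = x, New = new_x))
```

The final cross-tabulation is a quick check that every old level maps to exactly one new level.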