Algorithm Overview
The MOB algorithm executes in five sequential phases with strict monotonicity
enforcement integrated throughout:
Phase 1: Equal-Frequency Pre-binning
Initial bins are created by dividing the sorted data into approximately equal-sized
groups. The target number of observations per bin is:
$$n_{\text{bin}} = \left\lfloor \frac{N}{\min(\text{max\_n\_prebins}, n_{\text{unique}})} \right\rfloor$$
Bin boundaries are set to feature values at split points, ensuring no gaps between
consecutive bins. First and last boundaries are set to \(-\infty\) and \(+\infty\).
This approach balances statistical stability (sufficient observations per bin) with
granularity (ability to detect local patterns).
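Phase 1 can be sketched in a few lines of plain Python. This is an illustrative helper (the name `prebin_edges` and the exact split-point choice are assumptions, not the package's actual API): it places boundaries at feature values of the split points and pads with \(\pm\infty\).

```python
def prebin_edges(x, max_n_prebins=20):
    """Equal-frequency pre-binning sketch (hypothetical helper).

    Splits the sorted values into roughly equal-sized groups and
    returns bin edges whose outer boundaries are -inf and +inf.
    """
    xs = sorted(x)
    n = len(xs)
    n_groups = min(max_n_prebins, len(set(xs)))
    size = n // n_groups  # target observations per bin
    # interior boundaries are the feature values at the split points;
    # the set() deduplicates boundaries that land on tied values
    cuts = sorted({xs[j * size] for j in range(1, n_groups)})
    return [float("-inf")] + cuts + [float("inf")]
```

With 100 distinct values and `max_n_prebins=4`, this yields five edges delimiting four bins of 25 observations each.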
Phase 2: Rare Bin Merging
Bins with total count below bin_cutoff \(\times N\) are iteratively merged.
The merge direction (left or right) is chosen to minimize Information Value loss:
$$\text{direction} = \arg\min_{d \in \{\text{left}, \text{right}\}} \left( \text{IV}_{\text{before}} - \text{IV}_{\text{after merge}} \right)$$
where \(d \in \{-1, +1\}\) indexes the left or right neighbour:
$$\text{IV}_{\text{before}} = \text{IV}_i + \text{IV}_{i+d}$$
$$\text{IV}_{\text{after}} = (\text{DistGood}_{\text{merged}} - \text{DistBad}_{\text{merged}}) \times \text{WoE}_{\text{merged}}$$
Merging continues until all bins meet the frequency threshold or min_bins
is reached.
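A minimal sketch of Phase 2, under stated assumptions: the helper name `merge_rare` is hypothetical, counts are split into per-bin event/non-event totals, and the merge direction is chosen by maximizing the total IV remaining after the merge (equivalent to minimizing IV loss).

```python
import math

def merge_rare(pos, neg, bin_cutoff=0.05, min_bins=2, alpha=0.5):
    """Sketch of Phase 2: merge bins whose total count falls below
    bin_cutoff * N into the neighbour that loses the least IV."""
    pos, neg = list(pos), list(neg)
    n_total = sum(pos) + sum(neg)

    def total_iv(p_list, n_list):
        k = len(p_list)
        tp, tn = sum(p_list), sum(n_list)
        iv = 0.0
        for p, n in zip(p_list, n_list):
            dp = (p + alpha) / (tp + k * alpha)
            dn = (n + alpha) / (tn + k * alpha)
            iv += (dp - dn) * math.log(dp / dn)
        return iv

    def merged(i, j):
        # combine adjacent bins i and j into one
        lo, hi = (i, j) if i < j else (j, i)
        p = pos[:lo] + [pos[lo] + pos[hi]] + pos[hi + 1:]
        n = neg[:lo] + [neg[lo] + neg[hi]] + neg[hi + 1:]
        return p, n

    changed = True
    while changed and len(pos) > min_bins:
        changed = False
        for i in range(len(pos)):
            if pos[i] + neg[i] < bin_cutoff * n_total:
                cands = []
                if i > 0:
                    cands.append(merged(i, i - 1))
                if i < len(pos) - 1:
                    cands.append(merged(i, i + 1))
                # keep the candidate preserving the most IV
                pos, neg = max(cands, key=lambda pn: total_iv(*pn))
                changed = True
                break
    return pos, neg
```

Note the direction choice here compares total post-merge IV rather than only the local pair terms of the formula above; the two agree when the other bins' distributions change negligibly.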
Phase 3: Initial WoE/IV Calculation
Weight of Evidence for each bin \(i\) is computed with Laplace smoothing:
$$\text{WoE}_i = \ln\left(\frac{n_i^{+} + \alpha}{n^{+} + k\alpha} \bigg/ \frac{n_i^{-} + \alpha}{n^{-} + k\alpha}\right)$$
where \(\alpha = \text{laplace\_smoothing}\) and \(k\) is the current number
of bins. Information Value is:
$$\text{IV}_i = \left(\frac{n_i^{+} + \alpha}{n^{+} + k\alpha} - \frac{n_i^{-} + \alpha}{n^{-} + k\alpha}\right) \times \text{WoE}_i$$
Edge case handling (these branches arise only when \(\alpha = 0\), since positive smoothing keeps both distributions nonzero):
- If both distributions approach zero: \(\text{WoE}_i = 0\)
- If only the positive distribution is zero: \(\text{WoE}_i = -20\) (capped)
- If only the negative distribution is zero: \(\text{WoE}_i = +20\) (capped)
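The WoE/IV formulas and edge-case caps above can be sketched as follows (the function name `woe_iv` is illustrative; `alpha` plays the role of laplace_smoothing):

```python
import math

def woe_iv(pos_counts, neg_counts, alpha=0.5, cap=20.0):
    """Per-bin smoothed WoE and IV, with capped values when a
    smoothed distribution is zero (possible only when alpha == 0)."""
    k = len(pos_counts)
    n_pos, n_neg = sum(pos_counts), sum(neg_counts)
    woe, iv = [], []
    for p, n in zip(pos_counts, neg_counts):
        dist_pos = (p + alpha) / (n_pos + k * alpha)
        dist_neg = (n + alpha) / (n_neg + k * alpha)
        if dist_pos <= 0 and dist_neg <= 0:
            w = 0.0            # both empty: no evidence
        elif dist_pos <= 0:
            w = -cap           # only positives empty: cap at -20
        elif dist_neg <= 0:
            w = cap            # only negatives empty: cap at +20
        else:
            w = math.log(dist_pos / dist_neg)
        woe.append(w)
        iv.append((dist_pos - dist_neg) * w)
    return woe, iv
```

Because each IV term multiplies a difference by the log of the corresponding ratio, every term is nonnegative (ignoring the capped cases).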
Phase 4: Monotonicity Enforcement
The algorithm first determines the desired monotonicity direction by examining the
relationship between the first two bins:
$$\text{should\_increase} = \begin{cases} \text{TRUE} & \text{if } \text{WoE}_1 \ge \text{WoE}_0 \\ \text{FALSE} & \text{otherwise} \end{cases}$$
For each bin \(i\) from 1 to \(k-1\), violations are detected as:
$$\text{violation} = \begin{cases} \text{WoE}_i < \text{WoE}_{i-1} & \text{if should\_increase} \\ \text{WoE}_i > \text{WoE}_{i-1} & \text{if } \neg\text{should\_increase} \end{cases}$$
When a violation is found at index \(i\), the algorithm attempts two merge strategies:
1. Merge with previous bin: combine bins \(i-1\) and \(i\), then verify the merged bin's WoE is compatible with its neighbours:
$$\text{WoE}_{i-2} \le \text{WoE}_{\text{merged}} \le \text{WoE}_{i+1} \quad \text{(if should\_increase)}$$
2. Merge with next bin: if strategy 1 fails, merge bins \(i\) and \(i+1\).
Merging continues iteratively until either:
- All WoE values satisfy monotonicity constraints
- The number of bins reaches min_bins
- max_iterations is exceeded (triggers a warning)
After each merge, WoE and IV are recalculated for all bins to reflect updated
distributions.
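A simplified, self-contained sketch of Phase 4 (the name `enforce_monotonic` is hypothetical, and for brevity only the merge-with-previous strategy is implemented, without the neighbour-compatibility check or the fallback):

```python
import math

def enforce_monotonic(pos, neg, min_bins=2, max_iterations=100, alpha=0.5):
    """Merge a violating bin with its predecessor until the
    WoE sequence is monotone, then return the new counts."""
    pos, neg = list(pos), list(neg)

    def woe():
        # smoothed WoE, recomputed after every merge
        k = len(pos)
        tp, tn = sum(pos), sum(neg)
        return [math.log(((p + alpha) / (tp + k * alpha))
                         / ((n + alpha) / (tn + k * alpha)))
                for p, n in zip(pos, neg)]

    for _ in range(max_iterations):
        if len(pos) <= min_bins:
            break
        w = woe()
        # desired direction from the first two bins
        increasing = w[1] >= w[0]
        viol = None
        for i in range(1, len(w)):
            bad = w[i] < w[i - 1] if increasing else w[i] > w[i - 1]
            if bad:
                viol = i
                break
        if viol is None:
            break  # fully monotone: done
        pos[viol - 1] += pos.pop(viol)  # merge with previous bin
        neg[viol - 1] += neg.pop(viol)
    return pos, neg
```

For example, four bins with event counts `[5, 20, 10, 40]` against non-events `[40, 10, 30, 5]` have a WoE dip at the third bin; one merge of bins 2 and 3 restores an increasing pattern.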
Phase 5: Bin Count Reduction
If the number of bins exceeds max_bins after monotonicity enforcement,
additional merges are performed. The algorithm identifies the pair of adjacent bins
that minimizes IV loss when merged:
$$\text{merge\_idx} = \arg\min_{i=0}^{k-2} \left( \text{IV}_i + \text{IV}_{i+1} - \text{IV}_{\text{merged}} \right)$$
This greedy approach continues until \(k \le \text{max\_bins}\).
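The greedy reduction can be sketched as below (the helper name `reduce_bins` is illustrative; note that the merged pair's IV is evaluated in the new \(k-1\)-bin configuration, since the smoothed denominators depend on \(k\)):

```python
import math

def reduce_bins(pos, neg, max_bins, alpha=0.5):
    """Sketch of Phase 5: repeatedly merge the adjacent pair
    whose merge loses the least IV, until k <= max_bins."""
    pos, neg = list(pos), list(neg)

    def iv_terms(p_list, n_list):
        k = len(p_list)
        tp, tn = sum(p_list), sum(n_list)
        out = []
        for p, n in zip(p_list, n_list):
            dp = (p + alpha) / (tp + k * alpha)
            dn = (n + alpha) / (tn + k * alpha)
            out.append((dp - dn) * math.log(dp / dn))
        return out

    while len(pos) > max_bins:
        iv = iv_terms(pos, neg)
        best, best_loss = 0, float("inf")
        for i in range(len(pos) - 1):
            # IV of the pair after merging (k shrinks by one)
            mp = pos[:i] + [pos[i] + pos[i + 1]] + pos[i + 2:]
            mn = neg[:i] + [neg[i] + neg[i + 1]] + neg[i + 2:]
            loss = iv[i] + iv[i + 1] - iv_terms(mp, mn)[i]
            if loss < best_loss:
                best, best_loss = i, loss
        pos[best] += pos.pop(best + 1)
        neg[best] += neg.pop(best + 1)
    return pos, neg
```

Each pass scans all \(k-1\) adjacent pairs, which is the source of the \(O(k)\) per-merge cost in the complexity analysis below.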
Theoretical Foundations
Monotonicity as Stability Criterion: Zeng (2014) proves that
non-monotonic WoE patterns are unstable under population drift, leading to
unreliable predictions when the data distribution shifts.
Regulatory Compliance: Basel II/III validation requirements
(BCBS, 2005) explicitly require monotonic relationships between risk drivers
and probability of default for IRB models.
Information Preservation: While enforcing monotonicity reduces
model flexibility, Mironchyk & Tchistiakov (2017) demonstrate that the IV
loss is typically < 5% compared to unconstrained binning for real credit
portfolios.
Comparison with Related Methods
| Method | Monotonicity | Enforcement | Use Case |
|--------|--------------|-------------|----------|
| MOB | Guaranteed | During optimization | Regulatory scorecards |
| MBLP | Target | Iterative post-process | General credit models |
| MDLP | Optional | Post-hoc merging | Exploratory analysis |
| LDB | Optional | Post-hoc merging | Research/prototyping |
Computational Complexity
- Sorting: \(O(N \log N)\)
- Pre-binning: \(O(N)\)
- Rare bin merging: \(O(k^2 \times I_{\text{rare}})\), where \(I_{\text{rare}}\) is the number of rare bins
- Monotonicity enforcement: \(O(k^2 \times I_{\text{mono}})\), where \(I_{\text{mono}}\) is the number of violations (worst case: \(O(k)\))
- Bin reduction: \(O(k \times (k_{\text{initial}} - \text{max\_bins}))\)
- Total: \(O(N \log N + k^2 \times \text{max\_iterations})\)
For typical credit scoring datasets (\(N \sim 10^5\), \(k \sim 5\)), runtime
is dominated by sorting. Pathological cases (e.g., perfectly alternating WoE values)
may require \(O(k^2)\) merges.