Trend filtering is a classical method for nonparametric regression. The most commonly used model is the L1 trend filtering model \((a)\), built on the difference matrix \(\boldsymbol{D}^{(q+1)}\): $$\min _{\boldsymbol{\alpha} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{\alpha}\|_2^2 + \lambda\|\boldsymbol{D}^{(q+1)} \boldsymbol{\alpha}\|_{\ell_1}, \quad q=0,1,2, \ldots. \quad (a) $$ L0 trend filtering \((b)\) has an advantage over other trend filtering methods, especially for detecting change points: $$\min _{\boldsymbol{\alpha} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{\alpha}\|_2^2 + \lambda\|\boldsymbol{D}^{(q+1)} \boldsymbol{\alpha}\|_{\ell_0}. \quad (b) $$ We transform problem \((b)\) into the L0-regularized sparse regression format \((c)\) by introducing an artificial design matrix \(\boldsymbol{X}^{(q+1)}\) that corresponds to the difference matrix: writing \(\boldsymbol{\alpha} = \boldsymbol{X}^{(q+1)} \boldsymbol{\beta}\), the coefficients \(\boldsymbol{\beta}_i\) for \(i \geq q+2\) play the role of the \((q+1)\)-th order differences of \(\boldsymbol{\alpha}\), so the L0 penalty on them matches \(\|\boldsymbol{D}^{(q+1)} \boldsymbol{\alpha}\|_{\ell_0}\). The L0 trend filtering problem then becomes $$\min _{\boldsymbol{\beta} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{X}^{(q+1)}\boldsymbol{\beta}\|_2^2 + \lambda \sum_{i=q+2}^n |\boldsymbol{\beta}_i|_{\ell_0}. \quad (c) $$ In practice we treat the maximum number of change points \(k_{\text{max}}\) as a constraint, turning the L0 penalty problem \((c)\) into the following L0 constraint problem: $$\min_{\boldsymbol{\beta} \in \mathbb{R}^n} \frac{1}{2}\|\boldsymbol{y}-\boldsymbol{X}^{(q+1)}\boldsymbol{\beta}\|_2^2,\quad \text{ subject to } \sum_{i=q+2}^n |\boldsymbol{\beta}_i|_{\ell_0} \leq k_{\text{max}}. \quad (d)$$ We solve the L0 constraint problem \((d)\) with a splicing-based algorithm. This package provides the following seven main methods:
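To make the step from \((b)\) to \((c)\) concrete, the sketch below (plain base R, not code from this package) builds \(\boldsymbol{D}^{(q+1)}\) by repeated differencing and uses one natural choice of \(\boldsymbol{X}^{(1)}\) for \(q=0\), the lower-triangular matrix of ones; the package's own constructor may parameterize \(\boldsymbol{X}^{(q+1)}\) differently. The final check shows that \(\boldsymbol{D}^{(1)} \boldsymbol{X}^{(1)}\) simply selects \(\boldsymbol{\beta}_2, \ldots, \boldsymbol{\beta}_n\), which is why the penalties in \((b)\) and \((c)\) coincide.

```r
## Difference matrix D^(q+1): (n - q - 1) x n, built by repeated row differencing.
make_D <- function(n, q) diff(diag(n), differences = q + 1)

## One possible design matrix X^(1) for q = 0: lower-triangular matrix of ones,
## so that alpha = X^(1) %*% beta and beta_i = alpha_i - alpha_{i-1} for i >= 2.
make_X1 <- function(n) {
  X <- matrix(0, n, n)
  X[lower.tri(X, diag = TRUE)] <- 1
  X
}

n  <- 6
D1 <- make_D(n, q = 0)
X1 <- make_X1(n)
D1 %*% X1   # equals [0 | I_(n-1)]: it picks out beta_2, ..., beta_n
```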
\(\quad\)Generate the \(\boldsymbol{X}^{(q+1)}\) or \(\boldsymbol{D}^{(q+1)}\) matrix.
\(\quad\)Simplify the computation of \(((\boldsymbol{X}^{(q+1)}_A)^T \boldsymbol{X}^{(q+1)}_A)^{-1}\) for \(q=0\) and \(q=1\), a quantity that the splicing algorithm evaluates repeatedly (a small numerical illustration follows this list).
\(\quad\)Fit a piecewise constant or piecewise linear estimated trend with a given number of change points.
\(\quad\)Fit a piecewise constant or piecewise linear estimated trend with a maximum number of change points, and select the optimal estimated trend using appropriate information criteria.
\(\quad\)Generate piecewise constant or piecewise linear data (see the data sketch after this list).
\(\quad\)Print a summary of the trend estimation results.
\(\quad\)Plot a summary of the trend estimation results.
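As a small numerical illustration of the second method above (for the \(q=0\) case only, using the lower-triangular ones parameterization of \(\boldsymbol{X}^{(1)}\) assumed earlier, not the package's internal code), the Gram matrix of the active columns has the closed form \(\left((\boldsymbol{X}^{(1)}_A)^T \boldsymbol{X}^{(1)}_A\right)_{jk} = n - \max(A_j, A_k) + 1\), so it can be filled in directly without ever forming \(\boldsymbol{X}^{(1)}_A\):

```r
## Numerical check (illustrative only) of the q = 0 closed form for the Gram
## matrix of active columns: (X_A^T X_A)[j, k] = n - max(A[j], A[k]) + 1.
n  <- 8
X1 <- matrix(0, n, n)
X1[lower.tri(X1, diag = TRUE)] <- 1          # X^(1): lower-triangular ones
A  <- c(1, 3, 6)                             # an illustrative active set
G_direct  <- crossprod(X1[, A])              # (X_A)^T X_A computed directly
G_formula <- outer(A, A, function(j, k) n - pmax(j, k) + 1)
all.equal(G_direct, G_formula, check.attributes = FALSE)   # TRUE
```

Structured forms of this kind keep the per-iteration cost of splicing low even when \(n\) is large.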
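The data-generation method can be pictured with the following generic base-R sketch; the change-point locations, segment levels, and noise level here are assumptions for illustration, and the package's own generator may use different arguments:

```r
## Illustrative piecewise-constant data (assumed change points and noise level),
## the kind of signal targeted by q = 0 trend filtering.
set.seed(1)
n   <- 120
tau <- c(40, 80)                                      # assumed change-point locations
mu  <- rep(c(0, 2, -1), times = diff(c(0, tau, n)))   # piecewise-constant mean
y   <- mu + rnorm(n, sd = 0.3)                        # noisy observations
plot(y, type = "l"); lines(mu, col = "red", lwd = 2)
```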
_PACKAGE
In previous studies, algorithms for the trend filtering problem \((a)\) require computing \(((\boldsymbol{D}^{(q+1)})^T \boldsymbol{D}^{(q+1)})^{-1}\). When \(n\) is large, even storing this \(n \times n\) matrix becomes an issue; for instance, a dense double-precision matrix with \(n = 10^5\) already occupies roughly 80 GB of memory.
In L0 trend filtering \((b)\), the positions of the non-zero elements counted by the L0 norm coincide with the locations of the change points. We therefore consider two index sets: the active set \(A\) of non-zero elements and the inactive set \(I\) of zero elements. Even so, computing \(((\boldsymbol{D}^{(q+1)}_I)^T \boldsymbol{D}^{(q+1)}_I)^{-1}\) still involves a large matrix, because the inactive set \(I\) typically contains the vast majority of the row indices.
Because of the close connection between L0 constraint and L0 penalty problems, and because the sparsity level of \(\boldsymbol{\beta}\) is more meaningful in practical applications than the choice of the hyperparameter \(\lambda\), we focus on the constrained formulation, which directly reflects our aim of estimating a trend with a given number of change points. We therefore transform the L0 penalty problem \((c)\) into the L0 constraint problem \((d)\).
Kim SJ, Koh K, Boyd SP and Gorinevsky DM (2009). L1 Trend Filtering. SIAM Review.
Wen C, Wang X and Zhang A (2023). L0 Trend Filtering. INFORMS Journal on Computing.