Fits models using adaptive three operator splitting (ATOS) with general convex penalties. Supports linear and logistic regression, each with dense and sparse matrix implementations.
atos(
  X,
  y,
  type = "linear",
  prox_1,
  prox_2,
  pen_prox_1 = 0.5,
  pen_prox_2 = 0.5,
  max_iter = 5000,
  backtracking = 0.7,
  max_iter_backtracking = 100,
  tol = 1e-05,
  prox_1_opts = NULL,
  prox_2_opts = NULL,
  standardise = "l2",
  intercept = TRUE,
  x0 = NULL,
  u = NULL,
  verbose = FALSE
)
An object of class "atos" containing:
The fitted values from the regression. Taken to be the more stable fit between x and u, which is usually the former.
The solution to the original problem (see Pedregosa and Gidel (2018)).
The solution to the dual problem (see Pedregosa and Gidel (2018)).
The updated values from applying the first proximal operator (see Pedregosa and Gidel (2018)).
Indicates which type of regression was performed.
Logical flag indicating whether ATOS converged, according to tol.
Number of iterations performed. If convergence is not reached, this will be max_iter.
Final value of the convergence criterion.
Logical flag indicating whether an intercept was fit.
X
Input matrix of dimensions \(n \times p\). Can be a sparse matrix (using class "sparseMatrix" from the Matrix package).
y
Output vector of dimension \(n\). For type="linear" it needs to be continuous, and for type="logistic" it needs to be a binary variable.
type
The type of regression to perform. Supported values are "linear" and "logistic".
prox_1
The proximal operator for the first function, \(h(x)\).
prox_2
The proximal operator for the second function, \(g(x)\).
pen_prox_1
The penalty for the first proximal operator. For the lasso, this would be the sparsity parameter, \(\lambda\). If the operator does not include a penalty, set to 1.
pen_prox_2
The penalty for the second proximal operator.
max_iter
Maximum number of ATOS iterations to perform.
backtracking
The backtracking parameter, \(\tau\), as defined in Pedregosa and Gidel (2018).
max_iter_backtracking
Maximum number of backtracking line search iterations to perform per global iteration.
tol
Convergence tolerance for the stopping criterion.
prox_1_opts
Optional argument for the first proximal operator. For the group lasso, this would be the group IDs. Note: this must be supplied as a list.
prox_2_opts
Optional argument for the second proximal operator.
standardise
Type of standardisation to perform on X:
  "l2" standardises the input data to have \(\ell_2\) norms of one.
  "l1" standardises the input data to have \(\ell_1\) norms of one.
  "sd" standardises the input data to have standard deviation of one.
  "none" applies no standardisation.
intercept
Logical flag for whether to fit an intercept.
x0
Optional initial vector for \(x_0\).
u
Optional initial vector for \(u\).
verbose
Logical flag for whether to print fitting information.
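To make the standardisation options concrete, the following base R sketch rescales the columns of a small matrix to unit \(\ell_2\) norm, unit \(\ell_1\) norm, or unit standard deviation. The helper name standardise_cols is hypothetical, and it assumes the scaling is applied column-wise without centring, which is not stated here; treat it as an illustration rather than the routine used inside atos().

```r
# Hypothetical helper: rescale each column of X according to `method`.
# Illustrative sketch only; not the routine used inside atos().
standardise_cols <- function(X, method = c("l2", "l1", "sd", "none")) {
  method <- match.arg(method)
  scale_by <- switch(method,
    l2   = sqrt(colSums(X^2)),   # Euclidean norm of each column
    l1   = colSums(abs(X)),      # sum of absolute values of each column
    sd   = apply(X, 2, sd),      # sample standard deviation of each column
    none = rep(1, ncol(X))       # leave the columns unchanged
  )
  # Divide column j by scale_by[j]
  sweep(X, 2, scale_by, "/")
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5, ncol = 4)

X_l2 <- standardise_cols(X, "l2")
X_l1 <- standardise_cols(X, "l1")
X_sd <- standardise_cols(X, "sd")
```

After rescaling, sqrt(colSums(X_l2^2)), colSums(abs(X_l1)) and apply(X_sd, 2, sd) should each return a vector of ones.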
atos() solves convex minimization problems of the form
$$
f(x) + g(x) + h(x),
$$
where \(f\) is convex and differentiable with \(L_f\)-Lipschitz gradient, and \(g\) and \(h\) are both convex.
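The penalties \(g\) and \(h\) are supplied to atos() through their proximal operators. As a sketch, here are base R proximal operators for the lasso (soft-thresholding) and the group lasso (group-wise soft-thresholding). The function names and the argument layout (input, lambda, group_info) are assumptions made for illustration; the calling convention that atos() expects should be checked against the package before reusing them.

```r
# Proximal operator of lambda * ||x||_1: element-wise soft-thresholding.
l1_prox <- function(input, lambda) {
  sign(input) * pmax(0, abs(input) - lambda)
}

# Proximal operator of lambda * sum_g ||x_g||_2 for non-overlapping groups.
# `group_info` is a list holding the group IDs (assumed layout), in line
# with the note that optional operator arguments are passed as a list.
group_l1_prox <- function(input, lambda, group_info) {
  group_ids <- group_info[[1]]
  out <- numeric(length(input))
  for (g in unique(group_ids)) {
    idx <- which(group_ids == g)
    group_norm <- sqrt(sum(input[idx]^2))
    # Shrink the whole group towards zero; zero it out if its norm <= lambda
    shrink <- if (group_norm > 0) max(0, 1 - lambda / group_norm) else 0
    out[idx] <- shrink * input[idx]
  }
  out
}

v <- c(3, -4, 0.5, -0.2, 2)
groups <- c(1, 1, 2, 2, 3)

lasso_step <- l1_prox(v, lambda = 1)
group_step <- group_l1_prox(v, lambda = 1, group_info = list(groups))
```

The lasso operator shrinks each entry individually, whereas the group operator keeps or removes the entries of a group together; combining the two penalties gives the sparse-group lasso.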
The algorithm is not symmetrical, but the differences between the variations are usually only small numerical values, which are filtered out.
Both variations should nevertheless be checked by inspecting x and u. An example for the sparse-group lasso (SGL) is given.
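One way to carry out this check is to compare the two candidate solutions both directly and through the objective value. The base R sketch below evaluates a sparse-group lasso objective for linear regression at two vectors standing in for x and u. The helper sgl_objective, the \(1/(2n)\) loss scaling, the unweighted group penalty, and the stand-in vectors are all assumptions for illustration; none of them is taken from the output of atos().

```r
# Hypothetical helper: sparse-group lasso objective for linear regression,
#   (1 / (2n)) * ||y - X b||_2^2 + lambda_1 * ||b||_1 + lambda_2 * sum_g ||b_g||_2
sgl_objective <- function(b, X, y, groups, lambda_1, lambda_2) {
  loss <- sum((y - X %*% b)^2) / (2 * nrow(X))
  l1_pen <- lambda_1 * sum(abs(b))
  group_norms <- tapply(b, groups, function(b_g) sqrt(sum(b_g^2)))
  loss + l1_pen + lambda_2 * sum(group_norms)
}

set.seed(2)
n <- 30
groups <- c(1, 1, 1, 2, 2, 3)
X <- matrix(rnorm(n * length(groups)), nrow = n)
b_true <- c(2, -1, 0, 0, 0, 1.5)
y <- as.numeric(X %*% b_true + rnorm(n, sd = 0.1))

# Stand-ins for the two solutions: they agree up to tiny numerical noise
x_sol <- c(1.9, -0.9, 0, 0, 0, 1.4)
u_sol <- x_sol + c(0, 0, 1e-9, -1e-9, 0, 0)

max_diff <- max(abs(x_sol - u_sol))
obj_x <- sgl_objective(x_sol, X, y, groups, lambda_1 = 0.5, lambda_2 = 0.5)
obj_u <- sgl_objective(u_sol, X, y, groups, lambda_1 = 0.5, lambda_2 = 0.5)

# Filter out the numerical noise in u before comparing supports
u_clean <- ifelse(abs(u_sol) < 1e-6, 0, u_sol)
same_support <- identical(x_sol != 0, u_clean != 0)
```

If the objective values or the supports differ noticeably, that suggests inspecting both x and u rather than relying on the default fit alone.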
Pedregosa, F., Gidel, G. (2018). Adaptive Three Operator Splitting, https://proceedings.mlr.press/v80/pedregosa18a.html