The Sylvester flow uses two triangular matrices (R1 and R2) and Householder reflections to construct invertible transformations.
The transformation is parameterized as follows:
$$z = Q R_1 h(R_2 Q^T z_k + b) + z_k,$$
where:
Q is an orthogonal matrix obtained via Householder reflections.
R1 and R2 are upper triangular matrices with learned diagonal elements.
h is a non-linear activation function (default: torch_tanh).
b is a learned bias vector.
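The forward pass can be illustrated with a minimal sketch. It is written in plain Python rather than torch, uses `math.tanh` in place of `torch_tanh`, and assumes the form z = z_k + Q R1 h(R2 Q^T z_k + b). The helper names (`matvec`, `householder`, `orthogonal_from_reflections`, `sylvester_forward`) and all parameter values are hypothetical and hand-picked for illustration; they are not taken from the package.

```python
import math

def matvec(A, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def householder(v):
    """Return H = I - 2 v v^T / (v^T v), an orthogonal reflection."""
    n = len(v)
    norm_sq = sum(vi * vi for vi in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm_sq
             for j in range(n)] for i in range(n)]

def orthogonal_from_reflections(vectors):
    """Build Q as the product of one Householder reflection per vector."""
    n = len(vectors[0])
    Q = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for v in vectors:
        Q = matmul(Q, householder(v))
    return Q

def sylvester_forward(z_k, Q, R1, R2, b, h=math.tanh):
    """Compute z = z_k + Q R1 h(R2 Q^T z_k + b)."""
    pre = [a + bi for a, bi in zip(matvec(R2, matvec(transpose(Q), z_k)), b)]
    update = matvec(Q, matvec(R1, [h(p) for p in pre]))
    return [zi + ui for zi, ui in zip(z_k, update)]

# Toy 3-dimensional example; the values below are assumptions, not learned.
Q = orthogonal_from_reflections([[1.0, 2.0, 0.5], [0.3, -1.0, 2.0]])
R1 = [[0.5, 0.2, -0.1], [0.0, 0.8, 0.3], [0.0, 0.0, -0.4]]   # upper triangular
R2 = [[1.0, -0.5, 0.2], [0.0, 0.6, 0.1], [0.0, 0.0, 0.9]]    # upper triangular
b = [0.1, -0.2, 0.3]
z_k = [0.5, -1.0, 2.0]
z = sylvester_forward(z_k, Q, R1, R2, b)
```

Because Q is a product of reflections, it is orthogonal by construction, so no explicit orthogonalization step is needed in this sketch.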
The log determinant of the Jacobian, which the change-of-variables formula requires to track the density through the transformation, is given by:
$$\log |\det J| = \sum_{i=1}^d \log \left| \mathrm{diag}_1[i] \cdot \mathrm{diag}_2[i] \cdot h'(R_2 Q^T z_k + b)[i] + 1 \right|,$$
where diag_1 and diag_2 are the learned diagonal elements of R1 and R2, respectively, and h' is the derivative of the activation function.
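For reference, the log-determinant formula can be recovered in a few lines. This is a sketch that assumes the transformation has the form $z = z_k + Q R_1 h(R_2 Q^T z_k + b)$ with $Q^T Q = I$; the shorthand $a$ and $D$ is introduced here only for the derivation. Differentiating with respect to $z_k$ gives

$$a = R_2 Q^T z_k + b, \qquad D = \mathrm{diag}(h'(a)), \qquad J = \frac{\partial z}{\partial z_k} = I + Q R_1 D R_2 Q^T.$$

Sylvester's determinant identity, $\det(I + AB) = \det(I + BA)$, applied with $A = Q$ and $B = R_1 D R_2 Q^T$, yields

$$\det J = \det(I + R_1 D R_2 Q^T Q) = \det(I + R_1 D R_2).$$

Since $R_1 D R_2$ is upper triangular with diagonal entries $\mathrm{diag}_1[i] \cdot h'(a)[i] \cdot \mathrm{diag}_2[i]$, the determinant is the product of the diagonal of $I + R_1 D R_2$:

$$\det J = \prod_{i=1}^d \left( 1 + \mathrm{diag}_1[i] \cdot \mathrm{diag}_2[i] \cdot h'(a)[i] \right).$$

Taking the logarithm of the absolute value turns the product into the sum given above.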