This distribution models a random vector Y = (Y1,...,Yk) in R^k, making use of
a SinhArcsinh transformation (which has adjustable tailweight and skew),
a rescaling, and a shift.
The SinhArcsinh transformation of the Normal is described in great depth in
"Sinh-arcsinh distributions".
Here we use a slightly different parameterization, in terms of tailweight
and skewness. Additionally, we allow for distributions other than Normal,
and provide control over scale as well as a "shift" parameter loc.
Usage
tfd_vector_sinh_arcsinh_diag(
  loc = NULL,
  scale_diag = NULL,
  scale_identity_multiplier = NULL,
  skewness = NULL,
  tailweight = NULL,
  distribution = NULL,
  validate_args = FALSE,
  allow_nan_stats = TRUE,
  name = "VectorSinhArcsinhDiag"
)
Arguments
loc: Floating-point Tensor. If this is set to NULL, loc is implicitly 0. When specified, may have shape [B1, ..., Bb, k], where b >= 0 and k is the event size.
scale_diag: Non-zero, floating-point Tensor representing a diagonal matrix added to scale. May have shape [B1, ..., Bb, k], b >= 0, and characterizes b-batches of k x k diagonal matrices added to scale. When both scale_identity_multiplier and scale_diag are NULL then scale is the Identity.
scale_identity_multiplier: Non-zero, floating-point Tensor representing a scale-identity-matrix added to scale. May have shape [B1, ..., Bb], b >= 0, and characterizes b-batches of scaled k x k identity matrices added to scale. When both scale_identity_multiplier and scale_diag are NULL then scale is the Identity.
skewness: Skewness parameter. Floating-point Tensor with shape broadcastable with event_shape.
tailweight: Tailweight parameter. Floating-point Tensor with shape broadcastable with event_shape.
distribution: tf$distributions$Distribution-like instance. Distribution from which k iid samples are used as input to transformation F. Default is tfd_normal(loc = 0, scale = 1). Must be a scalar-batch, scalar-event distribution. Typically distribution$reparameterization_type = FULLY_REPARAMETERIZED or it is a function of non-trainable parameters. WARNING: If you backprop through a VectorSinhArcsinhDiag sample and distribution is not FULLY_REPARAMETERIZED yet is a function of trainable variables, then the gradient will be incorrect!
validate_args: Logical, default FALSE. When TRUE, distribution parameters are checked for validity despite possibly degrading runtime performance. When FALSE, invalid inputs may silently render incorrect outputs.
allow_nan_stats: Logical, default TRUE. When TRUE, statistics (e.g., mean, mode, variance) use the value NaN to indicate the result is undefined. When FALSE, an exception is raised if one or more of the statistic's batch members are undefined.
name: Name prefixed to Ops created by this class.
Value
A distribution instance.
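The descriptions of scale_diag and scale_identity_multiplier say that both terms are added to scale. As a rough illustration only, the following base-R sketch assembles such a matrix for a single batch member with k = 3. It does not call tfprobability, and the numeric values are made up for the example.

```r
# Illustrative values (assumptions, not package defaults).
k <- 3
scale_diag <- c(1, 2, 0.5)         # length-k diagonal entries
scale_identity_multiplier <- 0.1   # scalar multiple of the k x k identity

# scale = diag(scale_diag) + scale_identity_multiplier * I
scale <- diag(scale_diag, nrow = k) + scale_identity_multiplier * diag(k)
scale
```

The result is a diagonal k x k matrix, which is why applying scale to a length-k vector reduces to elementwise multiplication.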
Mathematical Details
Given an iid random vector Z = (Z1,...,Zk), we define the VectorSinhArcsinhDiag
transformation of Z, Y, parameterized by (loc, scale, skewness, tailweight),
via the relation (with @ denoting matrix multiplication):

  Y := loc + scale @ F(Z) * (2 / F_0(2))
  F(Z) := Sinh( (Arcsinh(Z) + skewness) * tailweight )
  F_0(Z) := Sinh( Arcsinh(Z) * tailweight )
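The relation above can be sketched in base R for a diagonal scale. This is a minimal illustration, not the tfprobability implementation: the helper names sinh_arcsinh and vector_sinh_arcsinh_diag are hypothetical, Z is assumed to be standard Normal, and scalar skewness and tailweight are used for simplicity.

```r
# F(z) = sinh((asinh(z) + skewness) * tailweight); hypothetical helper.
sinh_arcsinh <- function(z, skewness = 0, tailweight = 1) {
  sinh((asinh(z) + skewness) * tailweight)
}

# Y = loc + scale @ F(Z) * (2 / F_0(2)) for a diagonal scale; hypothetical helper.
vector_sinh_arcsinh_diag <- function(z, loc, scale_diag,
                                     skewness = 0, tailweight = 1) {
  # F_0(2): the zero-skewness transform evaluated at 2.
  f0_at_2 <- sinh_arcsinh(2, skewness = 0, tailweight = tailweight)
  # With a diagonal scale, scale @ F(Z) is elementwise multiplication.
  loc + scale_diag * sinh_arcsinh(z, skewness, tailweight) * (2 / f0_at_2)
}

set.seed(1)
z <- rnorm(3)                 # k = 3 iid standard Normal draws
loc <- c(0, 1, -1)
scale_diag <- c(1, 2, 0.5)

# Defaults (skewness = 0, tailweight = 1) reduce to loc + scale_diag * z.
y_default <- vector_sinh_arcsinh_diag(z, loc, scale_diag)
# Non-default shape parameters reshape each coordinate.
y_shaped <- vector_sinh_arcsinh_diag(z, loc, scale_diag,
                                     skewness = 0.5, tailweight = 1.5)
```

Comparing y_default with loc + scale_diag * z is a quick way to confirm that the default parameters give back the plain location-scale transformation.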
This distribution is similar to the location-scale transformation
L(Z) := loc + scale @ Z in the following ways:
- If skewness = 0 and tailweight = 1 (the defaults), F(Z) = Z, and then Y = L(Z) exactly.
- loc is used in both to shift the result by a constant offset.
- The multiplication of scale by 2 / F_0(2) ensures that if skewness = 0, then P[Y - loc <= 2 * scale] = P[L(Z) - loc <= 2 * scale]. Thus it can be said that the weights in the tails of Y and L(Z) beyond loc + 2 * scale are the same.
This distribution differs from loc + scale @ Z due to the reshaping done by F:
- Positive (negative) skewness leads to positive (negative) skew.
  - Positive skew means the mode of F(Z) is "tilted" to the right.
  - Positive skew means positive values of F(Z) become more likely, and negative values become less likely.
- Larger (smaller) tailweight leads to fatter (thinner) tails.
  - Fatter tails mean larger values of |F(Z)| become more likely.
  - tailweight < 1 leads to a distribution that is "flat" around Y = loc, with a very steep drop-off in the tails.
  - tailweight > 1 leads to a distribution more peaked at the mode, with heavier tails.
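One way to see these effects numerically is through quantiles. Because F is strictly increasing in Z for positive tailweight, the p-quantile of F(Z) is F applied to the p-quantile of Z. The base-R sketch below assumes a standard Normal Z and uses a hypothetical helper f. It looks at F(Z) only, without the loc, scale, and 2 / F_0(2) factors.

```r
# F(z) = sinh((asinh(z) + skewness) * tailweight); hypothetical helper.
f <- function(z, skewness = 0, tailweight = 1) {
  sinh((asinh(z) + skewness) * tailweight)
}

p <- c(0.001, 0.5, 0.999)
q_z <- qnorm(p)                        # quantiles of standard Normal Z

q_base  <- f(q_z)                      # defaults: same as the Normal quantiles
q_skew  <- f(q_z, skewness = 0.5)      # every quantile moves to the right
q_heavy <- f(q_z, tailweight = 1.5)    # extreme quantiles move outward
q_light <- f(q_z, tailweight = 0.5)    # extreme quantiles move inward

round(rbind(q_base, q_skew, q_heavy, q_light), 3)
```

With positive skewness the median becomes positive, and with tailweight above (below) 1 the 0.1% and 99.9% quantiles lie further from (closer to) zero than in the Normal case.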
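To see the argument about the tails, note that for |Z| >> 1 and
|Z| >> (|skewness| * tailweight)**tailweight, Arcsinh(Z) is approximately
sign(Z) * log(2 |Z|), so we have
F(Z) approx 0.5 * sign(Z) * (2 |Z|)**tailweight * e**(sign(Z) * skewness * tailweight).
Since Y is an affine function of F(Z), |Y - loc| grows like |Z|**tailweight in the tails.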
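The tail behaviour can be checked numerically. The base-R sketch below compares F(Z) with its leading-order form for large |Z|, obtained from asinh(z) ~ sign(z) * log(2 |z|). Here the constant 2**tailweight and the sign of Z are kept explicit. The helper names f and f_tail and the parameter values are assumptions made for this illustration.

```r
# Exact transform; hypothetical helper.
f <- function(z, skewness, tailweight) {
  sinh((asinh(z) + skewness) * tailweight)
}

# Leading-order form for large |z|, using asinh(z) ~ sign(z) * log(2 * |z|).
f_tail <- function(z, skewness, tailweight) {
  0.5 * sign(z) * (2 * abs(z))^tailweight *
    exp(sign(z) * skewness * tailweight)
}

z <- c(-1e4, -1e2, 1e2, 1e4)
skewness <- 0.3
tailweight <- 1.2

# Relative error of the approximation at each z.
rel_err <- abs(f(z, skewness, tailweight) / f_tail(z, skewness, tailweight) - 1)
rel_err
```

The relative error shrinks as |z| grows, which is consistent with |F(Z)| growing like |Z|**tailweight in both tails.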
To see the argument regarding multiplying scale by 2 / F_0(2), note that
if F = F_0 (that is, skewness = 0), then because F_0 is increasing:

  P[(Y - loc) / scale <= 2] = P[F(Z) * (2 / F_0(2)) <= 2]
                            = P[F(Z) <= F_0(2)]
                            = P[Z <= 2]
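The chain of equalities can be checked in base R for a single coordinate. This sketch assumes a standard Normal Z and skewness = 0. The helpers f0 and f0_inv, and the value tailweight = 1.7, are hypothetical choices for the illustration.

```r
# F_0 and its inverse (both strictly increasing for tailweight > 0).
f0 <- function(z, tailweight) sinh(asinh(z) * tailweight)
f0_inv <- function(y, tailweight) sinh(asinh(y) / tailweight)

tailweight <- 1.7

# {F_0(Z) * (2 / F_0(2)) <= 2} is {F_0(Z) <= F_0(2)}, i.e. {Z <= z_star}.
z_star <- f0_inv(f0(2, tailweight), tailweight)   # recovers 2 up to rounding
p_exact <- pnorm(z_star)

# Monte Carlo cross-check with standard Normal draws.
set.seed(42)
z <- rnorm(1e5)
p_mc <- mean(f0(z, tailweight) * (2 / f0(2, tailweight)) <= 2)

c(p_exact = p_exact, p_normal = pnorm(2), p_mc = p_mc)
```

Whatever tailweight is chosen, the probability mass below loc + 2 * scale should match that of the untransformed Z, which is the purpose of the 2 / F_0(2) factor.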
For usage examples see e.g. tfd_sample(), tfd_log_prob(), tfd_mean().
Other distributions:
tfd_autoregressive(), tfd_batch_reshape(), tfd_bates(), tfd_bernoulli(),
tfd_beta_binomial(), tfd_beta(), tfd_binomial(), tfd_categorical(),
tfd_cauchy(), tfd_chi2(), tfd_chi(), tfd_cholesky_lkj(),
tfd_continuous_bernoulli(), tfd_deterministic(), tfd_dirichlet_multinomial(),
tfd_dirichlet(), tfd_empirical(), tfd_exp_gamma(), tfd_exp_inverse_gamma(),
tfd_exponential(), tfd_gamma_gamma(), tfd_gamma(),
tfd_gaussian_process_regression_model(), tfd_gaussian_process(),
tfd_generalized_normal(), tfd_geometric(), tfd_gumbel(), tfd_half_cauchy(),
tfd_half_normal(), tfd_hidden_markov_model(), tfd_horseshoe(),
tfd_independent(), tfd_inverse_gamma(), tfd_inverse_gaussian(),
tfd_johnson_s_u(), tfd_joint_distribution_named_auto_batched(),
tfd_joint_distribution_named(),
tfd_joint_distribution_sequential_auto_batched(),
tfd_joint_distribution_sequential(), tfd_kumaraswamy(), tfd_laplace(),
tfd_linear_gaussian_state_space_model(), tfd_lkj(), tfd_log_logistic(),
tfd_log_normal(), tfd_logistic(), tfd_mixture_same_family(), tfd_mixture(),
tfd_multinomial(), tfd_multivariate_normal_diag_plus_low_rank(),
tfd_multivariate_normal_diag(), tfd_multivariate_normal_full_covariance(),
tfd_multivariate_normal_linear_operator(), tfd_multivariate_normal_tri_l(),
tfd_multivariate_student_t_linear_operator(), tfd_negative_binomial(),
tfd_normal(), tfd_one_hot_categorical(), tfd_pareto(), tfd_pixel_cnn(),
tfd_poisson_log_normal_quadrature_compound(), tfd_poisson(),
tfd_power_spherical(), tfd_probit_bernoulli(), tfd_quantized(),
tfd_relaxed_bernoulli(), tfd_relaxed_one_hot_categorical(),
tfd_sample_distribution(), tfd_sinh_arcsinh(), tfd_skellam(),
tfd_spherical_uniform(), tfd_student_t_process(), tfd_student_t(),
tfd_transformed_distribution(), tfd_triangular(), tfd_truncated_cauchy(),
tfd_truncated_normal(), tfd_uniform(), tfd_variational_gaussian_process(),
tfd_vector_diffeomixture(), tfd_vector_exponential_diag(),
tfd_vector_exponential_linear_operator(), tfd_vector_laplace_diag(),
tfd_vector_laplace_linear_operator(), tfd_von_mises_fisher(),
tfd_von_mises(), tfd_weibull(), tfd_wishart_linear_operator(),
tfd_wishart_tri_l(), tfd_wishart(), tfd_zipf()