Alternatively, for non-vector, multivariate distributions (e.g., matrix-valued, Wishart),
Covariance shall return a (batch of) matrices under some vectorization of the events, i.e.,
Cov[i, j] = Covariance(Vec(X)_i, Vec(X)_j) = [as above]
where Cov is a (batch of) k x k matrices, 0 <= (i, j) < k = reduce_prod(event_shape),
and Vec is some function mapping indices of this distribution's event dimensions to indices of a
length-k vector.