For all functions except ltappareto
, arguments lambda
and theta
can either be scalars or vectors of the same length as x
, p
, or q
. If a scalar, then this value is assumed to hold over all cases. If a vector, then the values are assumed to have a one to one relationship with the values in x
, p
, or q
. The argument a
is a scalar.In the case of ltappareto
, all data
are assumed to be drawn from the same distribution and hence lambda
, theta
and a
are all scalars.
Let $Y$ be an exponential random variable with parameter $\lambda > 0$. Then the distribution function of $Y$ is
$$F_Y(y) = \Pr{Y < y } = 1 - \exp(-\lambda y),$$
and the density function is
$$f_Y(y) = \lambda \exp(-\lambda y).$$
Further, the mean and variance of the distribution of $Y$ is $1/\lambda$ and $1/\lambda^2$, respectively.
Now transform $Y$ as
$$X = a \exp(Y),$$
where $a>0$. Then $X$ is a Pareto random variable with shape parameter $\lambda$ and distribution function
$$F_X(x) = \Pr{X < x } = 1 - \left( \frac{a}{x} \right)^\lambda,$$
where $a \le x < \infty$, and density function
$$f_X(x) = \frac{\lambda}{a} \left( \frac{a}{x} \right)^{\lambda+1}.$$
We simulate the Pareto deviates by generating exponential deviates, and then transforming as described above.
As above, let $X$ be Pareto with shape parameter $\lambda$, and define $W - a$ to be exponential with parameter $1/\theta$, i.e.
$$\Pr{X > x} = \left( \frac{a}{x} \right)^\lambda$$
and
$$\Pr{W > w} = \exp\left( \frac{a - w}{\theta} \right),$$
where $a \le w < \infty$. Say we sample one independent value from each of the distributions $X$ and $W$, then
$$\Pr{X > z\ \&\ W > z} = \Pr{X > z} \Pr{ W > z} = \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).$$
We say that $Z$ has a tapered Pareto distribution if it has the above distribution, i.e.
$$F_Z(z) = \Pr{Z < z} = 1- \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).$$
The above relationship shows that a tapered Pareto deviate can be simulated by generating independent values of $X$ and $W$, and then letting $Z = \min(X, W)$. This minimum has the effect of tapering the tail of the Pareto distribution.
The tapered Pareto variable $Z$ has density
$$f_Z(z) = \left( \frac{\lambda}{z} + \frac{1}{\theta} \right) \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).$$
Given a sample of data $z_1, z_2, \cdots, z_n$, we write the log-likelihood as
$$\log L = \sum_{i=1}^n \log f_Z(z_i).$$
Hence the gradients are calculated as
$$\frac{\partial \log L}{\partial \lambda} = \theta \sum_{i=1}^n \frac{1}{\lambda \theta + z_i} - \sum_{i=1}^n \log(z_i/a)$$
and
$$\frac{\partial \log L}{\partial \theta} = \frac{-1}{\theta} \sum_{i=1}^n \frac{z_i}{\lambda \theta + z_i} - \frac{1}{\theta^2} \sum_{i=1}^n (a - z_i).$$
Further, the Hessian is calculated using
$$\frac{\partial^2 \log L}{\partial \lambda^2} = -\theta^2 \sum_{i=1}^n \frac{1}{(\lambda \theta + z_i)^2},$$
$$\frac{\partial^2 \log L}{\partial \theta^2} = \frac{1}{\theta^2} \sum_{i=1}^n \frac{z_i(2\lambda\theta + z_i)}{(\lambda \theta + z_i)^2} - \frac{2}{\theta^3} \sum_{i=1}^n (a - z_i),$$
and
$$\frac{\partial^2 \log L}{\partial \theta \, \partial \lambda} = \frac{\partial^2 \log L}{\partial \lambda \, \partial \theta} = \sum_{i=1}^n \frac{z_i}{(\lambda \theta + z_i)^2}.$$
See the section Seismological Context (below), which outlines its application in Seismology.