For a sample \(S\) of size \(n\), drawn from a finite population \(U\) of size \(N\) with inclusion probabilities \(\pi_i=P(i\in S)\) (argument Pi), several formulations of the Gini index have been proposed in the literature. This function estimates the Gini index under each of these formulations, and both R and C++ implementations are available. This is useful for research purposes and allows speed comparisons between the two implementations. The available methods for estimating the Gini index are (see also Muñoz et al., 2023):
method = 1 (Langel and Tillé, 2013)
$$ \widehat{G}_{w1}= \displaystyle \frac{1}{2\widehat{N}^{2}\overline{y}_{w}}\sum_{i \in S}\sum_{j \in S}w_{i}w_{j}|y_{i}-y_{j}|,$$
where \(\widehat{N}=\sum_{i \in S}w_i\), \(\overline{y}_{w}=\widehat{N}^{-1}\sum_{i \in S}w_{i}y_{i}\), and \(w_i\) are the survey weights. For example, the survey weights can be \(w_i=\pi_{i}^{-1}\). At least one of w and Pi must be provided. When both are provided, it is required that \(w_i = \pi_i^{-1}\), for \(i \in S\).
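As an illustration of the double sum in method 1, the following is a minimal Python sketch, not the R or C++ code of this function. The names `gini_w1` and `weights_from_pi` are hypothetical, and the quadratic loop is written for clarity rather than speed.

```python
def weights_from_pi(pi):
    """Hypothetical helper: survey weights w_i = 1 / pi_i."""
    return [1.0 / p for p in pi]


def gini_w1(y, w):
    """Method 1: weighted sum of |y_i - y_j| over 2 * N_hat^2 * y_bar."""
    n_hat = sum(w)                                        # N_hat
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / n_hat  # weighted mean
    total = 0.0
    for wi, yi in zip(w, y):
        for wj, yj in zip(w, y):
            total += wi * wj * abs(yi - yj)
    return total / (2.0 * n_hat ** 2 * y_bar)


y = [1.0, 2.0, 3.0, 4.0]
print(gini_w1(y, [1.0, 1.0, 1.0, 1.0]))                      # equal weights
print(gini_w1(y, weights_from_pi([0.5, 0.5, 0.25, 0.25])))   # weights from Pi
```

With equal weights and \(y = (1, 2, 3, 4)\), the sum of absolute differences is 20 and the denominator is 80, so the sketch returns 0.25.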
method = 2 (Alfons and Templ, 2012; Langel and Tillé, 2013)
$$ \widehat{G}_{w2} =\displaystyle \frac{2\sum_{i \in S}w_{(i)}^{*}\widehat{N}_{(i)}y_{(i)} - \sum_{i \in S}w_{i}^{2}y_{i} }{\widehat{N}^{2}\overline{y}_{w}}-1,$$
where \(y_{(i)}\) are the values \(y_i\) sorted in increasing order, \(w_{(i)}^{*}\) are the values \(w_i\) sorted according to the increasing order of the values \(y_i\), and \(\widehat{N}_{(i)}=\sum_{j=1}^{i}w_{(j)}^{*}\). Langel and Tillé (2013) show that \(\widehat{G}_{w1} = \widehat{G}_{w2}\).
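The sorted formulation of method 2 can be sketched in Python as follows. This is an illustrative sketch under the definitions above, not the code of this function. The name `gini_w2` is hypothetical. A single pass over the sorted sample accumulates \(\widehat{N}_{(i)}\), so the cost is dominated by the sort.

```python
def gini_w2(y, w):
    """Method 2: sorted form with cumulative weights N_hat_(i)."""
    n_hat = sum(w)                                   # N_hat
    total_y = sum(wi * yi for wi, yi in zip(w, y))   # N_hat * y_bar
    cum = 0.0       # running N_hat_(i)
    s_rank = 0.0    # sum of w*_(i) * N_hat_(i) * y_(i)
    s_sq = 0.0      # sum of w_i^2 * y_i
    for yi, wi in sorted(zip(y, w)):                 # increasing order of y
        cum += wi
        s_rank += wi * cum * yi
        s_sq += wi * wi * yi
    return (2.0 * s_rank - s_sq) / (n_hat * total_y) - 1.0


print(gini_w2([3.0, 1.0, 4.0, 2.0], [1.0, 1.0, 1.0, 1.0]))
```

The input does not need to be sorted beforehand, since the sketch sorts the pairs \((y_i, w_i)\) itself.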
method = 3 (Berger, 2008)
$$ \widehat{G}_{w3} = \displaystyle \frac{2}{\widehat{N}\overline{y}_{w}}\sum_{i \in S}w_{i}y_{i}\widehat{F}_{w}^{\ast}(y_{i})-1, $$
where
$$\widehat{F}_{w}^{\ast}(t) = \displaystyle \frac{1}{\widehat{N}}\sum_{i \in S}w_{i}[\delta(y_i < t) + 0.5\delta(y_i = t)] $$
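is the smooth (mid-point) distribution function, and \(\delta(\cdot)\) is the indicator function, which takes the value 1 when its argument is true and the value 0 otherwise. It can be seen that \(\widehat{G}_{w2} = \widehat{G}_{w3}\): for distinct values \(y_i\), \(\widehat{F}_{w}^{\ast}(y_{(i)}) = (\widehat{N}_{(i)} - w_{(i)}^{*}/2)/\widehat{N}\), and substituting this into \(\widehat{G}_{w3}\) yields \(\widehat{G}_{w2}\).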
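A direct Python transcription of method 3 is sketched below, assuming nothing beyond the two formulas above. The names `mid_point_cdf` and `gini_w3` are hypothetical, and the distribution function is recomputed for every unit, so this version is quadratic in the sample size.

```python
def mid_point_cdf(t, y, w):
    """Smooth (mid-point) distribution function evaluated at t."""
    n_hat = sum(w)
    acc = 0.0
    for wi, yi in zip(w, y):
        if yi < t:
            acc += wi          # full weight strictly below t
        elif yi == t:
            acc += 0.5 * wi    # half weight for values equal to t
    return acc / n_hat


def gini_w3(y, w):
    """Method 3: 2 / (N_hat * y_bar) * sum(w_i * y_i * F*(y_i)) - 1."""
    total_y = sum(wi * yi for wi, yi in zip(w, y))   # N_hat * y_bar
    s = sum(wi * yi * mid_point_cdf(yi, y, w) for wi, yi in zip(w, y))
    return 2.0 * s / total_y - 1.0


print(gini_w3([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]))
```

Because tied values each receive half of the tied weight, a sample in which all values are equal gives an estimate of 0.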
method = 4 (Berger and Gedik-Balay, 2020)
$$\widehat{G}_{w4} = 1 - \displaystyle \frac{\overline{z}_{w}}{\overline{y}_{w}},$$
where \(\overline{z}_{w}=\widehat{N}^{-1}\sum_{i \in S}w_{i}z_{i}\) and
$$z_{i} = \displaystyle \frac{1}{\widehat{N} - w_{i}}\sum_{ \substack{j \in S\\ j\neq i}}w_{j}\min(y_{i},y_{j}).$$
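Method 4 can be sketched in Python as below. The sketch assumes that each term \(\min(y_i, y_j)\) carries the weight \(w_j\), so that \(z_i\) is a weighted mean over the units other than \(i\) (the weights \(w_j\), \(j \neq i\), sum to \(\widehat{N} - w_i\)). The name `gini_w4` is hypothetical, and at least two sample units are needed to avoid a zero denominator.

```python
def gini_w4(y, w):
    """Method 4: 1 - z_bar / y_bar, with z_i a leave-one-out weighted mean.

    Assumption: each min(y_i, y_j) is weighted by w_j.
    """
    n = len(y)
    n_hat = sum(w)                                        # N_hat
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / n_hat  # weighted mean
    z_sum = 0.0                                           # sum of w_i * z_i
    for i in range(n):
        inner = 0.0
        for j in range(n):
            if j != i:
                inner += w[j] * min(y[i], y[j])
        z_i = inner / (n_hat - w[i])
        z_sum += w[i] * z_i
    z_bar = z_sum / n_hat
    return 1.0 - z_bar / y_bar


print(gini_w4([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]))
```

Under this assumption and with equal weights, the result is \(n/(n-1)\) times the method 1 estimate, so \(y = (1, 2, 3, 4)\) gives 1/3 rather than 0.25.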
method = 5 (Lerman and Yitzhaki, 1989)
$$\widehat{G}_{w5} = \displaystyle \frac{2}{\widehat{N}\overline{y}_{w}} \sum_{i \in S} w_{i}[y_{i} - \overline{y}_{w}]\left[ \widehat{F}_{w}^{LY}(y_{i}) - \overline{F}_{w}^{LY} \right], $$
where
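$$\widehat{F}_{w}^{LY}(y_{(i)}) = \displaystyle \frac{1}{\widehat{N}}\left(\widehat{N}_{(i-1)} + \frac{w_{(i)}^{\ast}}{2} \right), $$
with \(\widehat{N}_{(0)}=0\), and \(\overline{F}_{w}^{LY}=\widehat{N}^{-1}\sum_{i \in S}w_{i}\widehat{F}_{w}^{LY}(y_{i})\). Here \(\widehat{N}_{(i)}\) and \(w_{(i)}^{\ast}\) are the sorted quantities defined for method 2.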
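The covariance form of method 5 can be sketched in Python as follows. This is an illustrative sketch, not the code of this function. The name `gini_w5` is hypothetical, and tied values are assumed to be ranked in the order returned by a stable sort, since the formulas above do not state a tie rule.

```python
def gini_w5(y, w):
    """Method 5: weighted covariance between y and the LY distribution function."""
    n = len(y)
    n_hat = sum(w)                                        # N_hat
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / n_hat  # weighted mean
    order = sorted(range(n), key=lambda i: y[i])          # increasing order of y
    f = [0.0] * n
    cum = 0.0                                             # N_hat_(i-1), starts at 0
    for i in order:
        f[i] = (cum + w[i] / 2.0) / n_hat                 # F^LY at the unit's own y
        cum += w[i]
    f_bar = sum(wi * fi for wi, fi in zip(w, f)) / n_hat
    cov_sum = sum(wi * (yi - y_bar) * (fi - f_bar)
                  for wi, yi, fi in zip(w, y, f))
    return 2.0 * cov_sum / (n_hat * y_bar)


print(gini_w5([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]))
```

For \(y = (1, 2, 3, 4)\) with equal weights, the sketch agrees with the method 1 value of 0.25 up to floating-point rounding.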