# HellingerDist

##### Generic function for the computation of the Hellinger distance of two distributions

Generic function for the computation of the Hellinger distance \(d_h\) of two distributions \(P\) and \(Q\), which may be defined on an arbitrary sample space \((\Omega,{\cal A})\). The Hellinger distance is defined as $$d_h(P,Q)=\frac{1}{2}\int|\sqrt{dP}\,-\sqrt{dQ}\,|^2$$ where \(\sqrt{dP}\) and \(\sqrt{dQ}\) denote the square roots of the respective densities.
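
As a short orientation, and assuming \(P\) and \(Q\) are probability measures whose densities are taken with respect to a common dominating measure, expanding the square in the definition gives $$d_h(P,Q)=\frac{1}{2}\int\Big(dP+dQ-2\sqrt{dP\,dQ}\,\Big)=1-\int\sqrt{dP\,dQ}\,$$ because \(\int dP=\int dQ=1\). Under these assumptions \(0\le d_h(P,Q)\le 1\), with \(d_h(P,Q)=0\) for \(P=Q\) and \(d_h(P,Q)=1\) exactly when \(\int\sqrt{dP\,dQ}=0\), that is, when \(P\) and \(Q\) are mutually singular.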

- Keywords
- distribution

##### Usage

```
HellingerDist(e1, e2, ...)
# S4 method for AbscontDistribution,AbscontDistribution
HellingerDist(e1, e2,
              rel.tol = .Machine$double.eps^0.3,
              TruncQuantile = getdistrOption("TruncQuantile"),
              IQR.fac = 15, ..., diagnostic = FALSE)
# S4 method for AbscontDistribution,DiscreteDistribution
HellingerDist(e1, e2, ...)
# S4 method for DiscreteDistribution,AbscontDistribution
HellingerDist(e1, e2, ...)
# S4 method for DiscreteDistribution,DiscreteDistribution
HellingerDist(e1, e2, ...)
# S4 method for numeric,DiscreteDistribution
HellingerDist(e1, e2, ...)
# S4 method for DiscreteDistribution,numeric
HellingerDist(e1, e2, ...)
# S4 method for numeric,AbscontDistribution
HellingerDist(e1, e2, asis.smooth.discretize = "discretize",
              n.discr = getdistrExOption("nDiscretize"), low.discr = getLow(e2),
              up.discr = getUp(e2), h.smooth = getdistrExOption("hSmooth"),
              rel.tol = .Machine$double.eps^0.3,
              TruncQuantile = getdistrOption("TruncQuantile"),
              IQR.fac = 15, ..., diagnostic = FALSE)
# S4 method for AbscontDistribution,numeric
HellingerDist(e1, e2, asis.smooth.discretize = "discretize",
              n.discr = getdistrExOption("nDiscretize"), low.discr = getLow(e1),
              up.discr = getUp(e1), h.smooth = getdistrExOption("hSmooth"),
              rel.tol = .Machine$double.eps^0.3,
              TruncQuantile = getdistrOption("TruncQuantile"),
              IQR.fac = 15, ..., diagnostic = FALSE)
# S4 method for AcDcLcDistribution,AcDcLcDistribution
HellingerDist(e1, e2,
              rel.tol = .Machine$double.eps^0.3,
              TruncQuantile = getdistrOption("TruncQuantile"),
              IQR.fac = 15, ..., diagnostic = FALSE)
```

##### Arguments

- e1
object of class `"Distribution"` or class `"numeric"`

- e2
object of class `"Distribution"` or class `"numeric"`

- asis.smooth.discretize
possible methods are `"asis"`, `"smooth"` and `"discretize"`. Default is `"discretize"`.

- n.discr
if `asis.smooth.discretize` is equal to `"discretize"` one has to specify the number of lattice points used to discretize the abs. cont. distribution.

- low.discr
if `asis.smooth.discretize` is equal to `"discretize"` one has to specify the lower end point of the lattice used to discretize the abs. cont. distribution.

- up.discr
if `asis.smooth.discretize` is equal to `"discretize"` one has to specify the upper end point of the lattice used to discretize the abs. cont. distribution.

- h.smooth
if `asis.smooth.discretize` is equal to `"smooth"` (i.e., the empirical distribution of the provided data should be smoothed) one has to specify this parameter.

- rel.tol
relative accuracy requested in integration

- TruncQuantile
quantile for the quantile based integration bounds (see details)

- IQR.fac
factor for the scale based integration bounds (see details)

- …
further arguments to be used in particular methods (in package distrEx: just used for distributions with a.c. parts, where it is used to pass on arguments to `distrExIntegrate`).

- diagnostic
logical; if `TRUE`, the return value obtains an attribute `"diagnostic"` with diagnostic information on the integration, i.e., a list with entries `method` (`"integrate"` or `"GLIntegrate"`), `call`, `result` (the complete return value of the method), `args` (the args with which the method was called), and `time` (the time to compute the integral).

##### Details

For distances between absolutely continuous distributions, we use numerical
integration; to determine sensible bounds we proceed as follows:
by means of `min(getLow(e1,eps=TruncQuantile),getLow(e2,eps=TruncQuantile))`,
`max(getUp(e1,eps=TruncQuantile),getUp(e2,eps=TruncQuantile))` we determine
quantile based bounds `c(low.0,up.0)`, and by means of
`s1 <- max(IQR(e1),IQR(e2));` `m1 <- median(e1);` `m2 <- median(e2)`
and `low.1 <- min(m1,m2)-s1*IQR.fac`, `up.1 <- max(m1,m2)+s1*IQR.fac`
we determine scale based bounds; these are combined by
`low <- max(low.0,low.1)`, `up <- min(up.0,up.1)`.
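
The bound computation described above can be sketched in base R. This is only an illustration, not the package's code: `qnorm` stands in for the `getLow`, `getUp`, `median` and `IQR` methods of two normal distributions, the helper names `q1` and `q2` and the value of `TruncQuantile` are assumptions made for the example, and the final step assumes the two intervals are meant to be intersected.

```r
# Sketch only: base R quantile functions replace the distr accessors.
TruncQuantile <- 1e-5  # assumed value, for illustration
IQR.fac <- 15          # default from the usage section

q1 <- function(p) qnorm(p, mean = 0, sd = 1)  # quantile function of N(0, 1)
q2 <- function(p) qnorm(p, mean = 1, sd = 2)  # quantile function of N(1, 2^2)

# quantile based bounds
low.0 <- min(q1(TruncQuantile), q2(TruncQuantile))
up.0  <- max(q1(1 - TruncQuantile), q2(1 - TruncQuantile))

# scale based bounds
s1 <- max(q1(0.75) - q1(0.25), q2(0.75) - q2(0.25))  # larger of the two IQRs
m1 <- q1(0.5)
m2 <- q2(0.5)
low.1 <- min(m1, m2) - s1 * IQR.fac
up.1  <- max(m1, m2) + s1 * IQR.fac

# combination: keep the narrower of the two intervals
low <- max(low.0, low.1)
up  <- min(up.0, up.1)
c(low = low, up = up)
```

For these two distributions the quantile based bounds are the binding ones; the scale based bounds mainly guard against very heavy tails, where extreme quantiles can lie far out.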

In case we want to compute the Hellinger distance between (empirical) data
and an abs. cont. distribution, we can specify the parameter
`asis.smooth.discretize` to avoid trivial distances (distance = 1).

Using `asis.smooth.discretize = "discretize"`, which is the default,
leads to a discretization of the provided abs. cont. distribution and
the distance is computed between the provided data and the discretized
distribution.

Using `asis.smooth.discretize = "smooth"` causes smoothing of the
empirical distribution of the provided data. That is, the empirical
distribution is convolved with the normal distribution
`Norm(mean = 0, sd = h.smooth)`, which leads to an abs. cont.
distribution. Afterwards the distance between the smoothed empirical
distribution and the provided abs. cont. distribution is computed.
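
The smoothing step can be illustrated in base R. The following is a sketch rather than the package's implementation: convolving the empirical distribution with a centred normal gives an equally weighted mixture of normals centred at the observations, and the formula from the description is then approximated by a Riemann sum. The bandwidth value, the integration grid and the helper `d.smooth` are assumptions chosen for the example.

```r
# Sketch only, in base R; not the package's implementation.
set.seed(1)
x <- rnorm(100)   # (empirical) data
h.smooth <- 0.3   # assumed bandwidth, chosen for illustration only

# Density of the empirical distribution convolved with N(0, h.smooth^2):
# an equally weighted mixture of normals centred at the observations.
d.smooth <- function(t) {
  rowMeans(outer(t, x, function(a, b) dnorm(a, mean = b, sd = h.smooth)))
}

# Formula from the description, approximated by a Riemann sum on a fine grid.
step <- 0.01
grid <- seq(-8, 8, by = step)
d.h <- sum(0.5 * (sqrt(d.smooth(grid)) - sqrt(dnorm(grid)))^2) * step
d.h
```

Without smoothing, the empirical distribution is discrete and therefore mutually singular to any abs. cont. distribution, which is what produces the trivial distance of 1.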

Diagnostics on the involved integrations are available if argument
`diagnostic` is `TRUE`. Then there is attribute `diagnostic`
attached to the return value, which may be inspected
and accessed through `showDiagnostic` and `getDiagnostic`.

##### Value

Hellinger distance of `e1` and `e2`

##### Methods

- e1 = "AbscontDistribution", e2 = "AbscontDistribution":
Hellinger distance of two absolutely continuous univariate distributions which is computed using `distrExIntegrate`.

- e1 = "AbscontDistribution", e2 = "DiscreteDistribution":
Hellinger distance of absolutely continuous and discrete univariate distributions (are mutually singular; i.e., have distance `=1`).

- e1 = "DiscreteDistribution", e2 = "DiscreteDistribution":
Hellinger distance of two discrete univariate distributions which is computed using `support` and `sum`.

- e1 = "DiscreteDistribution", e2 = "AbscontDistribution":
Hellinger distance of discrete and absolutely continuous univariate distributions (are mutually singular; i.e., have distance `=1`).

- e1 = "numeric", e2 = "DiscreteDistribution":
Hellinger distance between (empirical) data and a discrete distribution.

- e1 = "DiscreteDistribution", e2 = "numeric":
Hellinger distance between (empirical) data and a discrete distribution.

- e1 = "numeric", e2 = "AbscontDistribution":
Hellinger distance between (empirical) data and an abs. cont. distribution.

- e1 = "AbscontDistribution", e2 = "numeric":
Hellinger distance between (empirical) data and an abs. cont. distribution.

- e1 = "AcDcLcDistribution", e2 = "AcDcLcDistribution":
Hellinger distance of mixed discrete and absolutely continuous univariate distributions.
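
For the discrete case, the computation over the support can be sketched in base R. This is an illustration of the formula from the description, not the method's source code: `dpois` and `dbinom` stand in for the probability functions of `Pois(10)` and `Binom(size = 20, prob = 0.5)`, and the truncation of the joint support at 200 is an assumption made because the Poisson mass beyond that point is negligible.

```r
# Sketch only, in base R; not the package's implementation.
supp <- 0:200                              # joint support, truncated (assumption)
p <- dpois(supp, lambda = 10)              # probabilities under Pois(10)
q <- dbinom(supp, size = 20, prob = 0.5)   # probabilities under Binom(20, 0.5); 0 above 20

# formula from the description, with the integral replaced by a sum
d.h <- 0.5 * sum((sqrt(p) - sqrt(q))^2)

# equivalent form: one minus the sum of sqrt(p * q) over the joint support
d.h.alt <- 1 - sum(sqrt(p * q))
c(d.h, d.h.alt)
```

Both expressions agree up to rounding because each probability vector sums to one; if the two supports were disjoint, every product `p * q` would vanish and the value would be 1.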

##### References

Huber, P.J. (1981) *Robust Statistics*. New York: Wiley.

Rieder, H. (1994) *Robust Asymptotic Statistics*. New York: Springer.

##### See Also

`distrExIntegrate`, `ContaminationSize`, `TotalVarDist`, `KolmogorovDist`, `Distribution-class`

##### Examples

```
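library(distrEx)  # distrEx depends on distr, which provides Norm(), Td(), Binom(), Pois()

# abs. cont. distribution vs. mixture of abs. cont. distributions
HellingerDist(Norm(), UnivarMixingDistribution(Norm(1,2), Norm(0.5,3),
                                               mixCoeff = c(0.2,0.8)))
HellingerDist(Norm(), Td(10))

# abs. cont. vs. discrete distribution
HellingerDist(Norm(mean = 50, sd = sqrt(25)), Binom(size = 100)) # mutually singular

# discrete vs. discrete distribution
HellingerDist(Pois(10), Binom(size = 20))

# (empirical) data vs. abs. cont. distribution
x <- rnorm(100)
HellingerDist(Norm(), x)  # default: asis.smooth.discretize = "discretize"
HellingerDist(x, Norm(), asis.smooth.discretize = "smooth")
y <- (rbinom(50, size = 20, prob = 0.5)-10)/sqrt(5)  # standardized binomial sample
HellingerDist(y, Norm())
HellingerDist(y, Norm(), asis.smooth.discretize = "smooth")

# (empirical) data vs. discrete distribution
HellingerDist(rbinom(50, size = 20, prob = 0.5), Binom(size = 20, prob = 0.5))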
```

*Documentation reproduced from package distrEx, version 2.8.0, License: LGPL-3*