get_hdr_1d: Computing the highest density regions of a 1D density

Description

get_hdr_1d is used to estimate a 1-dimensional density and compute corresponding HDRs. The estimated density and HDRs are represented in a discrete form as a grid, defined by arguments range and n. get_hdr_1d is used internally by layer functions stat_hdr_rug() and stat_hdr_rug_fun().

Usage

get_hdr_1d(
  x = NULL,
  method = "kde",
  probs = c(0.99, 0.95, 0.8, 0.5),
  n = 512,
  range = NULL,
  hdr_membership = TRUE,
  fun,
  args = list()
)

Value

get_hdr_1d returns a list with elements df_est (data.frame), breaks (named numeric), and data (data.frame).

df_est: the estimated HDRs and density evaluated on the grid defined by range and n. The column of estimated HDRs (df_est$hdr) is a numeric vector with values from probs. The columns df_est$fhat and df_est$fhat_discretized correspond to the estimated density on the original scale and rescaled to sum to 1, respectively.
breaks: the heights of the estimated density (df_est$fhat) corresponding to the HDRs specified by probs. Always includes an additional element, Inf, representing the cutoff for the 100% HDR.
data: the original data provided in the x argument. If hdr_membership is set to TRUE, this includes a column (data$hdr_membership) with the HDR corresponding to each data point.
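
To make the relationship between fhat, fhat_discretized, breaks, and hdr more concrete, here is a minimal base R sketch of how HDR cutoffs can be derived from a density evaluated on a grid. It is an illustration under stated assumptions, not necessarily how get_hdr_1d() is implemented: it uses stats::density() as the estimator, the helper find_cutoff() is hypothetical, and it assigns a cutoff of 0 to the region outside every requested HDR as its own convention.

```r
# Sketch only: HDR cutoffs from a gridded density, using base R.
set.seed(1)
x <- rnorm(1000)
probs <- c(0.99, 0.95, 0.8, 0.5)

# Evaluate a kernel density estimate on a grid of n = 512 points
est <- density(x, n = 512)
fhat <- est$y

# Rescale the grid values so that they sum to 1
fhat_discretized <- fhat / sum(fhat)

# Visit grid points from highest to lowest density and accumulate mass;
# the cutoff for probability p is the height at which the mass first reaches p
ord <- order(fhat, decreasing = TRUE)
cum_mass <- cumsum(fhat_discretized[ord])
find_cutoff <- function(p) fhat[ord][which(cum_mass >= p)[1]]  # hypothetical helper
breaks <- vapply(probs, find_cutoff, numeric(1))
names(breaks) <- paste0(probs * 100, "%")

# Label each grid point with the smallest HDR containing it
# (1 marks points outside every requested region; a convention of this sketch)
probs_all <- c(probs, 1)
breaks_all <- c(breaks, 0)
hdr <- vapply(fhat, function(h) min(probs_all[h >= breaks_all]), numeric(1))

breaks
table(hdr)
```

Larger probabilities correspond to lower density cutoffs, so the 99% region contains the 95% region, and so on.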

Arguments

x: A vector of data.
method: Either a character ("kde", "norm", "histogram", "freqpoly", or "fun") or method_*_1d() function. See the "The method argument" section below for details.
probs: Probabilities to compute HDRs for.
n: Resolution of grid representing estimated density and HDRs.
range: Range of grid representing estimated density and HDRs.
hdr_membership: Should HDR membership of data points (x) be computed?
fun: Optional, a probability density function, must be vectorized in its first argument. See the "The fun argument" section below for details.
args: Optional, a list of arguments to be provided to fun.

The method argument

The density estimator used to estimate the HDRs is specified with the method argument. The simplest way to specify an estimator is to provide a character value to method, for example method = "kde" specifies a kernel density estimator. However, this specification is limited to the default behavior of the estimator.

Instead, it is possible to provide a function call, for example method = method_kde_1d(). Note the _1d suffix, which distinguishes these from the function calls provided to get_hdr(). In many cases, these functions accept parameters governing the density estimation procedure. Here, method_kde_1d() accepts several parameters related to the choice of kernel. For details, see ?method_kde_1d. Every implemented method of univariate density estimation has a corresponding method_*_1d() function, each with an associated help page.

Note: geom_hdr_rug() and other layer functions also have method arguments which behave in the same way. For more details on the use and implementation of the method_*_1d() functions, see vignette("method", "ggdensity").

The fun argument

If method is set to "fun", get_hdr_1d() expects a univariate probability density function to be specified with the fun argument. It is required that fun be a function of at least one argument (x). Beyond this first argument, fun can have arbitrarily many arguments; these can be set in get_hdr_1d() as a named list via the args parameter.

Note: get_hdr_1d() requires that fun be vectorized in x. For an example of an appropriate choice of fun, see the final example below.
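
As a rough illustration of the vectorization requirement, the sketch below contrasts a density written for a single value with two vectorized versions. The names f_scalar, f_vec, and f_wrapped are hypothetical, and the exponential density is chosen only as an example. The call to get_hdr_1d() is guarded so that it runs only if ggdensity is installed.

```r
# Not vectorized: `if` handles only a single value of x
f_scalar <- function(x, rate = 1) if (x < 0) 0 else rate * exp(-rate * x)

# Vectorized in x: ifelse() works elementwise
f_vec <- function(x, rate = 1) ifelse(x < 0, 0, rate * exp(-rate * x))

# Alternatively, wrap the scalar version with Vectorize()
f_wrapped <- Vectorize(f_scalar, vectorize.args = "x")

# Both vectorized versions return one density value per input
grid <- seq(-1, 5, length.out = 7)
stopifnot(length(f_vec(grid)) == length(grid))
stopifnot(isTRUE(all.equal(f_vec(grid, rate = 2), f_wrapped(grid, rate = 2))))

# Extra parameters of fun are passed through `args` as a named list
if (requireNamespace("ggdensity", quietly = TRUE)) {
  res <- ggdensity::get_hdr_1d(
    method = "fun", fun = f_vec, range = c(-1, 5), args = list(rate = 2)
  )
  str(res$breaks)
}
```

Wrapping with Vectorize() is convenient for quick experiments, while a natively vectorized function such as f_vec is usually faster on a large grid.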

Examples

x <- rnorm(1e3)

# Two ways to specify `method`
get_hdr_1d(x, method = "kde")
get_hdr_1d(x, method = method_kde_1d())

if (FALSE) {

# If parentheses are omitted, `get_hdr_1d()` errors
get_hdr_1d(x, method = method_kde_1d)

# If the `_1d` suffix is omitted, `get_hdr_1d()` errors
get_hdr_1d(x, method = method_kde())
}

# Adjust estimator parameters with arguments to `method_kde_1d()`
get_hdr_1d(x, method = method_kde_1d(kernel = "triangular"))

# Estimate different HDRs with `probs`
get_hdr_1d(x, method = method_kde_1d(), probs = c(.975, .6, .2))

# Compute "population" HDRs of specified univariate pdf with `method = "fun"`
f <- function(x, sd = 1) dnorm(x, sd = sd)
get_hdr_1d(method = "fun", fun = f, range = c(-5, 5))
get_hdr_1d(method = "fun", fun = f, range = c(-5, 5), args = list(sd = .5))
