forder: Functional ordering

Description

Calculates different measures for ordering the functions (or vectors) from the most extreme to the least extreme one.

Usage

forder(
  curve_sets,
  measure = "erl",
  scaling = "qdir",
  alternative = c("two.sided", "less", "greater"),
  use_theo = TRUE,
  probs = c(0.025, 0.975),
  quantile.type = 7,
  r_min = NULL,
  r_max = NULL
)

Arguments

curve_sets

A curve_set object or a list of curve_set objects.

measure

The measure to use to order the functions from the most extreme to the least extreme one. Must be one of the following: 'rank', 'erl', 'cont', 'area', 'max', 'int', 'int2'. Default is 'erl'.

scaling

The name of the scaling to use if measure is 'max', 'int' or 'int2'. Options include 'none', 'q', 'qdir' and 'st', where 'qdir' is the default.

alternative

A character string specifying the alternative hypothesis. Must be one of the following: "two.sided" (default), "less" or "greater". The last two options are only available for the measures 'rank', 'erl', 'cont' and 'area'.

use_theo

Logical. When calculating the measures 'max', 'int' or 'int2', should the theoretical function from the curve_set be used (if 'theo' is provided)? See deviation_test.

probs

A two-element vector containing the lower and upper quantiles for the scaling 'q' or 'qdir', in that order and on the interval [0, 1]. The default values are 0.025 and 0.975, suggested by Myllymäki et al. (2015, 2017).

quantile.type

Passed as the type argument of quantile; determines how the quantiles for the scalings 'q' and 'qdir' are calculated.
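
The roles of probs and quantile.type can be illustrated with base R's quantile() alone. The sketch below assumes that the quantiles are taken pointwise, that is, separately at each argument value across the curves; the matrix sims is made-up data, and the way forder uses these quantiles for scaling is described in deviation_test and not reproduced here.

```r
# Made-up data: 5 argument values (rows) x 200 curves (columns).
set.seed(1)
sims <- matrix(rnorm(5 * 200), nrow = 5)

# Lower and upper pointwise quantiles, using the same values as the
# defaults of forder (probs = c(0.025, 0.975), quantile.type = 7).
q <- apply(sims, 1, quantile, probs = c(0.025, 0.975), type = 7)

# q has one column per argument value; row 1 holds the lower and
# row 2 the upper quantile.
dim(q)
```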

r_min

The minimum radius to include.

r_max

The maximum radius to include.

Value

A vector containing one of the above-mentioned measures k for each of the functions in the curve set. If the component obs in the curve set is a vector, then its measure will be the first component (named 'obs') in the returned vector.
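
Because the rank-type measures ('rank', 'erl', 'cont', 'area') are smallest for the most extreme functions while the deviation measures ('max', 'int', 'int2') are largest for them, the sort direction must match the measure when the returned vector is turned into an ordering. The following is a minimal sketch; most_extreme_first is a hypothetical helper written for this illustration and is not part of GET, and the curve set in the guarded part is simulated noise.

```r
# Hypothetical helper (not part of GET): indices of the curves,
# most extreme first, given the vector returned by forder().
most_extreme_first <- function(values, measure) {
  rank_type <- c("rank", "erl", "cont", "area")
  # Rank-type measures: ascending order; deviation measures: descending.
  order(values, decreasing = !(measure %in% rank_type))
}

# Stand-alone illustration with made-up measure values.
idx_erl <- most_extreme_first(c(0.50, 0.05, 0.90), "erl")  # smallest first
idx_max <- most_extreme_first(c(1.2, 3.4, 0.7), "max")     # largest first

# The same helper applied to forder() output, if GET is installed.
if (requireNamespace("GET", quietly = TRUE)) {
  set.seed(1)
  cset <- GET::create_curve_set(list(r = 1:10,
                                     obs = matrix(rnorm(10 * 20), nrow = 10)))
  most_extreme_first(GET::forder(cset, measure = "erl"), "erl")[1:5]
}
```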

Details

Given a curve_set (see create_curve_set for how to create such an object) or an envelope object, which contains both the data curve (or function or vector) $T_1(r)$ and the simulated curves $T_2(r),\dots,T_{s+1}(r)$, the functions are ordered from the most extreme one to the least extreme one by one of the following measures (specified by the argument measure). Note that 'erl', 'cont' and 'area' were proposed as a refinement to the extreme ranks 'rank', because the extreme ranks can contain many ties. All of these completely non-parametric measures are smallest for the most extreme functions and largest for the least extreme ones, whereas the deviation measures ('max', 'int' and 'int2') obtain largest values for the most extreme functions.

'rank': extreme rank (Myllymäki et al., 2017). The extreme rank $R_i$ is defined as the minimum of pointwise ranks of the curve $T_i(r)$, where the pointwise rank is the rank of the value of the curve for a specific r-value among the corresponding values of the s other curves such that the lowest ranks correspond to the most extreme values of the curves. How the pointwise ranks are determined exactly depends on whether a one-sided test (alternative is "less" or "greater") or the two-sided test (alternative="two.sided") is chosen; for details see Mrkvička et al. (2017, page 1241) or Mrkvička et al. (2018, page 6).
'erl': extreme rank length (Myllymäki et al., 2017). Considering the vector of pointwise ordered ranks $\mathbf{R}_i$ of the ith curve, the extreme rank length measure $R_i^{erl}$ is equal to $$R_i^{erl} = \frac{1}{s+1}\sum_{j=1}^{s+1} \mathbf{1}(\mathbf{R}_j \prec \mathbf{R}_i)$$ where $\mathbf{R}_j \prec \mathbf{R}_i$ if and only if there exists $n\leq d$ (d being the length of the rank vectors, i.e. the number of argument values r) such that the first k pointwise ordered ranks of $\mathbf{R}_j$ and $\mathbf{R}_i$ are equal for all $k<n$ and the n'th rank of $\mathbf{R}_j$ is smaller than that of $\mathbf{R}_i$.
'cont': continuous rank (Hahn, 2015; Mrkvička et al., 2019), based on the minimum of continuous pointwise ranks.
'area': area rank (Mrkvička et al., 2019), based on the area between continuous pointwise ranks and minimum pointwise ranks for those argument (r) values for which the pointwise ranks achieve the minimum (it is a combination of 'erl' and 'cont').
'max', 'int' and 'int2': further options for the measure argument that can be used together with scaling. See the help of deviation_test for these options of measure and scaling. These measures are largest for the most extreme functions and smallest for the least extreme ones. The arguments use_theo and probs are relevant for these measures only (otherwise ignored).
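
The definitions of 'rank' and 'erl' above can be traced on a toy example in base R. This is only a sketch that follows the two-sided definitions literally: the data and the helper names pointwise_ranks and lex_less are made up for the illustration, ties are given average ranks as an assumption, and the internal implementation in forder may differ in tie handling and normalisation.

```r
# Toy data: 3 argument values (rows) x 4 curves (columns).
curves <- cbind(c1 = c(1, 2, 3), c2 = c(2, 3, 4),
                c3 = c(3, 4, 5), c4 = c(10, 1, 6))

# Two-sided pointwise ranks: at each r, rank from below and from above
# and keep the smaller of the two, so that 1 marks the most extreme value.
pointwise_ranks <- function(x) {
  lo <- t(apply(x, 1, rank, ties.method = "average"))
  hi <- t(apply(-x, 1, rank, ties.method = "average"))
  pmin(lo, hi)
}
R <- pointwise_ranks(curves)

# Extreme rank: the minimum pointwise rank of each curve.
extreme_rank <- apply(R, 2, min)
# c1 = 1, c2 = 2, c3 = 1, c4 = 1: three curves are tied.

# Pointwise ordered ranks: each column sorted from smallest to largest.
S <- apply(R, 2, sort)

# TRUE if vector a precedes vector b in the lexicographic ordering.
lex_less <- function(a, b) {
  k <- which(a != b)
  length(k) > 0 && a[k[1]] < b[k[1]]
}

# Extreme rank length: proportion of curves strictly preceding curve i.
n <- ncol(S)
erl <- sapply(seq_len(n), function(i)
  mean(sapply(seq_len(n), function(j) lex_less(S[, j], S[, i]))))
names(erl) <- colnames(curves)
# c1 = 0.25, c2 = 0.75, c3 = 0.50, c4 = 0: the ties are broken.
```

In this toy example the extreme rank cannot separate c1, c3 and c4, whereas the extreme rank length orders them as c4, c1, c3, which is the refinement that motivates 'erl'.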

References

Hahn, U. (2015). A note on simultaneous Monte Carlo tests. Technical report, Centre for Stochastic Geometry and advanced Bioimaging, Aarhus University.

Mrkvička, T., Hahn, U. and Myllymäki, M. (2018). A one-way ANOVA test for functional data with graphical interpretation. arXiv:1612.03608 [stat.ME]

Mrkvička, T., Myllymäki, M. and Narisetty, N. N. (2019). New methods for multiple testing in permutation inference for the general linear model. arXiv:1906.09004 [stat.ME]

Myllymäki, M., Grabarnik, P., Seijo, H. and Stoyan, D. (2015). Deviation test construction and power comparison for marked spatial point patterns. Spatial Statistics 11: 19–34. doi: 10.1016/j.spasta.2014.11.004

Myllymäki, M., Mrkvička, T., Grabarnik, P., Seijo, H. and Hahn, U. (2017). Global envelope tests for spatial point patterns. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79: 381–404. doi: 10.1111/rssb.12172

Examples

# NOT RUN {
if(requireNamespace("fda", quietly = TRUE)) {
  # Consider ordering the girls in the Berkeley Growth Study data,
  # available from the R package fda (see ?growth), according to their
  # annual heights and/or changes within years.
  # First create sets of curves (vectors), for raw heights and
  # for the differences within the years
  years <- paste(1:18)
  curves <- fda::growth[['hgtf']][years,]
  cset1 <- create_curve_set(list(r = as.numeric(years),
                                 obs = curves))
  plot(cset1, ylab="Height")
  cset2 <- create_curve_set(list(r = as.numeric(years[-1]),
                                 obs = curves[-1,] - curves[-nrow(curves),]))
  plot(cset2)

  # Order the girls from the most extreme one to the least extreme one,
  # below using the 'area' measure
  # a) according to their heights
  forder(cset1, measure = 'area')
  # Print the 10 most extreme girl indices
  order(forder(cset1, measure = 'area'))[1:10]
  # b) according to the changes (print indices)
  order(forder(cset2, measure = 'area'))[1:10]
  # c) simultaneously with respect to heights and changes (print indices)
  csets <- list(Height = cset1, Change = cset2)
  order(forder(csets, measure = 'area'))[1:10]
}
# }
