Learn R Programming

GET (version 0.1-3)

central_region: Central region / Global envelope

Description

Provides central regions or global envelopes or confidence bands

Usage

central_region(curve_sets, type = "erl", coverage = 0.5,
  alternative = c("two.sided", "less", "greater"), probs = c((1 -
  coverage)/2, 1 - (1 - coverage)/2), quantile.type = 7,
  central = "median", nstep = 2, ...)

Arguments

curve_sets

A curve_set object or a list of curve_set objects.

type

The type of the global envelope with current options for 'rank', 'erl', 'cont', 'area', 'qdir', 'st' and 'unscaled'. See details.

coverage

A number between 0 and 1. The 100*coverage% central region will be calculated.

alternative

A character string specifying the alternative hypothesis. Must be one of the following: "two.sided" (default), "less" or "greater". The last two options only available for types 'rank', 'erl', 'cont' and 'area'.

probs

A two-element vector containing the lower and upper quantiles for the measure 'q' or 'qdir', in that order and on the interval [0, 1]. The default values are 0.025 and 0.975, suggested by Myllym<U+00E4>ki et al. (2015, 2017).

quantile.type

As type argument of quantile, how to calculate quantiles for 'q' or 'qdir'.

central

Either "mean" or "median". If the curve sets do not contain the component theo for the theoretical central function, then the central function (used for plotting only) is calculated either as the mean or median of functions provided in the curve sets.

nstep

1 or 2 for how to contruct a combined global envelope if list of curve sets is provided. 2 (default) for a two-step combining procedure, 1 for one-step.

...

Additional parameters to be passed to forder.

Value

Either an object of class "global_envelope" and "fv" (see fv.object), or an "combined_global_envelope" for the case that curve_sets is a list of objects. The objects can be printed and plotted directly.

The "global_envelope" object is essentially a data frame containing columns

  • r = the vector of values of the argument r at which the test was made

  • obs = the data function, if there is only one data function. Otherwise not existing.

  • lo = the lower envelope based on the simulated functions

  • hi = the upper envelope based on the simulated functions

  • central = If the curve_set (or envelope object) contains a component 'theo', then this function is used as the central curve and returned in this component. Otherwise, the central_curve is the mean of the test functions T_i(r), i=2, ..., s+1. Used for visualization only.

Additionally, the return value has attributes

  • method = The name of the envelope test used for plotting purposes ("Global envelope")

  • alternative = The alternative specified in the function call.

  • ties = As the argument ties.

  • k_alpha = The value of k corresponding to the 100(1-alpha)% global envelope.

  • k = The values of the chosen measure for all the functions. If there is only one observed function, then k[1] will give the value of the measure for this.

  • call = The call of the function.

and a punch of attributes for the "fv" object type, see fv. Attributes of an object res can be obtained using the function attr, e.g. attr(res, "k") for the values of the ordering measure.

The "combined_global_envelope" is a list of "global_envelope" objects corresponding to the components of curve_sets. The second level envelope on which the envelope construction is based on is saved in the attribute "level2_ge".

Details

Given a curve_set (see create_curve_set for how to create such an object) or an envelope object, the function central_region construcst a central region, i.e. a global envelope, from the given set of functions (or vectors). There are two options for the functions that the curve_set can contain:

  • If the component obs of the curve_set is a matrix, then it is assumed that all the functions are data/observed. In this case, the component sim_m of the curve_set (which can be then NULL) is ignored and the central region constructed from the functions given in obs.

  • If the component obs is a vector, then sim_m should be provided as well and it is assumed to contain simulated functions (obtained, e.g., from some model or by permutation). The central region is calculated from all the functions.

Thus the curve_set contains functions (or vectors) \(T_1(r),\dots,T_s(r)\). In the case of one observed function only, the data function is considered to be \(T_1(r)\).

Generally an envelope is a band bounded by the vectors (or functions) \(T_{low}\) and \(T_{hi}\). A \(100(1-\alpha)\)% or 100*coverage% global envelope is a set \((T_{low}, T_{hi})\) of envelope vectors such that the probability that \(T_i\) falls outside this envelope in any of the d points of the vector \(T_i\) is less or equal to \(\alpha\). The global envelopes can be constructed based on different measures that order the functions from the most extreme one to the least extreme one. We use such orderings of the functions for which we are able to construct global envelopes with intrinsic graphical interpretation.

The type of the global envelope can be chosen with the argument type and the options are given in the following. Further information about the measures, on which the global envelopes are based, can be found in forder.

  • 'rank': The global rank envelope proposed by Myllym<U+00E4>ki et al. (2017) based on the extreme rank defined as the minimum of pointwise ranks.

  • 'erl': The global rank envelope based on the extreme rank length (Myllym<U+00E4>ki et al.,2017, Mrkvi<U+010D>ka et al., 2018). This envelope is constructed as the convex hull of the functions which have extreme rank length measure that is larger or equal to the critical \(\alpha\) level of the extreme rank length measure.

  • 'cont': The global rank envelope based on the continuous rank (Hahn, 2015; Mrkvi<U+010D>ka et al., 2019) based on minimum of continuous pointwise ranks. It is contructed as the convex hull in a similar way as the 'erl' envelope.

  • 'area': The global rank envelope based on the area rank (Mrkvi<U+010D>ka et al., 2019) which is based on area between continuous pointwise ranks and minimum pointwise ranks for those argument (r) values for which pointwise ranks achieve the minimum (it is a combination of erl and cont). It is contructed as the convex hull in a similar way as the 'erl' and 'area' envelopes.

  • 'qdir': The directional quantile envelope based on the directional quantile maximum absolute deviation (MAD) test (Myllym<U+00E4>ki et al., 2017, 2015), which takes into account the unequal variances of the test function T(r) for different distances r and is also protected against asymmetry of distribution of T(r).

  • 'st': The studentised envelope based on the studentised MAD measure (Myllym<U+00E4>ki et al., 2017, 2015), which takes into account the unequal variances of the test function T(r) for different distances r.

  • 'unscaled': The unscaled envelope (Ripley, 1981), which leads to envelopes with constant width. It corresponds to the classical maximum deviation test without scaling. This test suffers from unequal variance of T(r) over the distances r and from the asymmetry of distribution of T(r). We recommend to use the other alternatives instead. This unscaled global envelope is provided for reference.

For each curve in the curve_set, both the data curve and the simulations, an above mention measure k is determined. The measure values \(k_1, k_2, ..., k_s\) are returned in the attribute 'k' (in a case of one observed curve only, k[1] is its value). Based on the chosen measure, the central region, i.e. the global envelope, is constructed on the chosen interval of argument values (the functions in the curve_set are assumed to be given on this interval only).

If a list of (suitable) objects are provided in the argument curve_sets, then by default (nstep = 2) the two-step combining procedure is used to construct the combined global envelope as described in Myllym<U+00E4>ki and Mrkvi<U+010D>ka (2019). If nstep = 1 and the lengths of the multivariate vectors in each component of the list are equal, then the one-step combining procedure is used where the functions are concatenated together into a one long vector.

References

Mrkvi<U+010D>ka, T., Hahn, U. and Myllym<U+00E4>ki, M. (2018). A one-way ANOVA test for functional data with graphical interpretation. arXiv:1612.03608 [stat.ME]

Mrkvi<U+010D>ka, T., Myllym<U+00E4>ki, M. and Narisetty, N. N. (2019) New methods for multiple testing in permutation inference for the general linear model. arXiv:1906.09004 [stat.ME]

Myllym<U+00E4>ki, M., Grabarnik, P., Seijo, H. and Stoyan. D. (2015). Deviation test construction and power comparison for marked spatial point patterns. Spatial Statistics 11: 19-34. doi: 10.1016/j.spasta.2014.11.004

Myllym<U+00E4>ki, M., Mrkvi<U+010D>ka, T., Grabarnik, P., Seijo, H. and Hahn, U. (2017). Global envelope tests for spatial point patterns. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79: 381<U+2013>404. doi: 10.1111/rssb.12172

Myllym<U+00E4>ki, M. and Mrkvi<U+010D>ka, T. Global envelopes in R.

Ripley, B.D. (1981). Spatial statistics. Wiley, New Jersey.

See Also

global_envelope_test

Examples

Run this code
# NOT RUN {
## A central region of a set of functions
#----------------------------------------
if(requireNamespace("fda", quietly = TRUE)) {
  curve_set <- create_curve_set(list(r=as.numeric(row.names(fda::growth$hgtf)),
                                     obs=fda::growth$hgtf))
  plot(curve_set, ylab="height")
  cr <- central_region(curve_set, coverage=0.50, type="erl")
  plot(cr, main="50% central region")
}

## Confidence bands for linear or polynomial regression
#------------------------------------------------------
# Simulate regression data according to the cubic model
# f(x) = 0.8x - 1.8x^2 + 1.05x^3 for x in [0,1]
par <- c(0,0.8,-1.8,1.05) # Parameters of the true polynomial model
res <- 100 # Resolution
x <- seq(0, 1, by=1/res); x2=x^2; x3=x^3;
f <- par[1] + par[2]*x + par[3]*x^2 + par[4]*x^3 # The true function
d <- f + rnorm(length(x), 0, 0.04) # Data
# Plot the true function and data
plot(f, type="l", ylim=range(d))
points(d)

# Estimate polynomial regression model
reg <- lm(d ~ x + x2 + x3)
ftheta <- reg$fitted.values
resid0 <- reg$residuals
s0 <- sd(resid0)

# Bootstrap regression
# }
# NOT RUN {
B <- 2000 # Number of bootstrap samples
# }
# NOT RUN {
ftheta1 <- array(0, c(B,length(x)))
s1 <- array(0,B)
for(i in 1:B) {
  u <- sample(resid0, size=length(resid0), replace=TRUE)
  reg1 <- lm((ftheta+u) ~ x + x2 + x3)
  ftheta1[i,] <- reg1$fitted.values
  s1[i] <- sd(reg1$residuals)
}

# Centering and scaling
meanftheta <- apply(ftheta1, 2, mean)
m <- array(0, c(B,length(x)))
for(i in 1:B) { m[i,] <- (ftheta1[i,]-meanftheta)/s1[i] }

# Central region computation
boot.cset <- create_curve_set(list(r=1:length(x), obs=t(m)))
cr <- central_region(boot.cset, coverage=0.95, type="erl")
CB.lo <- ftheta+s0*cr$lo
CB.hi <- ftheta+s0*cr$hi

# Plotting the result
plot(d, ylab="f(x)", xaxt="n", xlab="x", main="95% central region")
axis(1, at=(0:5)*20, labels=(0:5)/5)
lines(ftheta)
lines(CB.lo, lty=2)
lines(CB.hi, lty=2)
# }

Run the code above in your browser using DataLab