two_sample_test: Two-Sample Permutation Test

Description

This function carries out a hypothesis test in which the null hypothesis is that the two samples are governed by the same underlying generative probability distribution, against the alternative hypothesis that they are governed by two separate generative probability distributions.

Usage

two_sample_test(
  x,
  y,
  statistic = stat_hotelling,
  B = 1000L,
  alternative = "right_tail",
  combine_with = "tippett",
  type = "exact",
  seed = NULL
)

Arguments

x

A list or matrix representing the first sample.

y

A list or matrix representing the second sample.

statistic

A character vector specifying the chosen test statistic(s). These can be stat_hotelling or user-specified functions that define the desired statistics. See the section User-supplied statistic function for more information on how these user-supplied functions should be structured for compatibility with the flipr framework. Default is stat_hotelling.

B

The number of sampled permutations. Default is 1000L.

alternative

A string specifying whether the p-value is right-tailed, left-tailed or two-tailed. Choices are "right_tail", "left_tail" and "two_tail". Default is "right_tail". Note that if the test statistic supplied in argument statistic is always positive, all alternatives lead to the two-tailed p-value.

combine_with

A string specifying the combining function to be used to compute the single test statistic value from the set of p-value estimates obtained during the non-parametric combination testing procedure. Default is "tippett", which picks Tippett's function.

type

A string specifying whether to perform an exact test, using the Phipson-Smyth estimate of the p-value, or an approximate test, using a Monte Carlo estimate of the p-value. Default is "exact".

seed

An integer specifying the seed of the random number generator, useful for reproducing results or comparing methods. Default is NULL.
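
The difference between the two kinds of p-value estimate mentioned under type can be illustrated with a small base-R sketch. It is not the computation performed by two_sample_test: the helper pvalue_estimates and its argument names are hypothetical, and the add-one form (b + 1) / (B + 1) is only the simpler correction commonly associated with Phipson and Smyth's work, not their full exact formula. The sketch assumes a right-tailed test.

```r
# Hypothetical helper (not part of flipr): two p-value estimates for a
# right-tailed test, given the observed statistic and B permuted statistics.
pvalue_estimates <- function(observed, permuted) {
  B <- length(permuted)
  b <- sum(permuted >= observed)   # permuted statistics at least as extreme
  c(
    monte_carlo = b / B,           # plain proportion; can be exactly 0
    add_one = (b + 1) / (B + 1)    # never 0; counts the observed statistic too
  )
}

# Deterministic stand-in for 999 permuted statistics, all below the observed one
permuted <- seq(-2, 2, length.out = 999)
est <- pvalue_estimates(observed = 3, permuted = permuted)
est
```

Here no permuted statistic reaches the observed value, so the plain Monte Carlo proportion is 0 while the add-one estimate is 1 / 1000.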

Value

A list with three components: the value of the statistic for the original two samples, the p-value of the resulting permutation test and a numeric vector storing the values of the permuted statistics.
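
To make the three components concrete, here is a minimal base-R sketch of a two-sample permutation test that returns a list of the same shape. It is an illustration under stated assumptions, not the flipr implementation: the function perm_test_sketch, the absolute mean-difference statistic and the component names observed and permuted_statistics are hypothetical (only pvalue appears in the examples on this page), and the inputs are assumed to be numeric vectors.

```r
# Hypothetical base-R sketch, NOT the flipr implementation.
perm_test_sketch <- function(x, y, B = 1000L, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  pooled <- c(x, y)
  n1 <- length(x)
  # Illustrative statistic: absolute difference between the sample means
  stat <- function(idx) abs(mean(pooled[idx]) - mean(pooled[-idx]))
  observed <- stat(seq_len(n1))
  # Randomly reassign n1 pooled observations to the first sample, B times
  permuted <- replicate(B, stat(sample(length(pooled), n1)))
  list(
    observed = observed,
    pvalue = (sum(permuted >= observed) + 1) / (B + 1),  # right-tailed, add-one
    permuted_statistics = permuted
  )
}

res <- perm_test_sketch(
  x = c(0.1, -0.4, 0.3, 0.2, -0.1),
  y = c(10.2, 9.7, 10.1, 9.9, 10.4),
  B = 200L,
  seed = 42
)
names(res)
res$pvalue
```

With two clearly separated samples, as here, the p-value should be small because very few random relabelings reproduce a mean difference as large as the observed one.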

User-supplied statistic function

A user-specified function should have at least two arguments:

the first argument is data which should be a list of the n1 + n2 concatenated observations with the original n1 observations from the first sample on top and the original n2 observations from the second sample below;
the second argument is indices which should be an integer vector giving the indices in data that are considered to belong to the first sample.

See the stat_hotelling function for an example.
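
As a rough illustration of the structure described above, the following sketch defines a statistic with the two required arguments. The name stat_mean_diff is hypothetical, and the sketch assumes that each element of data is a single numeric value; it has not been checked against the flipr internals.

```r
# Hypothetical user-defined statistic (illustrative name, not part of flipr).
# data:    list of the n1 + n2 pooled observations (scalars in this sketch)
# indices: integer vector of positions in data assigned to the first sample
stat_mean_diff <- function(data, indices) {
  values <- unlist(data)
  mean(values[indices]) - mean(values[-indices])
}

# First sample on top (1, 2, 3), second sample below (10, 11, 12)
data <- as.list(c(1, 2, 3, 10, 11, 12))
stat_mean_diff(data, indices = 1:3)
# -9
```

Under the description above, such a function would presumably be supplied through the statistic argument of two_sample_test.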

Examples

# NOT RUN {
library(flipr)

n <- 10L
mx <- 0
sigma <- 1

# Two different models for the two populations
x <- rnorm(n = n, mean = mx, sd = sigma)
delta <- 10
my <- mx + delta
y <- rnorm(n = n, mean = my, sd = sigma)
t1 <- two_sample_test(x, y)
t1$pvalue

# Same model for the two populations
x <- rnorm(n = n, mean = mx, sd = sigma)
delta <- 0
my <- mx + delta
y <- rnorm(n = n, mean = my, sd = sigma)
t2 <- two_sample_test(x, y)
t2$pvalue
# }
