MD2sample (version 1.0.0)

twosample_test: Multivariate Two-Sample Tests

Description

This function runs a number of two-sample tests for multivariate data, using Rcpp and parallel computing.

Usage

twosample_test(
  x,
  y,
  vals_x = NA,
  vals_y = NA,
  TS,
  TSextra,
  B = 5000,
  nbins = c(5, 5),
  minexpcount = 5,
  Ranges = matrix(c(-Inf, Inf, -Inf, Inf), 2, 2),
  DoTransform = TRUE,
  samplingmethod = "Binomial",
  rnull,
  SuppressMessages = FALSE,
  LargeSampleOnly = FALSE,
  maxProcessor,
  doMethods = "all"
)

Value

A list of two numeric vectors: the test statistics and the p-values.

Arguments

x

Continuous data: either a matrix of numbers, or a list with two matrices called x and y. If it is a matrix, observations are in different rows. Discrete data: a vector of counts, or a matrix with columns named vals_x, vals_y, x and y.

y

a matrix of numbers if data is continuous or a vector of counts if data is discrete.

vals_x

=NA, a vector of values for discrete random variables, or NA if data is continuous.

vals_y

=NA, a vector of values for discrete random variables, or NA if data is continuous.

TS

user supplied routine to calculate test statistics for new tests.

TSextra

(optional) additional info passed to TS, if necessary.

B

=5000, number of simulation runs for permutation test.

nbins

=c(5,5), numbers of bins for the chi-square tests (2D only).

minexpcount

=5, lowest required count for chi-square test (2D only).

Ranges

=matrix(c(-Inf, Inf, -Inf, Inf),2,2), a 2x2 matrix with lower and upper bounds (2D only).

DoTransform

=TRUE, should data be transformed to unit hypercube?

samplingmethod

="Binomial" for Binomial sampling or "independence" for independence sampling.

rnull

function to generate new data sets for simulation as an alternative to the permutation method.

SuppressMessages

=FALSE, should informative messages be suppressed?

LargeSampleOnly

=FALSE, should only methods with large sample theories be run?

maxProcessor

number of cores to use. If missing, the number of physical cores minus one is used. If set to 1, no parallel processing is done.

doMethods

="all", which methods should be included?
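
The default described for maxProcessor (physical cores minus one) can be approximated with the base parallel package. This is only a sketch of that rule, not the package's own code; the variable names physical and n_workers, and the fallback to one core when the count cannot be determined, are assumptions made for illustration.

```r
# Sketch: mirror the documented default for maxProcessor
# ("number of physical cores - 1") with base R's parallel package.
physical <- parallel::detectCores(logical = FALSE)

# detectCores() can return NA on some platforms;
# falling back to a single core is an assumption of this sketch.
if (is.na(physical)) physical <- 1L

# Never go below one worker, even on a single-core machine.
n_workers <- max(1L, physical - 1L)
n_workers
```

A value computed this way could be passed explicitly as maxProcessor, for example to keep the core count fixed across machines.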

Details

For details, consult vignette("MD2sample","MD2sample").
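
To illustrate what the argument B controls, here is a minimal base-R sketch of a permutation test for two multivariate samples. It is not the package's implementation: the statistic mean_diff_stat, the helper perm_pvalue and the add-one correction in the p-value are all hypothetical choices made for this example. The interface that twosample_test expects from a user-supplied TS routine is described in the vignette and may differ from the function shown here.

```r
set.seed(1)

# Hypothetical test statistic: squared Euclidean distance
# between the column means of the two samples.
mean_diff_stat <- function(x, y) sum((colMeans(x) - colMeans(y))^2)

# Hypothetical helper: permutation p-value based on B random relabelings.
perm_pvalue <- function(x, y, stat, B = 1000) {
  observed <- stat(x, y)
  pooled <- rbind(x, y)          # pool the rows of both samples
  n <- nrow(x)
  N <- nrow(pooled)
  sims <- numeric(B)
  for (b in seq_len(B)) {
    idx <- sample(N)             # random relabeling of the pooled rows
    sims[b] <- stat(pooled[idx[1:n], , drop = FALSE],
                    pooled[idx[-(1:n)], , drop = FALSE])
  }
  # Add-one correction (a choice of this sketch) keeps the p-value above 0.
  (sum(sims >= observed) + 1) / (B + 1)
}

x <- matrix(rnorm(200), ncol = 2)              # 100 observations in 2D
y <- matrix(rnorm(240, mean = 0.5), ncol = 2)  # 120 observations, shifted mean
p <- perm_pvalue(x, y, mean_diff_stat, B = 500)
p
```

A larger B reduces the simulation error of the estimated p-value at the cost of run time, which is why the examples on this page use B=100 to run quickly.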

Examples

#Two continuous data sets from a multivariate normal:
x = mvtnorm::rmvnorm(100, c(0,0))
y = mvtnorm::rmvnorm(120, c(0,0))
twosample_test(x, y, B=100, maxProcessor=1)
#Supplying a test via TS; this one is a chi-square test included in the package.
#Also enter the data as a list:
TSextra=list(which="statistics", nbins=rbind(c(3,3), c(4,4)))
dta=list(x=x, y=y)
twosample_test(dta, TS=chiTS.cont, TSextra=TSextra, B=100, maxProcessor=1)
#Two discrete data sets from some distribution:
x = table(sample(1:4, size=1000, replace = TRUE))
y = table(sample(1:4, size=1000, replace = TRUE, prob=c(1,2,1,1)))
vals_x=rep(1:2,2)
vals_y=rep(1:2, each=2)
twosample_test(x, y, vals_x, vals_y, B=100, maxProcessor=1)
#Run a discrete chi square test and enter the data as a matrix:
TSextra=list(which="statistics")
dta=cbind(x=x, y=y, vals_x=vals_x, vals_y=vals_y)
twosample_test(dta, TS=chiTS.disc, TSextra=TSextra, B=100, maxProcessor=1)