
testforDEP (version 0.1.0)

testforDEP: Test dependence for two data

Description

This function computes the test statistic, p-value, and confidence interval for dependence between two variables. It supports classic methods (Pearson, Kendall, Spearman) and modern methods (Vexler, Kallenberg, MIC, Hoeffding, CANOVA and Empirical Likelihood tests).

Usage

testforDEP(x = NA, y = NA, data = NA, test, p.opt = "MC", num.MC = 10000, BS.CI = 0, rm.na = FALSE, set.seed = FALSE)

Arguments

x
a numeric vector storing the first variable.
y
a numeric vector storing the second variable.
data
(Optional) a data frame storing the data to be tested.
test
a character string indicating which test to run. Must be one of {"PEARSON", "KENDALL", "SPEARMAN", "VEXLER", "TS2", "V", "MIC", "HOEFFD", "CANOVA", "EL"}.
p.opt
a character string specifying whether the p-value is obtained from the distribution, by Monte Carlo simulation, or from pre-stored tables. Must be "dist", "MC" or "table".
num.MC
a numeric value giving the number of Monte Carlo simulations.
BS.CI
a numeric value specifying alpha for the bootstrap confidence interval. When it equals 0, no confidence interval is computed.
rm.na
a TRUE/FALSE flag indicating whether to remove missing data (NA) from the input.
set.seed
a TRUE/FALSE flag indicating whether to set a seed for the Monte Carlo simulation and bootstrap sampling.
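
To illustrate what a bootstrap confidence interval at level alpha (the role of BS.CI) involves, here is a minimal base-R sketch of a percentile bootstrap for Spearman's correlation. It is an assumption-laden illustration, not the package's implementation: the percentile method, the resample count n_boot, and the simulated data are all hypothetical choices made for this example.

```r
# Percentile bootstrap CI for Spearman's rho (illustrative; base R only).
set.seed(1)                      # fixed seed so the resampling is reproducible
x <- runif(100)
y <- x + rnorm(100, sd = 0.5)    # y depends on x by construction

alpha  <- 0.05                   # plays the role of BS.CI
n_boot <- 2000                   # hypothetical number of bootstrap resamples

boot_stat <- replicate(n_boot, {
  idx <- sample.int(length(x), replace = TRUE)   # resample (x, y) pairs together
  cor(x[idx], y[idx], method = "spearman")
})

# Lower and upper percentile bounds of the bootstrap distribution
ci <- quantile(boot_stat, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
ci
```

Resampling the pairs jointly (one index vector for both x and y) preserves whatever dependence exists between the two variables, which is what the interval is meant to quantify.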

Value

an S4 object of class "testforDEP_result" with the slots: test statistic (TS), p-value (p_value) and confidence interval (CI), if applicable.
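
Because the result is an S4 object, its components are read with the @ operator or slot() rather than with $. The sketch below uses a hypothetical stand-in class, demo_result, that borrows only the documented slot names; the slot types (numeric, numeric, list) are assumptions inferred from the printed example output, not taken from the package source.

```r
library(methods)

# Hypothetical stand-in class with the same slot names as the documented result.
setClass("demo_result",
         representation(TS = "numeric", p_value = "numeric", CI = "list"))

# Values copied from the example output on this page
res <- new("demo_result", TS = 59.54311, p_value = 0.6735326, CI = list())

res@TS                 # direct slot access with @
slot(res, "p_value")   # equivalent access by slot name
length(res@CI) == 0    # an empty list when no confidence interval was requested
```

The same access pattern should apply to an actual "testforDEP_result" object returned by testforDEP().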

Details

The arguments x, y and data are two alternative ways to supply input. When x or y is missing, data is used as input; supplying x, y and data together raises an error. data must be a two-column numeric data frame; the order of the columns does not affect the results.

Because the modern tests ("VEXLER", "TS2", "V", "MIC", "HOEFFD", "CANOVA" and "EL") have no continuous probability density function, p.opt = "dist" does not apply to them. For the classic tests, num.MC is ignored when p.opt = "dist". p.opt = "table" interpolates from pre-stored simulated tables; the current version supports this option only for the "VEXLER", "MIC", "HOEFFD" and "EL" tests.

Because the Vexler, MIC and EL tests are more time-consuming, a warning with the estimated execution time is issued when the input size exceeds 100. An input size <= 100 is recommended for Monte Carlo p-values; for input size > 100, use p.opt = "table". num.MC should be an integer between 100 and 10,000 for acceptable computation times.

NA values in the input are not accepted; set rm.na = TRUE to remove them. For more details, see Pearson, Kendall, Spearman, Vexler, Kallenberg, MIC, Hoeffding, CANOVA, EL.
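
To make the difference between a distribution-based and a simulation-based p-value concrete, the following base-R sketch contrasts the two for Spearman's correlation. It swaps in a permutation scheme as the simulation step, which is an assumption: the package may generate its Monte Carlo null distribution differently. The variable names and num_mc value are hypothetical.

```r
# Distribution-based vs. simulation-based p-value for Spearman's rho.
set.seed(123)
x <- runif(100)
y <- runif(100)                  # generated independently of x

# Distribution-based p-value from base R
p_dist <- cor.test(x, y, method = "spearman")$p.value

# Permutation-based Monte Carlo p-value
num_mc <- 1000                   # plays the role of num.MC
obs <- abs(cor(x, y, method = "spearman"))

# Shuffling y breaks any link with x, giving draws under independence
null_stat <- replicate(num_mc, abs(cor(x, sample(y), method = "spearman")))

# Add-one correction keeps the estimate strictly above zero
p_mc <- (1 + sum(null_stat >= obs)) / (num_mc + 1)

c(dist = p_dist, MC = p_mc)
```

The cost of the simulation grows linearly with the number of replicates and with the cost of one statistic evaluation, which is why slower statistics make large num.MC values and large inputs expensive.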

See Also

Technical report: http://sphhp.buffalo.edu/content/dam/sphhp/biostatistics/Documents/techreports/UB-Biostatistics-TR1601.pdf

Examples

library(testforDEP)

# Two independent uniform samples, so no dependence is expected
set.seed(123)
x <- runif(100, 0, 1)
y <- runif(100, 0, 1)

testforDEP(x, y, test = "SPEARMAN", p.opt = "MC",
           num.MC = 10000, BS.CI = 0, set.seed = TRUE)


#An object of class "testforDEP_result"
#Slot "TS":
#[1] 59.54311

#Slot "p_value":
#[1] 0.6735326

#Slot "CI":
#list()
