Emcdf (version 0.1.2)

emcdf: Computes multivariate empirical joint distribution

Description

Computes the empirical joint cumulative distribution function (joint CDF) of multivariate data, using either a single thread or multiple threads.

Usage

emcdf(data, a)

Arguments

data

A numeric matrix holding the data (one column per variable), or an S4 object of class "emcdf_obj".

a

A numeric vector or matrix of points at which the CDF is evaluated. When a matrix is given, each row is one evaluation point.

Value

A numeric value or vector containing the value(s) of the empirical joint CDF.

Details

When data is a numeric matrix, the function computes the joint empirical CDF with a single thread. When data is an object of class "emcdf_obj", it computes with multiple threads. The parameter a must have the same length (if a vector) or the same number of columns (if a matrix) as data has columns. Both the single-thread and multi-thread emcdf algorithms are faster than the equivalent computation with the built-in function sum() from base R; see the example for a timing comparison. Initializing threads and splitting the data takes time, although it is a one-time cost. Multi-thread computation is therefore recommended for large data sets and many CDF evaluations, while single-thread computation is faster for small data sets and few CDF evaluations.
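To make the quantity concrete, here is a minimal base-R sketch of what an empirical joint CDF evaluates: the proportion of observations that are less than or equal to the evaluation point in every coordinate. The helper `ecdf_joint_ref` is a hypothetical name introduced for illustration. It is not part of Emcdf and does not reflect how the package computes its result internally; it only mirrors the sum()-based comparison used in the example.

```r
# Hypothetical reference helper (NOT part of the Emcdf package):
# a plain base-R empirical joint CDF, written for clarity rather than speed.
ecdf_joint_ref <- function(data, a) {
  data <- as.matrix(data)
  # Treat a plain vector as a single evaluation point (one row).
  if (!is.matrix(a)) a <- matrix(a, nrow = 1)
  # The point(s) must have one coordinate per data column.
  stopifnot(ncol(a) == ncol(data))
  # For each row of a, take the share of observations that are
  # <= the point in every coordinate.
  apply(a, 1, function(pt) {
    mean(rowSums(sweep(data, 2, pt, "<=")) == ncol(data))
  })
}

set.seed(1)
toy <- cbind(x = rnorm(1000), y = rnorm(1000))
pts <- rbind(c(0, 0), c(0.5, 0.5), c(Inf, Inf))
vals <- ecdf_joint_ref(toy, pts)
print(vals)
```

Each row of `pts` yields one CDF value, which matches the shape rule for a described above: the number of columns in the points matrix equals the number of columns in the data.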

Examples

# NOT RUN {
n = 10^6
set.seed(123)
x = rnorm(n)
y = rnorm(n)
z = rnorm(n)
data = cbind(x, y, z)
# The aim is to compute F(0.5, 0.5, 0.5) with three
# approaches and compare their performance.
# To reduce timing noise, the computation is repeated 10 times.

# compute with the R built-in function sum()
sum_time = system.time({
  aws1 = numeric(10)
  for(i in 1:10)
    aws1[i] = sum(x <= 0.5 & y <= 0.5 & z <= 0.5)/n
})[3]

# compute with emcdf single-thread
# each of the 10 rows of a is the same evaluation point (0.5, 0.5, 0.5)
a = matrix(rep(c(0.5, 0.5, 0.5), 10), 10, 3)
single_time = system.time({
   aws2 = emcdf(data, a)
})[3]

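# compute with emcdf multi-thread
# (the one-time setup via initF() is done outside the timed block)
obj = initF(data, 4)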
multi_time = system.time({
   aws3 = emcdf(obj, a)
})[3]
# check that the three approaches give the same values
aws2 == aws1
aws3 == aws1

# compare elapsed times
sum_time
single_time
multi_time

# }
