ccdf (version 1.1.4)

CCDF: Function to compute (un)conditional cumulative distribution function (CDF), used by plot_CCDF function.

Description

Computes the (un)conditional cumulative distribution function (CDF). Used by the plot_CCDF function.

Usage

CCDF(
  Y,
  X,
  Z = NULL,
  method = c("linear regression", "logistic regression", "RF"),
  fast = TRUE,
  space_y = FALSE,
  number_y = length(Y)
)

Arguments

Y

a numeric vector of size n containing the preprocessed expressions from n samples (or cells).

X

a data frame containing numeric or factor vector(s) of size n containing the variable(s) to be tested (the condition(s) to be tested).

Z

a data frame containing numeric or factor vector(s) of size n containing the covariate(s).

method

a character string indicating which method to use to compute the CCDF: 'linear regression', 'logistic regression', or 'RF' (Random Forests). Default is 'linear regression', since it is the method used in the test.

fast

a logical flag indicating whether the fast implementation of logistic regression should be used. Applies only if 'dist_permutations' is specified. Default is TRUE.

space_y

a logical flag indicating whether the y thresholds are evenly spaced. When space_y is TRUE, a regular sequence between the minimum and the maximum of the observations is used. Default is FALSE.

number_y

an integer value indicating the number of y thresholds (and therefore the number of regressions) used to perform the test. Default is length(Y).
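
The roles of space_y and number_y can be illustrated with a base-R sketch. This is not the package's implementation. It assumes that the 'linear regression' method amounts to fitting, at each threshold y, an ordinary least-squares model of the indicator 1{Y <= y} on X, and that the thresholds are the sorted observations when space_y is FALSE. The helper name sketch_ccdf is hypothetical.

```r
# Illustrative only: one regression of the indicator 1{Y <= y} on X per threshold.
# sketch_ccdf() is a hypothetical helper, not part of the ccdf package.
sketch_ccdf <- function(Y, X, space_y = FALSE, number_y = length(Y)) {
  # Thresholds: regular grid if space_y is TRUE; otherwise (assumed) the sorted
  # observations. In this sketch number_y is only used for the regular grid.
  y_thr <- if (space_y) {
    seq(min(Y), max(Y), length.out = number_y)
  } else {
    sort(Y)
  }
  # Unconditional CDF: proportion of observations at or below each threshold
  cdf <- vapply(y_thr, function(y) mean(Y <= y), numeric(1))
  # Conditional CDF: one linear regression per threshold, fitted values per sample
  ccdf <- vapply(y_thr, function(y) {
    ind <- as.numeric(Y <= y)
    fitted(lm(ind ~ X))
  }, numeric(length(Y)))
  # ccdf is a matrix with one row per sample and one column per threshold
  list(y_thr = y_thr, cdf = cdf, ccdf = ccdf)
}

set.seed(1)
X <- factor(rbinom(100, size = 1, prob = 0.5))
Y <- ifelse(X == 1, rnorm(100, 0, 1), rnorm(100, 0.5, 1))
out <- sketch_ccdf(Y, X, space_y = TRUE, number_y = 10)
length(out$y_thr)  # number of thresholds, hence number of regressions
dim(out$ccdf)      # samples by thresholds
```

With a single two-level factor X, the fitted values of each regression coincide with the within-group empirical CDF at that threshold, which is why fewer thresholds (a smaller number_y) trade resolution for fewer model fits.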

Value

A list with the following elements:

  • cdf: a vector of the cumulative distribution function of a given gene.

  • ccdf: a vector of the conditional cumulative distribution function of a given gene, computed given X. Only if Z is NULL.

  • ccdf_nox: a vector of the conditional cumulative distribution function of a given gene, computed given Z only (i.e. X is ignored). Only if Z is not NULL.

  • ccdf_x: a vector of the conditional cumulative distribution function of a given gene, computed given X and Z. Only if Z is not NULL.

  • y_sort: a vector of the sorted expression points at which the CDF and the CCDFs are calculated.

  • x_sort: a vector of the variables associated with y_sort.

  • z_sort: a vector of the covariates associated with y_sort. Only if Z is not NULL.
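
To make the difference between ccdf_nox and ccdf_x concrete, the sketch below evaluates both quantities at a single threshold using base R. It assumes the linear-regression view of the CCDF (regressing the indicator 1{Y <= y} on the conditioning variables). The simulated data and the choice of threshold are illustrative, and this is not the package's own code.

```r
set.seed(2)
n <- 200
Z <- rnorm(n)                                  # covariate
X <- factor(rbinom(n, size = 1, prob = 0.5))   # condition to be tested
Y <- 0.8 * Z + 0.5 * (X == 1) + rnorm(n)       # expression depends on both X and Z

y   <- median(Y)               # a single threshold, chosen for illustration
ind <- as.numeric(Y <= y)      # indicator 1{Y <= y}

cdf_y    <- mean(ind)                 # unconditional CDF at y
ccdf_nox <- fitted(lm(ind ~ Z))       # conditional on Z only (X ignored)
ccdf_x   <- fitted(lm(ind ~ X + Z))   # conditional on X and Z

# Both conditional estimates average back to the unconditional CDF,
# because each least-squares fit includes an intercept.
c(cdf = cdf_y, mean_nox = mean(ccdf_nox), mean_x = mean(ccdf_x))
```

Comparing ccdf_nox with ccdf_x sample by sample shows how much conditioning on X changes the estimate once Z is accounted for. Note that fitted values from a linear model are not constrained to lie in [0, 1].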

Examples

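library(ccdf)
set.seed(123)  # for reproducibility
# Two conditions, encoded as a factor
X <- as.factor(rbinom(n = 100, size = 1, prob = 0.5))
# Expression values: mean 0 when X == 1, mean 0.5 when X == 0
# (draw 100 values per term so that no vector is silently recycled)
Y <- ((X == 1) * rnorm(n = 100, 0, 1)) + ((X == 0) * rnorm(n = 100, 0.5, 1))
res <- CCDF(Y, data.frame(X = X), method = "linear regression")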