aziz.test: Statistical test for heterogeneous effects

Description

Main function for running the statistical test for heterogeneous effects (aberration enrichment). It takes a vector of case/control labels (y) and a vector of numeric measurements (x) to be tested for association with case/control status. For example, in a clinical trial, y can indicate individuals on a drug versus placebo, and x can be the change in a disease severity measurement from baseline. The test returns a p-value indicating drug efficacy and is more powerful than other tests in a heterogeneous-effects setting. Another use is in -omics data, where y indicates disease versus healthy control and x is a gene's expression vector across samples.

Usage

aziz.test(
  y,
  x,
  w = NULL,
  rep = 1e+05,
  doall = FALSE,
  eps = 1e-09,
  unidirectional = 0,
  flatten = 0.5,
  ignoremax = 0,
  normmethod = 1,
  novariance = F,
  conservative = T
)

Arguments

y

A binary vector of sample labels (cases=1, controls=0).

x

A numerical vector. Variable tested for association. Preferably continuous.

w

Default = NULL. Optional numerical vector of weights. A value of 1 means all weights are equal to 1 and only the ordering is considered. If NULL (default), a standardisation of x is used to calculate the weights, giving larger weights to aberrations of larger magnitude.

rep

Default=100000. Number of permutations to be used to calculate p-values.

doall

Default=FALSE. Logical. If TRUE, all rep permutations are performed. If FALSE, only enough permutations are performed to get accurate p-values. Variables that are clearly not associated need only 100 permutations.

eps

Default = 0.000000001. Small numeric value. Standard deviation of the Gaussian noise added to x before ordering samples. In the case of ties, this ensures the ordering is not biased. Adjust lower if x has low variability.
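The tie-breaking idea behind eps can be sketched in base R as follows. This is only an illustration of the description above, not the package's internal code; the variable names x_jittered and ord are assumptions made for the example.

```r
# Illustrative sketch only: break ties with tiny Gaussian noise before ordering.
set.seed(7)
x <- c(1, 1, 1, 2, 2, 3)   # toy data containing ties
eps <- 1e-9                 # same value as the documented default

# Noise with standard deviation eps is far smaller than the gaps between
# distinct values, so it only affects the relative order of tied samples.
x_jittered <- x + rnorm(length(x), mean = 0, sd = eps)
ord <- order(x_jittered)

# Sorting by the jittered values still sorts the original values.
print(x[ord])
```

If x itself varies on a scale close to eps, the noise could reorder genuinely different values, which is why a smaller eps is advised for low-variability data.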

unidirectional

Default = 0. Can be 0, 1 or -1. 0 is for testing both directions of effect. 1 is for testing cases<controls and -1 is for testing cases>controls.

flatten

Default = 0.5. Numeric value recommended between 0 and 1. If weights are not given, we take the max of flatten and the absolute value of the Z-score of x as the weights (Default behavior).

ignoremax

Default=0. Optional value indicating if we should ignore the first few values when selecting the maximal enrichment score. Alternatively, it can be viewed as the minimal size considered for the aberrant interval.

normmethod

Default=1. Used only if w=NULL. With normmethod=1, the weights are generated by standardising x: subtracting the mean and dividing by the standard deviation. If normmethod=2, the median and MAD are used instead, for a better treatment of outliers.
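As a rough sketch of how flatten and normmethod interact when w=NULL, the default weighting described above could be written as below. The helper sketch_weights is hypothetical and not part of the package; in particular, whether the package scales the MAD the way base R's mad() does is an assumption.

```r
# Hypothetical helper mirroring the documented default weighting (not package code).
sketch_weights <- function(x, flatten = 0.5, normmethod = 1) {
  if (normmethod == 2) {
    # Robust standardisation: median and MAD.
    # Assumption: base R's mad(), which applies a 1.4826 scaling constant.
    z <- (x - median(x)) / mad(x)
  } else {
    # Classic Z-score: mean and standard deviation.
    z <- (x - mean(x)) / sd(x)
  }
  # Weights are the larger of `flatten` and |Z|, so no weight falls below `flatten`.
  pmax(flatten, abs(z))
}

set.seed(1)
x <- rnorm(10)
w_default <- sketch_weights(x)
w_robust  <- sketch_weights(x, normmethod = 2)
print(round(w_default, 2))
```

A larger flatten reduces the contrast between samples close to the centre and mild aberrations, since all of them receive the same floor weight.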

novariance

Default=FALSE. aziz.test is able to detect a difference in variance between cases and controls as an association (when the variance of cases is larger than the variance of controls). novariance=TRUE changes this behaviour and penalizes scenarios with outliers going both ways in the cases. This will remove the associations that would usually be picked up by a Levene test. Consider using this with unidirectional testing if variance changes between groups are irrelevant to your problem. Results in loss of power.

conservative

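Default=TRUE. If TRUE, the returned p-value is (b+1)/(1+#permutations), where b is the number of permutations with an enrichment score at least as extreme as the observed one. As described in Phipson and Smyth (2010): "Permutation p-values should never be zero".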
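The conservative correction can be illustrated with a few lines of base R. This is a sketch under stated assumptions: perm_scores and observed are made-up stand-ins for the permutation and observed enrichment scores, and the code does not reproduce how aziz.test computes them.

```r
# Illustrative sketch of the conservative permutation p-value (not package code).
set.seed(42)
perm_scores <- rnorm(999)   # stand-in for max enrichment scores from permutations
observed <- 2.5             # stand-in for the observed max enrichment score

n_perm <- length(perm_scores)
b <- sum(perm_scores >= observed)   # permutations at least as extreme as observed

p_naive        <- b / n_perm              # can be exactly zero
p_conservative <- (b + 1) / (n_perm + 1)  # never zero; smallest value is 1/(n_perm + 1)

print(c(naive = p_naive, conservative = p_conservative))
```

With this correction the smallest reportable p-value is 1/(1+#permutations), so the number of permutations sets the resolution of the test.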

Value

A result object with the following fields: (for clarity use print_summary)

es: Max enrichment score.
pval: Permutation p-value, if permutations were performed.
oddcas: Proportion of cases in the aberrant interval driving the max enrichment score. This is described as the proportion r in the main paper.
direction: direction of the effect. 1: cases<controls, 2: cases>controls.
oddratio: Odds ratio of being in the aberrant interval for cases/controls. Equal to oddcas divided by the same calculation on controls.

Other info fields (can be useful):

esm: Max enrichment score in both directions.
esind: Index of the max enrichment score in both directions. Can also be interpreted as the number of samples in the aberrant interval.
ncas: Number of cases in the aberrant interval.
escurve: A vector of the computed standardized enrichment scores at all positions.
perm: A vector of all max enrichment scores obtained in permutations.

Examples

# NOT RUN {
y = c(rep(1,200),rep(0,200))
x = rnorm(400)

res = aziz.test(y,x,rep=100) #run 100 permutations to calculate p-value
print_summary(res)

#Inducing an aberration enrichment signal by perturbing some of the cases
x[1:20]=x[1:20]-3;
res2 = aziz.test(y,x,rep=100)
print_summary(res2)
# }
