
vbsr (version 0.0.5)

compute_KL: Compute an empirical Kullback-Leibler (KL) divergence for an observed distribution of Z-statistics

Description

This function computes the KL divergence between an observed distribution of Z-statistics and the expected (reference normal) distribution, with both truncated at a given inner percentile of the reference normal distribution.
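To make the idea of a truncated KL divergence concrete, here is a minimal sketch in base R. It is not the vbsr implementation: the helper name truncated_kl, the histogram-based estimate, and the nbins parameter are all assumptions made for illustration. It keeps only the Z-statistics inside the inner alpha fraction of a standard normal, bins them, and compares the bin frequencies with the renormalized normal probabilities of the same bins.

```r
# Hypothetical helper, NOT part of vbsr: a histogram-based estimate of the
# KL divergence between observed Z-statistics and a standard normal,
# restricted to the inner `alpha` fraction of the reference distribution.
truncated_kl <- function(z, alpha = 0.99, nbins = 20) {
  tail_mass <- (1 - alpha) / 2
  lo <- qnorm(tail_mass)          # lower truncation point
  hi <- qnorm(1 - tail_mass)      # upper truncation point
  z_in <- z[z >= lo & z <= hi]    # drop Z-statistics in the tails

  breaks <- seq(lo, hi, length.out = nbins + 1)
  counts <- table(cut(z_in, breaks = breaks, include.lowest = TRUE))

  p <- as.numeric(counts) / sum(counts)  # observed bin probabilities
  q <- diff(pnorm(breaks)) / alpha       # truncated normal bin probabilities

  keep <- p > 0                          # empty bins contribute 0 to the sum
  sum(p[keep] * log(p[keep] / q[keep]))
}

set.seed(1)
kl_null <- truncated_kl(rnorm(5000))          # Z-statistics that match the reference
kl_wide <- truncated_kl(rnorm(5000, sd = 2))  # over-dispersed Z-statistics
```

Under these assumptions kl_null should be close to zero and kl_wide clearly larger. A per-column version for a matrix such as Zmat could be obtained with apply(Zmat, 2, truncated_kl, alpha = 0.99).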

Usage

compute_KL(Zmat,alpha,pval)

Arguments

Zmat
Matrix of Z-statistics output by vbsr. Rows are the covariates in the model, and each column holds the Z-statistics computed at one value of the penalty parameter l0_path.
alpha
The inner percentile of the reference normal distribution to compare to. For example, if alpha=0.99, the KL divergence is computed only over the inner 99% of the reference distribution. Allows for deviations in the tails of the distribution.
pval
The P-value threshold used for marginal pre-screening, if pre-screening was performed before fitting.

Value

  • kl_vec: The observed KL statistic computed along the specified path of l0_path.
  • min_kl: The minimum value of the observed KL statistic.
  • mean_kl: The expected KL statistic, determined by random permutations given the number of covariates being tested and the settings of alpha and pval. Useful for judging, based on the KL statistic, whether the observed distribution is well approximated by a normal distribution at a given setting of l0_path.
  • se_kl: The error in the KL statistics from the random permutations. Useful for determining the range of KL values that is reasonable given the model fits.
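As a rough illustration of how these values might be used together, the sketch below flags the positions along l0_path whose observed KL statistic lies within a band around the permutation-based expectation. The helper kl_in_range, the multiplier k = 2, and all numbers are hypothetical; vbsr does not prescribe this rule.

```r
# Hypothetical helper, NOT part of vbsr: TRUE where the observed KL statistic
# lies within k permutation errors of the expected KL statistic.
kl_in_range <- function(kl_vec, mean_kl, se_kl, k = 2) {
  abs(kl_vec - mean_kl) <= k * se_kl
}

# Made-up numbers for illustration only; in practice these would be taken
# from the kl_vec, mean_kl, and se_kl components returned by compute_KL.
kl_vec <- c(0.30, 0.12, 0.051, 0.048, 0.09)
ok <- kl_in_range(kl_vec, mean_kl = 0.05, se_kl = 0.005)
which(ok)  # positions along l0_path that fall inside the band
```

The choice of k controls how strict the band is; a wider band accepts more settings of l0_path as compatible with the normal reference.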

Details

This is an internal vbsr function. It computes the KL divergence for the Z-statistic distribution output by vbsr when run on a grid of l0_path values. It takes as input the inner quantile over which to compute the KL statistic (alpha) and, if marginal pre-screening was already performed to remove the central part of the Z-statistic distribution, the threshold that was used (pval).
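The link between a P-value threshold and the "central part" of the Z-statistic distribution can be sketched as follows. This assumes a two-sided test against a standard normal, which is an assumption for illustration and not something this page confirms about vbsr; the helper name z_cutoff is likewise hypothetical.

```r
# Hypothetical helper, NOT part of vbsr: the |Z| cutoff implied by a
# two-sided P-value threshold under a standard normal reference.
z_cutoff <- function(pval) {
  qnorm(1 - pval / 2)
}

z_screened <- z_cutoff(0.05)  # covariates with |Z| below this would be screened out
z_none     <- z_cutoff(1)     # a threshold of 1 gives a cutoff of 0: nothing is removed
```

Under this assumption, a threshold of 1 corresponds to no pre-screening, since no Z-statistic has an absolute value below 0.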

References

Logsdon, B.A., C.L. Carty, A.P. Reiner, J.Y. Dai, and C. Kooperberg (2012). A novel variational Bayes multiple locus Z-statistic for genome-wide association studies with Bayesian model averaging. Bioinformatics, Vol. 28(13), 1738-1744.

See Also

vbsr

Examples

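library(vbsr)

# Simulate n samples and m covariates, ntrue of which have nonzero effects
n <- 100
m <- 500
ntrue <- 10
e <- rnorm(n)
X <- matrix(rnorm(n * m), n, m)
tbeta <- sample(1:m, ntrue)
beta <- rep(0, m)
beta[tbeta] <- rnorm(ntrue, 0, 0.3)
y <- X %*% beta + e

# Fit vbsr along a grid of l0_path values, then compute the KL statistics
res <- vbsr(y, X, family = "normal", l0_path = seq(-15, -3, length.out = 100), post = NULL)
klRes <- compute_KL(res$z, 0.01, 1)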
