vbdm (version 0.0.1)

vbdmR: fit a discrete mixture model (R implementation)

Description

Fits a discrete mixture model for rare variant association analysis. Uses an approximate variational Bayes coordinate ascent algorithm for a computationally efficient solution. This is the slow but well-documented R implementation.

Usage

vbdmR(y, G, X=NULL, tol=1e-4, thres=0.05, scaling=TRUE, 
      hyper=c(2,2))

Arguments

y
A vector of continuous phenotypes.
G
A matrix of genotypes or variables of interest.
X
An optional matrix of covariates.
tol
The tolerance for convergence based on the change in the lower bound on the marginal log likelihood in the vbdm algorithm.
thres
If the matrix is of genotypes, then this specifies a minor allele frequency threshold. Variants with a MAF greater than this threshold are excluded from the analysis.
scaling
Whether or not to scale the genotypes to have mean 0 and variance 1.
hyper
The hyperparameters for the prior defined over the mixing probability parameter. The first hyperparameter is the alpha parameter, and the second is the beta parameter.
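
As a rough illustration of what thres and scaling describe, the sketch below filters a simulated genotype matrix by minor allele frequency and then standardizes the kept columns. It is not taken from the vbdm source: the MAF estimate (mean allele count divided by 2, folded to the minor allele), the exclusion of monomorphic variants, and the names freqs, af, maf, G_kept and G_scaled are all assumptions made for this example.

```r
# Hypothetical sketch of MAF filtering and scaling; not the vbdm internals.
set.seed(1)
n <- 500
m <- 10

# Simulate 5 rare variants (frequency 0.01) and 5 common ones (frequency 0.20)
freqs <- rep(c(0.01, 0.20), each = m / 2)
G <- sapply(freqs, function(p) rbinom(n, 2, p))

thres <- 0.05

# Assumed MAF estimate: allele frequency folded onto the minor allele
af  <- colMeans(G) / 2
maf <- pmin(af, 1 - af)

# Keep variants at or below the threshold; drop monomorphic columns,
# which cannot be scaled to unit variance
keep   <- which(maf <= thres & maf > 0)
G_kept <- G[, keep, drop = FALSE]

# scaling=TRUE: centre each column to mean 0 and scale it to variance 1
G_scaled <- scale(G_kept)
```

Here keep plays the role of the keep element described under Value, and G_scaled is the kind of matrix the model would see when scaling=TRUE.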

Value

  • y: The phenotype vector passed to vbdmR.
  • G: The genotype matrix passed to vbdmR. Note that any variables that were dropped will be dropped from this matrix.
  • X: The covariate matrix passed to vbdmR. Will include intercept term if it was added earlier.
  • keep: A vector of indices of the kept variables in G (if any were excluded based on thres).
  • pvec: The vector of estimated posterior probabilities for each variable in G.
  • gamma: A vector of additive covariate effect estimates.
  • theta: The estimated effect of the variables in G.
  • sigma: The estimated error variance.
  • prob: The estimated mixing parameter.
  • lb: The lower bound of the marginal log likelihood.
  • lbnull: The lower bound of the marginal log likelihood under the null model.
  • lrt: The approximate likelihood ratio test based on the lower bounds.
  • p.value: A p-value computed based on lrt with the assumption that lrt~chi^2_1.
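
The relationship between lb, lbnull, lrt and p.value can be sketched as follows. The lower-bound values are made up for illustration, and the form lrt = 2*(lb - lbnull) is an assumption about how the statistic is built from the two bounds. The chi^2_1 reference distribution is the one stated above.

```r
# Hypothetical lower-bound values, chosen only for illustration
lb     <- -1650.2   # lower bound under the fitted model
lbnull <- -1655.3   # lower bound under the null model

# Assumed form of the approximate likelihood ratio statistic
lrt <- 2 * (lb - lbnull)

# Upper-tail probability of a chi-squared distribution with 1 degree of freedom
p.value <- pchisq(lrt, df = 1, lower.tail = FALSE)
```

Because the statistic is built from variational lower bounds rather than exact likelihoods, the resulting p-value is approximate.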

Details

This function is the much slower but well-documented R implementation of the vbdm algorithm. It does not have all of the sanity checks that vbdm has and should therefore be used only for diagnostic purposes.

See Also

vbdm, burdenPlot

Examples

#generate some test data
library(vbdm)
set.seed(3)
n <- 1000
m <- 20
G <- matrix(rbinom(n*m,2,.01),n,m);
beta1 <- rbinom(m,1,.2)
y <- G%*%beta1+rnorm(n,0,1.3)

#compare implementations
res1 <- vbdm(y=y,G=G);
res2 <- vbdmR(y=y,G=G);
T5 <- summary(lm(y~rowSums(scale(G))))$coef[2,4];
cat('vbdm p-value:',res1$p.value,
  '\nvbdmR p-value:',res2$p.value,
  '\nT5 p-value:',T5,'\n')
#vbdm p-value: 0.001345869 
#vbdmR p-value: 0.001345869 
#T5 p-value: 0.9481797
