nebula.chisq.train: Nonparametric empirical Bayes classifier using latent annotations: chi-square test statistics; training

Description

Assumes that chi-square test statistics for each SNP are available from another study. Treats the true control minor allele frequencies, case minor allele frequencies, and chi-square non-centrality parameters as random triples drawn from a trivariate prior distribution G, and estimates the optimal Bayes classifier given G. The nonparametric maximum likelihood estimate of G is used as a plug-in estimator.

Usage

nebula.chisq.train(pi0, pi1, n0, n1, T, d = 25, maxit = 200,
  tol = 1e-04, verbose = FALSE)

Arguments

pi0, pi1

p x 1 vectors of control and case minor allele frequencies, respectively; IMPORTANT: must be relative to the same allele in both cases and controls

n0, n1

number of controls and number of cases, respectively

T

p x 1 vector of chi-square test statistics

d

if a single number, G is estimated on a d x d x d grid; if a three-component vector (d0,d1,dt), G is estimated on a d0 x d1 x dt grid

maxit

maximum number of EM iterations

tol

error tolerance

verbose

TRUE to print the error attained by each EM iteration
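
The following is a minimal sketch of how the inputs described above might be prepared. It is an illustration under stated assumptions, not the package's own code: `allele_freq` and `expand_d` are hypothetical helpers, genotypes are assumed to be coded 0/1/2 as copies of the same allele in both groups, and the scalar-to-grid expansion of `d` simply mirrors the description of that argument.

```r
## Hypothetical helper (not part of the package): allele frequency per SNP
## from a genotype matrix coded 0/1/2 (rows = subjects, columns = SNPs).
## Both groups must count copies of the SAME allele.
allele_freq <- function(X) colMeans(X) / 2

set.seed(1)
p <- 5
maf <- runif(p, 0.1, 0.5)
X0 <- t(replicate(100, rbinom(p, 2, maf)))  # 100 controls x p SNPs
X1 <- t(replicate(50, rbinom(p, 2, maf)))   # 50 cases x p SNPs
pi0_hat <- allele_freq(X0)  # would be passed as pi0
pi1_hat <- allele_freq(X1)  # would be passed as pi1

## Hypothetical helper: how a scalar d is assumed to map to grid sizes,
## following the description of the d argument.
expand_d <- function(d) if (length(d) == 1) rep(d, 3) else d
expand_d(25)             # d x d x d grid
expand_d(c(20, 25, 30))  # d0 x d1 x dt grid
```

If one group's frequencies were reported for the opposite allele, taking `1 - pi` for that group before calling the function would put both on the same allele.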

Value

Pi0

grid points for estimating the distribution of the control minor allele frequencies

Pi1

grid points for estimating the distribution of the case minor allele frequencies

Lam

grid points for estimating the distribution of the non-centrality parameter

conditional density matrix for controls

conditional density matrix for cases

conditional density matrix for test statistics

g

estimated mixing probability mass function

proportion of cases
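
As a rough illustration of how the estimated mass function can be summarized, the sketch below marginalizes a three-dimensional array over its axes. The array here is a synthetic stand-in, not output from the function; the axis order (control frequency, case frequency, non-centrality parameter) is an assumption taken from the way the example below indexes `nebula$g`.

```r
## Synthetic stand-in for the fitted mass function on a d0 x d1 x dt grid;
## in practice this would be the g component of the returned list.
set.seed(1)
dims <- c(20, 25, 30)
g <- array(runif(prod(dims)), dim = dims)
g <- g / sum(g)  # normalize so the masses sum to 1

## Joint mass over (control frequency, case frequency), summing out the
## non-centrality axis.
g01 <- apply(g, c(1, 2), sum)

## Marginal mass over the non-centrality parameter grid.
g_lam <- apply(g, 3, sum)

dim(g01)
length(g_lam)
```

Each marginal is itself a probability mass function on the corresponding grid, so it can be passed to `contour` or `plot` together with the matching grid points.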

Examples


# NOT RUN {
p <- 1000; ## number of snps
I <- rep(0,p); I[1:10] <- 1; ## which snps are causal
set.seed(1); pi0 <- runif(p,0.1,0.5); ## control minor allele frequencies
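set.seed(1); ors <- runif(sum(I),-1,1); ## log odds ratios
pi1 <- pi0;
## expit() and logit() are the inverse-logit and logit functions;
## base R's plogis() and qlogis() compute the same quantities
pi1[I==1] <- expit(ors+logit(pi0[I==1]));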
set.seed(1); lam <- rep(0,p); lam[I==1] <- rchisq(sum(I==1),1,25); ## ncps
## training data
n0 <- 100; ## number of controls
X0 <- t(replicate(n0,rbinom(p,2,pi0))); ## controls
n1 <- 50; ## number of cases
X1 <- t(replicate(n1,rbinom(p,2,pi1))); ## cases
T <- rchisq(p,1,lam); ## chi-square statistics
nebula <- nebula.chisq.train(colMeans(X0)/2,colMeans(X1)/2,n0,n1,T,d=c(20,25,30));
par(mfrow=c(1,3));
contour(nebula$Pi0,nebula$Pi1,apply(nebula$g,c(1,2),sum));
points(pi0,pi1);
contour(nebula$Pi0,nebula$Lam,apply(nebula$g,c(1,3),sum));
points(pi0,lam);
contour(nebula$Pi1,nebula$Lam,apply(nebula$g,c(2,3),sum));
points(pi1,lam);

# }
