MBASEDVectorizedPropDiffTest: Vectorized wrapper around a test for difference of 2 proportions.

Description

Vectorized wrapper around a test for difference of 2 proportions.

Usage

MBASEDVectorizedPropDiffTest(countsMatSample1, totalsMatSample1, countsMatSample2, totalsMatSample2, probsMatSample1, probsMatSample2, rhosMatSample1, rhosMatSample2, alternative = "two.sided", checkArgs = FALSE)

Arguments

countsMatSample1,countsMatSample2

matrices of observed major allele counts in sample1 and sample2, respectively. Each row represents a specific genomic locus, while each column represents a set of observed major allele counts across loci (in practice, multiple columns represent different outcomes of count simulations).

totalsMatSample1,totalsMatSample2

matrices of total read counts across both alleles in sample1 and sample2, respectively. The interpretation of rows and columns is the same as for countsMatSample1.

probsMatSample1,probsMatSample2

matrices of underlying probabilites of observing the major allele in sample1 and sample2, respectively. The interpretation of rows and columns is the same as for countsMatSample1.

rhosMatSample1,rhosMatSample2

matrices of dispersion parameters of beta distributions for each locus in sample1 and sample2, respectively. The interpretation of rows and columns is the same as for countsMatSample1.

alternative

one of 'two.sided', 'greater', 'less'. DEFAULT: 'two.sided'

checkArgs

single boolean specifying whether arguments should be checked for adherence to specifications. DEFAULT: FALSE

Value

hetPval: a 1-row marix of heterogeneity P-values
hetQ: a 1-row matrix of heterogeneity statistics
TEFinal: a 1-row matrix of estimated proportion differences
seTEFinal: a 1-row matrix of estimated SEs of prop differences estimates
propDifferenceFinal: a 1-row matrix of estimated proportion differences
pValue: a 1-row matrix of corresponding p-values.
propDifferenceLoci: a matrix of same dimension as original input matrices giving estimated proportion differences on transformed scale at each individual locus.

Details

MBASEDVectorizedPropDiffTest implements meta-analysis-like apporoach using proportion differences at each locus as variables to be aggregated. Input matrices countsMatSample1, totalsMatSample1, countsMatSample2, totalsMatSample2, probsMatSample1, probsMatSample2, rhosMatSample1, and rhosMatSample2 have 1 column for each set of loci ('independent studies') to be combined, with each row corresponding to an individual locus. MBASEDVectorizedPropDiffTest uses meta analysis approach by transforming counts at each locus into proportions and combininig the proportion differences (between sample1 and sample2) using the inverse-variance weighted schema. The function reports proportion difference estimates, corresponding standard errors, z-values (based on expected value of 0 under the null hypothesis of overall difference of 0) , and corresponding p-values based on normal distribution assumption of z-values, where alternative hypothesis of 'two.sided', 'greater', and 'less' can be specified, with the latter two specified w.r.t. 0. Adjustment for pre-existing allelic bias is performed by taking observed proportion in each sample, transforming it with FT transformation, adjusting for allelic bias as in 1-sample case and back-transforming to get a shifted proportion. The shifted proportion is then used to estimate its variance. The function is used to cacluate p-values in ASE settings, where countsMatSample1 and countsMatSample2 represent major allele counts in sample1 and sample2, respectively, and totalsMatSample1 and totalsMatSample2 represent total allele counts. Matrices probsMatSample1 and probsMatSample2 capture the pre-existing allelic bias by supplying the underlying probabilities of observing alleles currently specified as major in absence of any allele-specific expression, and rhosMatSample1 and rhosMatSample2 provide values of dispersion parameter for beta-binomial counts (0, in case of binomial model) for individual loci within each sample.

Examples

Run this code

SNVCoverageTumor=sample(10:100,10) ## 2 genes with 5 loci each
SNVCoverageNormal=sample(10:100,10) ## 2 genes with 5 loci each
SNVAllele1CountsTumor=rbinom(length(SNVCoverageTumor), SNVCoverageTumor, 0.5)
SNVAllele1CountsNormal=rbinom(length(SNVCoverageNormal), SNVCoverageNormal, 0.5)
MBASED:::MBASEDVectorizedPropDiffTest(countsMatSample1=matrix(SNVAllele1CountsTumor, ncol=2), totalsMatSample1=matrix(SNVCoverageTumor, ncol=2), countsMatSample2=matrix(SNVAllele1CountsNormal, ncol=2), totalsMatSample2=matrix(SNVCoverageNormal, ncol=2), probsMatSample1=matrix(rep(0.5, length(SNVCoverageTumor)), ncol=2), probsMatSample2=matrix(rep(0.5, length(SNVCoverageNormal)), ncol=2), rhosMatSample1=matrix(rep(0, length(SNVCoverageTumor)), ncol=2), rhosMatSample2=matrix(rep(0, length(SNVCoverageNormal)), ncol=2), alternative='two.sided')

Run the code above in your browser using DataLab