eQTL: Performs an eQTL Analysis.

Description

This function performs an eQTL analysis.

Usage

eQTL(gex, geno, xAnnot, xSamples=NULL, genoSamples, windowSize=0.5, method="LM")

Arguments

gex

Matrix/Vector with expression values.

geno

Genotype data.

xAnnot

Location annotations for the expression values.

xSamples

Sample names for the expression values, see details.

genoSamples

Sample names for the genotype values, see details.

windowSize

Size of the window around the center gene, in Mb.

method

Method of choice for the eQTL, see details.

Value

A list of class eqtl containing the values

gex

The gex object from the function call.

geno

The geno object from the function call.

xAnnot

The xAnnot object from the function call.

genoSamples

The genoSamples object from the function call.

windowSize

The windowSize object from the function call.

and a nested list eqtl, where each list item is a tested gene location and contains the items

ProbeLoc

Used position of that gene. (Only different from 1 if multiple locations are considered.)

TestedSNP

Details about the considered SNPs.

p.values

P values of the test.

GeneInfo

Details about the center gene.

Details

This function performs an eQTL analysis with different types of tests. The type of test is specified with the method option. Possible options are "LM" and "directional". For each SNP within a window of size windowSize (in Mb) around a gene, "LM" fits a linear model of the gene expression on the genotype information. The null hypothesis tested is that the slope equals zero.
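The "LM" idea can be illustrated with base R on simulated data. This is only a sketch of the general technique, not the package's internal code: the toy objects (gene, snps, genoMat, expr) are hypothetical, and measuring the window from the gene's start and end positions is an assumption.

```r
# Hypothetical toy data: one gene on chromosome 1 and five SNPs.
set.seed(1)
n <- 60
gene <- data.frame(Chr = 1, Start = 2000000, End = 2010000)
snps <- data.frame(
  name = paste0("snp", 1:5),
  Chr  = 1,
  pos  = c(1200000, 1600000, 2005000, 2400000, 3100000),
  stringsAsFactors = FALSE
)

# Genotypes coded as 0, 1, 2; one row per individual, one column per SNP.
genoMat <- matrix(sample(0:2, n * nrow(snps), replace = TRUE),
                  nrow = n, dimnames = list(NULL, snps$name))

# Simulated expression that depends on snp3 only.
expr <- 0.8 * genoMat[, "snp3"] + rnorm(n)

# Keep the SNPs within windowSize Mb of the gene on both sides
# (assumption: the window is measured from the gene's Start and End).
windowSize <- 0.5
w <- windowSize * 1e6
inWindow <- snps$Chr == gene$Chr &
  snps$pos >= gene$Start - w &
  snps$pos <= gene$End + w
tested <- snps$name[inWindow]

# One linear model per SNP; the p-value of the slope coefficient
# tests the null hypothesis that the slope is zero.
slopeP <- sapply(tested, function(s) {
  fit <- lm(expr ~ genoMat[, s])
  summary(fit)$coefficients[2, "Pr(>|t|)"]
})
slopeP
```

Only the SNPs inside the window are tested, so the number of fitted models grows with windowSize.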

The "directional" option applies a directional test based on probabilistic indices for triples, as described in Fischer et al. (2012). The test is applied for the two probabilistic indices $P_{0,1,2}$ and $P_{2,1,0}$. The two corresponding p-values, $p_{012}=p_1$ and $p_{210}=p_2$, are then combined into the overall p-value $\min(2 \min(p_1, p_2), 1)$. The genotype groups are referred to here as 0, 1, 2.
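The combination rule for the two directional p-values is simple arithmetic and can be written out as follows. The function name combineDirectionalP is hypothetical and not part of the package; the directional tests themselves are not reproduced here.

```r
# Combine the p-values of the two directional tests into one
# overall p-value: min(2 * min(p1, p2), 1).
# combineDirectionalP is a hypothetical helper for illustration.
combineDirectionalP <- function(p012, p210) {
  min(2 * min(p012, p210), 1)
}

combineDirectionalP(0.01, 0.70)  # 0.02
combineDirectionalP(0.60, 0.80)  # 1 (capped, since 2 * 0.60 > 1)
```

The factor 2 accounts for testing both directions, and the cap at 1 keeps the result a valid p-value.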

The gene expressions are given in gex. If several genes are to be tested, gex is a matrix in which each column refers to a gene and each row to an individual. The column names of this matrix should match the names used in the xAnnot object. Sample names can be given either as row names of the matrix or as a separate vector in xSamples. If only one gene is to be tested, gex can be a vector.

The genotype information is provided in the geno object. One option is to specify the file name of a ped/map file pair. In that case the function imports the genotype information using the snpStats package. If the genotype information has already been imported using snpStats::read.pedfile(), the resulting SnpMatrix can also be given as the geno parameter.

The xAnnot object carries the annotations for the gene expressions. In case of multiple locations per gene it is of type list, and each list item stores the information for one gene. Each list item is a data frame with the three columns Chr, Start, End, in which each row refers to one matching chromosomal position of the underlying gene. Especially when probes of ssRNA are considered, the chromosomal positions of a probe are not necessarily unique. The names of the list xAnnot are the names of the genes, and they have to match the column names of gex. However, the order does not have to be the same, and xAnnot can include more genes than given in gex. The function then finds and uses the intersection of the column names of gex and the list entries of xAnnot. Alternatively, xAnnot can be a data frame if no multiple locations are considered. In that case xAnnot has to be a data frame with the four columns SNP, Chr, Start, End.
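The two annotation layouts can be sketched with toy objects. All gene names, sample names and coordinates below are hypothetical, and the matching step only illustrates the described behavior with base R; it is not the package's own code.

```r
# Hypothetical expression matrix: 4 individuals (rows) x 2 genes (columns).
set.seed(1)
gex <- matrix(rnorm(8), nrow = 4,
              dimnames = list(paste0("ind", 1:4), c("geneA", "geneB")))

# List form: one data frame per gene, one row per chromosomal location.
# geneA maps to two locations; geneC has no expression values in gex.
xAnnotList <- list(
  geneB = data.frame(Chr = 3, Start = 500000, End = 520000),
  geneA = data.frame(Chr = c(1, 7),
                     Start = c(100000, 900000),
                     End = c(101500, 901500)),
  geneC = data.frame(Chr = 2, Start = 40000, End = 42000)
)

# Genes that can be analysed are those present in both objects;
# the order of the list entries does not matter.
commonGenes <- intersect(colnames(gex), names(xAnnotList))
xAnnotUsed <- xAnnotList[commonGenes]

# Data frame form for a single location per gene, with the four
# columns named as in the text above.
xAnnotDf <- data.frame(
  SNP = c("geneA", "geneB"),
  Chr = c(1, 3),
  Start = c(100000, 500000),
  End = c(101500, 520000),
  stringsAsFactors = FALSE
)
```

Here commonGenes contains geneA and geneB, while geneC is dropped because it has no column in gex.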

The option genoSamples is used when the sample names in the ped/map file (or SnpMatrix) do not match rownames(gex) of the expression matrix. The vector genoSamples has one entry per sample in geno and gives, for each individual in geno, the corresponding name in the gex object. The function then restricts the analysis to the samples common to both data objects. If there are repeated genotype measurements per individual, only the first appearance in the data is used and all subsequent values are ignored. Currently this cannot be changed. If this behavior is not desired, the user has to remove the corresponding rows from geno.
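The sample matching described above, including the handling of repeated measurements, can be mimicked with base R. The sample names are hypothetical and the code is a sketch of the described behavior, not the function's internals.

```r
# Hypothetical sample names.
gexSamples  <- c("ind1", "ind2", "ind3", "ind4")          # rownames(gex)
genoSamples <- c("ind3", "ind1", "ind3", "ind5", "ind2")  # one per row of geno

# Keep only the first appearance of each individual in geno
# (the second "ind3" row is ignored).
firstRows <- which(!duplicated(genoSamples))

# Restrict to the individuals present in both data sets.
common <- intersect(gexSamples, genoSamples[firstRows])

# Row indices that align geno and gex on the common individuals.
genoRows <- firstRows[match(common, genoSamples[firstRows])]
gexRows  <- match(common, gexSamples)
```

Subsetting geno with genoRows and gex with gexRows yields two objects whose rows refer to the same individuals in the same order.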

References

Fischer, D., Oja, H., Sen, P.K., Schleutker, J., Wahlfors, T. (2012): Generalized Mann-Whitney Type Tests for Microarray Experiments, submitted article.

Fischer, D., Oja, H. (2012): A permutation type test for calculating generalized Mann-Whitney tests, submitted article.

Examples

# Will be filled in as soon as the data is available (the corresponding article is
# currently under review). The data will then be published together with the article.

# Import first the expression data
# geneMatrix <- 
# geneVector <- 

# Then import the genotype information in map/ped format
# genoData <- 

# Load the expression annotations
# multLoc <- 
# singleLoc <- 

# Optionally, the sample names
# genoSamples <- 
# expSamples <- 

# Set the window size in Mb (here 500,000 base pairs to BOTH sides)
# windowSize <- 0.5

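# Perform different eQTL analyses (sample names are passed by name so that
# they match the xSamples and genoSamples arguments):

# eqtl1 <- eQTL(geneMatrix, genoData, multLoc, xSamples=expSamples,
#               genoSamples=genoSamples, windowSize=windowSize)
# eqtl2 <- eQTL(geneMatrix, genoData, singleLoc, xSamples=expSamples,
#               genoSamples=genoSamples, windowSize=windowSize)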