Learn R Programming

MatrixEQTL (version 1.5.0)

Matrix_eQTL_main: Perform eQTL analysis.

Description

Matrix_eQTL_main function tests associations between every row of the snps and every row of the gene using either linear or ANOVA model, as defined by useModel. The testing procedure accounts for extra covariates in cvrt. To account for heteroskedastic and/or correlated errors, set the parameter errorCovariance to the error variance-covariance matrix. Associations significant at pvOutputThreshold are saved to output_file_name, with corresponding test statistics, p-values, and estimated false discovery rate. Matrix eQTL can perform separate analysis for local (cis) and distant (trans) eQTLs. A gene-SNP pair is considered local if the distance between them is less than cisDist. The genomic location of gene and SNPs is defined by variables snpspos and {genepos}. To perform eQTL analysis without regard to gene/SNP location, set pvOutputThreshold.cis = 0 (or do not set it). To perform eQTL analysis for local gene-SNP pairs only, set pvOutputThreshold = 0 and pvOutputThreshold.cis > 0. To perform eQTL analysis with separate treatment of local and distant eQTLs, set both thresholds to positive values. In this case the false discovery rate is calculated separately for these two groups of eQTLs. Function Matrix_eQTL_engine is a wrapper for Matrix_eQTL_main provided for compatibility with the previous versions of this package. There are three linear regression models currently supported by Matrix eQTL (as defined by the useModel parameter). Set useModel to modelLINEAR to assume the effect of the genotype to be additive linear. Use modelANOVA to treat genotype as a categorical variables and use ANOVA model. The new special code modelLINEAR_CROSS adds an new term to the model, equal to the product of genotype and the last covariate; the significance of this term is then tested using t-statistic. Extra code examples are provided on the pages for modelLINEAR, modelANOVA, and modelLINEAR_CROSS.

Usage

Matrix_eQTL_main(	
                   snps, 
                   gene, 
                   cvrt = SlicedData$new(), 
                   output_file_name = "", 
                   pvOutputThreshold = 1e-5,
                   useModel = modelLINEAR, 
                   errorCovariance = numeric(), 
                   verbose = TRUE, 
                   output_file_name.cis = "", 
                   pvOutputThreshold.cis = 0,
                   snpspos = NULL, 
                   genepos = NULL,
                   cisDist = 1e6,
                   pvalue.hist = FALSE)

Matrix_eQTL_engine(
                   snps, 
                   gene, 
                   cvrt = SlicedData$new(), 
                   output_file_name, 
                   pvOutputThreshold = 1e-5, 
                   useModel = modelLINEAR, 
                   errorCovariance = numeric(), 
                   verbose = TRUE,
                   pvalue.hist = FALSE)

Arguments

snps
SlicedData object with genotype information. Can be real-valued for linear model and should take up 2 or 3 distinct values for ANOVA (see useModel parameter).
gene
SlicedData object with gene expression information. Should have columns matching those of snps.
cvrt
SlicedData object with additional covariates. Can be an empty SlicedData object in case of no covariates.
output_file_name
character string with the name of the output file. Significant (all or distant) associations are saved to this file. Is the file with this name exists, it will be overwritten.
output_file_name.cis
character string with the name of the output file. Significant local associations are saved to this file. Is the file with this name exists, it will be overwritten.
pvOutputThreshold
numeric. Only gene-SNP pairs significant at this level will be saved in output_file_name.
pvOutputThreshold.cis
Same as pvOutputThreshold, but for cis-eQTLs. If not thresholds are positive, pvOutputThreshold determines cut-off for distant (trans) eQTLs.
useModel
numeric. Set it to modelLINEAR to model the effect of the genotype to be additive linear, or modelANOVA to treat genotype as a categorical variables and use ANOVA model. The special code modelLINEAR_CROSS adds an i
errorCovariance
numeric. The error covariance matrix, if not multiple of identity matrix. Use this parameter to account for heteroscedastic and/or correlated errors.
verbose
logical. Set to TRUE to display detailed report on the progress.
snpspos
data.frame with information about SNP locations, with 3 columns - SNP name, chromosome, and position.
genepos
data.frame with information about transcript locations, with 4 columns - the name, chromosome, and positions of the left and right ends.
cisDist
numeric. SNP-gene pairs within this distance are considered local. The distance is measured from the nearest end of the gene.
pvalue.hist
If pvalue.hist is not FALSE, the function returns information to plot the histogram(s) of (local/distant/all) p-values. Set pvalue.hist to a positive integer to build the histogram with pvalue.hist bin

Value

  • The detected eQTLs are saved in output_file_name and/or output_file_name.cis. The method also returns a list with a summary of the performed analysis.

Details

Note that the the columns of gene, snps, and cvrt must match. If they do not match in the input files, use ColumnSubsample method to subset and/or reorder them.

References

For more information visit: http://www.bios.unc.edu/research/genomic_software/Matrix_eQTL/

See Also

For more information on the class of the first three arguments see SlicedData.

Examples

Run this code
# Number of columns (samples)
n = 100;

# Genetate single genotype variable
snps.mat = rnorm(n);

# Generate single expression variable
gene.mat = 0.5*snps.mat + rnorm(n);

# Create 3 SlicedData for the analysis
snps1 = SlicedData$new( matrix( snps.mat, nrow = 1 ) );
gene1 = SlicedData$new( matrix( gene.mat, nrow = 1 ) );
cvrt1 = SlicedData$new();

# Call the main analysis function
me = Matrix_eQTL_main(
	snps = snps1, 
	gene = gene1, 
	cvrt = cvrt1, 
	'Output_temp.txt', 
	pvOutputThreshold = 1, 
	useModel = modelLINEAR, 
	errorCovariance = numeric(), 
	verbose = TRUE,
	pvalue.hist = TRUE );
# remove the output file
file.remove( 'Output_temp.txt' );

# Pull Matrix eQTL results - t-statistic and p-value
tstat = me$all$eqtls[ 1, 3 ];
pvalue = me$all$eqtls[ 1, 4 ];
rez = c( tstat = tstat, pvalue = pvalue)
# And compare to those from linear regression
{
	cat('Matrix eQTL: 
'); 
	print(rez);
	cat('R summary(lm()) output: 
')
	lmout = summary(lm(gene.mat ~ snps.mat))$coefficients[2, 3:4];
	print(lmout)
}

Run the code above in your browser using DataLab