Matrix_eQTL_engine
tests for association between every row of the snps
and every row of the gene
using either linear or ANOVA model, as defined by useModel
.
The testing procedure accounts for extra covariates in cvrt
.
To account for heteroskedastic and/or correlated errors,
set the parameter errorCovariance
to the error covariance matrix.
Associations significant at pvOutputThreshold
are saved to output_file_name
,
with corresponding test statistics, p-values, and estimates of false discovery rate.Matrix_eQTL_engine(snps,
gene,
cvrt = SlicedData$new(),
output_file_name,
pvOutputThreshold = 1e-05,
useModel = modelLINEAR,
errorCovariance = numeric(),
verbose = FALSE)
Matrix_eQTL_engine_cis(snps,
gene,
cvrt = SlicedData$new(),
output_file_name,
pvOutputThreshold_cis = 1e-3,
pvOutputThreshold_tra = 1e-6,
useModel = modelLINEAR,
errorCovariance = numeric(),
verbose=FALSE,
snpspos,
genepos,
cisDist = 1e6 )
SlicedData
object with genotype information.
Can be real-valued for linear model and
should take up 2 or 3 distinct values for ANOVA (see useModel
parameter).SlicedData
object with gene expression information.
Should have columns matching those of snps
.SlicedData
object with additional covariates.
Can be an empty SlicedData
object in case of no covariates.output_file_name
.pvOutputThreshold
, but for cis-eQTLs.pvOutputThreshold
, but for trans-eQTLs.modelLINEAR
to use the linear model
(additive effect of the SNP on the gene), or
to modelANOVA
to treat genotype as a categorical variables.TRUE
to display detailed report on the progress.gene
, snps
, and cvrt
must match.
If they do not match in the input files, use ColumnSubsample
method to subset and/or reorder them.SlicedData
.## Settings
# Linear model to use, modelANOVA or modelLINEAR
useModel = modelLINEAR; # modelANOVA or modelLINEAR
# Genotype file name
SNP_file_name = 'Sample_Data/SNP.txt';
# Gene expression file name
expression_file_name = 'Sample_Data/GE.txt';
# Covariates file name
# Set to character() for no covariates
covariates_file_name = 'Sample_Data/Covariates.txt';
# Output file name
output_file_name = 'Sample_Data/eQTL_results_R.txt';
# Only associations significant at this level will be output
pvOutputThreshold = 1e-2;
# Error covariance matrix
# Set to character() for identity.
errorCovariance = character();
# errorCovariance = read.table("Sample_Data/errorCovariance.txt");
## Load genotype data
snps = SlicedData$new();
snps$fileDelimiter = "t"; # the TAB character
snps$fileOmitCharacters = "NA"; # denote missing values;
snps$fileSkipRows = 1; # one row of column labels
snps$fileSkipColumns = 1; # one column of row labels
snps$fileSliceSize = 10000; # read file in pieces of 10,000 rows
snps$LoadFile(SNP_file_name);
## Load gene expression data
gene = SlicedData$new();
gene$fileDelimiter = "t"; # the TAB character
gene$fileOmitCharacters = "NA"; # denote missing values;
gene$fileSkipRows = 1; # one row of column labels
gene$fileSkipColumns = 1; # one column of row labels
gene$fileSliceSize = 10000; # read file in pieces of 10,000 rows
gene$LoadFile(expression_file_name);
## Load covariates
cvrt = SlicedData$new();
cvrt$fileDelimiter = "t"; # the TAB character
cvrt$fileOmitCharacters = "NA"; # denote missing values;
cvrt$fileSkipRows = 1; # one row of column labels
cvrt$fileSkipColumns = 1; # one column of row labels
cvrt$fileSliceSize = snps$nCols()+1; # read file in one piece
if(length(covariates_file_name)>0) {
cvrt$LoadFile(covariates_file_name);
}
## Run the analysis
Matrix_eQTL_engine(snps,
gene,
cvrt,
output_file_name,
pvOutputThreshold,
useModel,
errorCovariance,
verbose=TRUE);
Run the code above in your browser using DataLab