Learn R Programming

DRIMSeq (version 1.0.2)

data_dmSQTLdata: Sample data for sQTL analysis

Description

A subset of data from GEUVADIS project where 462 RNA-Seq samples from lymphoblastoid cell lines were obtained. The genome sequencing data of the same individuals is provided by the 1000 Genomes Project. The samples in this project come from five populations: CEPH (CEU), Finns (FIN), British (GBR), Toscani (TSI) and Yoruba (YRI). Here, we use a subset of CEPH data from chromosome 19 available in GeuvadisTranscriptExpr package.

Usage

data_dmSQTLdata

Arguments

Value

data_dmSQTLdata

Format

data_dmSQTLdata is a dmSQTLdata object. See Examples.

Source

Lappalainen T, Sammeth M, Friedlander MR, et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature. 2013;501(7468):506-11 GeuvadisTranscriptExpr package

Examples

Run this code

#############################
### Create dmSQTLdata object
#############################

# Use subsets of data defined in GeuvadisTranscriptExpr package
library(GeuvadisTranscriptExpr)

counts <- GeuvadisTranscriptExpr::counts
genotypes <- GeuvadisTranscriptExpr::genotypes
gene_ranges <- GeuvadisTranscriptExpr::gene_ranges
snp_ranges <- GeuvadisTranscriptExpr::snp_ranges

# Make sure that samples in counts and genotypes are in the same order
sample_id <- colnames(counts[, -(1:2)])

d <- dmSQTLdataFromRanges(counts = counts[, sample_id], 
   gene_id = counts$Gene_Symbol, feature_id = counts$TargetID, 
   gene_ranges = gene_ranges, genotypes = genotypes[, sample_id], 
   snp_id = genotypes$snpId, snp_ranges = snp_ranges, sample_id = sample_id, 
   window = 5e3, BPPARAM = BiocParallel::SerialParam())

plotData(d)

data_dmSQTLdata <- d

#############################
### sQTL analysis
#############################
# If possible, use BPPARAM = BiocParallel::MulticoreParam() with more workers

d <- data_dmSQTLdata

head(names(d))
length(d)
d[1:10, ]
d[1:10, 1:10]

### Filtering
d <- dmFilter(d, min_samps_gene_expr = 70, min_samps_feature_expr = 5, 
   min_samps_feature_prop = 0, minor_allele_freq = 5, 
   BPPARAM = BiocParallel::SerialParam())
plotData(d)


### Calculate dispersion
d <- dmDispersion(d, BPPARAM = BiocParallel::SerialParam())
plotDispersion(d)

### Fit full model proportions
d <- dmFit(d, BPPARAM = BiocParallel::SerialParam())

### Fit null model proportions and test for sQTLs
d <- dmTest(d, BPPARAM = BiocParallel::SerialParam())
plotTest(d)

head(results(d))

### Plot feature proportions for top sQTL
res <- results(d)
res <- res[order(res$pvalue, decreasing = FALSE), ]

gene_id <- res$gene_id[1]
snp_id <- res$snp_id[1]

plotFit(d, gene_id, snp_id)
plotFit(d, gene_id, snp_id, plot_type = "boxplot2", order = FALSE)
plotFit(d, gene_id, snp_id, plot_type = "ribbonplot")

Run the code above in your browser using DataLab