carpools.hit.scatter: Plot: Plotting Scatters for hit candidate genes for all provided samples

Description

As described before, scatter plots can be generated for all datasets. `carpools.hit.scatter` serves as a wrapper for `carpools.read.count.vs` and allows faster plotting for individual candidate genes or all overlapping candidate genes. It generates a pairs plot showing all provided samples and highlights the candidate gene.
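The idea of a pairs plot with one highlighted candidate gene can be sketched in base R. This is a minimal illustration on simulated read counts, not the caRpools implementation: the data, the gene names other than CASP8, and the variable names `counts`, `gene` and `point.col` are all assumptions made for this example.

```r
# Hypothetical sgRNA read counts for three samples (simulated, not caRpools data).
set.seed(1)
sgrna <- c(paste0("CASP8_", 1:4), paste0("GENE", rep(1:24, each = 4), "_", 1:4))
counts <- data.frame(
  TREAT1   = rpois(length(sgrna), lambda = 200),
  TREAT2   = rpois(length(sgrna), lambda = 200),
  CONTROL1 = rpois(length(sgrna), lambda = 200),
  row.names = sgrna
)

# Derive the gene identifier from each sgRNA identifier.
gene <- sub("^(.+?)_.+", "\\1", rownames(counts), perl = TRUE)

# Translucent black for all sgRNAs, orange for the candidate gene.
point.col <- ifelse(gene == "CASP8", "orange", rgb(0, 0, 0, alpha = 0.65))

# One scatter panel per pair of samples.
pairs(counts, pch = 16, col = point.col, main = "Read Count")
```

Each panel compares two samples, so the four CASP8 sgRNAs can be followed across every sample combination at once.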

Usage

carpools.hit.scatter(wilcox=NULL, deseq=NULL, mageck=NULL, dataset, dataset.names = NULL,
namecolumn=1, fullmatchcolumn=2, title="Read Count", xlab="Readcount Dataset1",
ylab="Readcount Dataset2", labelgenes=NULL, labelcolor="orange",
extractpattern=expression("^(.+?)_.+"),
plotline=TRUE, normalize=TRUE, norm.function=median, offsetplot=1.2,
center=FALSE, aggregated=FALSE, type="enriched",
cutoff.deseq = 0.001, cutoff.wilcox = 0.05,
cutoff.mageck = 0.05, cutoff.override=FALSE, cutoff.hits=NULL,
plot.genes="overlapping", pch=16, col = rgb(0, 0, 0, alpha = 0.65))

Arguments

wilcox

Data output from `stat.wilcox`. *Default* NULL *Values* Data output from `stat.wilcox`.

deseq

Data output from `stat.DESeq`. *Default* NULL *Values* Data output from `stat.DESeq`.

mageck

Data output from `stat.mageck`. *Default* NULL *Values* Data output from `stat.mageck`.

cutoff.deseq

P-Value threshold used to determine significance. *Default* 0.001 *Values* numeric

cutoff.wilcox

P-Value threshold used to determine significance. *Default* 0.05 *Values* numeric

cutoff.mageck

P-Value threshold used to determine significance. *Default* 0.05 *Values* numeric
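How three separate p-value thresholds lead to a set of overlapping candidate genes can be sketched as follows. This is only an illustration under stated assumptions: the data frames, their column names `gene` and `p`, the helper `hits`, and the use of a strict `<` comparison are hypothetical and do not reflect the actual structure of the `stat.wilcox`, `stat.DESeq` or `stat.mageck` output.

```r
# Hypothetical per-gene p-values from three methods (made-up values).
genes <- c("CASP8", "FADD", "TNFRSF10B", "AAK1", "BID")
res.wilcox <- data.frame(gene = genes, p = c(0.001, 0.020, 0.300, 0.040, 0.600))
res.deseq  <- data.frame(gene = genes, p = c(0.0001, 0.0005, 0.0008, 0.2000, 0.0100))
res.mageck <- data.frame(gene = genes, p = c(0.010, 0.030, 0.040, 0.500, 0.020))

# Hypothetical helper: genes whose p-value is below the cutoff.
hits <- function(res, cutoff) as.character(res$gene[res$p < cutoff])

# A gene counts as overlapping only if it passes all three thresholds.
overlapping <- Reduce(intersect, list(
  hits(res.wilcox, cutoff = 0.05),
  hits(res.deseq,  cutoff = 0.001),
  hits(res.mageck, cutoff = 0.05)
))
print(overlapping)
```

With these made-up values only CASP8 and FADD pass all three cutoffs; tightening any single threshold can only shrink the overlap.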

dataset

A list of data frames of read-count data as created by load.file(). *Default* none *Values* A list of data frames

namecolumn

In which column are the sgRNA identifiers? *Default* 1 *Values* column number (numeric)

fullmatchcolumn

In which column are the read counts? *Default* 2 *Values* column number (numeric)

dataset.names

A list of names, given in the same order as the data sets in *dataset*. *Default* NULL *Values* NULL or list of data names (list)

norm.function

The mathematical function to normalize data. By default, the median is used. *Default* median *Values* Any mathematical function of R (function)
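A rough sketch of what normalizing by a user-supplied function can look like, assuming that normalization means dividing each sample's read counts by the value of `norm.function` for that sample. The exact scaling used inside caRpools may differ; `normalize.counts` and `treat1` are hypothetical names used only here.

```r
# Hypothetical helper: scale read counts by a summary statistic of the sample.
normalize.counts <- function(counts, norm.function = median) {
  counts / norm.function(counts)
}

# Made-up read counts for one sample.
treat1 <- c(120, 80, 200, 40, 100)

by.median <- normalize.counts(treat1)                       # default: median
by.mean   <- normalize.counts(treat1, norm.function = mean) # any function returning one number
print(by.median)
```

After median normalization the median of every sample equals 1, which makes samples with different sequencing depth comparable on the same axes.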

extractpattern

PERL regular expression that is used to retrieve the gene identifier from the overall sgRNA identifier, e.g. in **AAK1_107_0** it will extract **AAK1**, since this is the gene identifier belonging to this sgRNA identifier. **Please see: Read-Count Data Files** *Default* expression("^(.+?)_.+"), will work for most available libraries. *Values* PERL regular expression with parentheses indicating the gene identifier (expression)
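The effect of the pattern can be checked with base R's `sub()`. This sketch passes the pattern from the Usage section as a plain character string and assumes the gene identifier is taken from the first capture group; `sgrna.ids` and `gene.ids` are names chosen for this example.

```r
# sgRNA identifiers in the <gene>_<...> layout described above.
sgrna.ids <- c("AAK1_107_0", "CASP8_3_1")

# The lazy group (.+?) stops at the first underscore, so only the
# gene identifier is kept when the match is replaced by group 1.
gene.ids <- sub("^(.+?)_.+", "\\1", sgrna.ids, perl = TRUE)
print(gene.ids)
```

Gene identifiers that themselves contain an underscore would be cut at the first underscore, so such libraries need an adapted pattern.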

cutoff.override

Shall the p-value threshold be ignored? If this is TRUE, the top percentage of genes given by `cutoff.hits` is used instead. *Default* FALSE *Values* TRUE, FALSE

cutoff.hits

The percentage of top genes being used if `cutoff.override=TRUE`. *Default* NULL *Values* numeric
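Selecting a top percentage of genes instead of applying a p-value threshold could look like the sketch below. It assumes that the value is given on a 0 to 100 scale, that genes are ranked by ascending p-value, and that the number of genes is rounded up; none of these details is confirmed by this page, and `top.percent` and `res` are hypothetical.

```r
# Hypothetical helper: the best `percent` percent of genes by p-value.
top.percent <- function(res, percent) {
  n <- ceiling(nrow(res) * percent / 100)
  as.character(res$gene[order(res$p)][seq_len(n)])
}

# Made-up results for 20 genes with increasing p-values.
res <- data.frame(gene = paste0("GENE", 1:20), p = (1:20) / 100)

top.genes <- top.percent(res, percent = 10)
print(top.genes)
```

Unlike a fixed threshold, this always returns the same number of genes for a given library size, regardless of how significant they are.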

plot.genes

Defines what kind of data is used. By default, overlapping genes are highlighted in red color. *Default* "overlapping" *Values* "overlapping"

type

Defines whether all genes are plotted or only those being enriched or depleted. *Default* "enriched" *Values* "all", "enriched", "depleted"

labelgenes

For which gene shall the sgRNA effects be plotted? This expects a gene identifier or a vector of gene identifiers. *Default* NULL *Values* A gene identifier or vector of gene identifiers (character)

xlab

Label of X-Axis, only if `pairs=FALSE`. *Default* "Readcount Dataset1" *Values* "Label of X-Axis" (character)

ylab

Label of Y-Axis, only if `pairs=FALSE`. *Default* "Readcount Dataset2" *Values* "Label of Y-Axis" (character)

pch

The type of point used in the plot. See `?par()`. *Default* 16 *Values* Any number describing the point, e.g. 16 (numeric)

col

The color of the plotted data. Can be any R color or RGB object. See ?rgb() for further information. *Default* rgb(0, 0, 0, alpha = 0.65) *Values* Any R color name or RGB color object (character OR color object)

plotline

You can draw additional lines indicating a fold change of 0, 2, 4. *Default* TRUE *Values* TRUE, FALSE (boolean)

normalize

Whether you would like to normalize read-counts first. Recommended if not done already. *Default* TRUE *Values* TRUE, FALSE (boolean)

offsetplot

`offsetplot` is used to stretch the x- and y-axis for nicer graphs. This will extend the plotting area by the factor offsetplot. *Default* 1.2 (plotting area is stretched to 1.2 times) *Values* any number (numeric)

center

If you like you can center your data within the plot. *Default* FALSE *Values* TRUE, FALSE (boolean)

aggregated

If you want to highlight genes, set this to TRUE if you provide already aggregated gene read counts instead of sgRNA read counts. *Default* FALSE *Values* TRUE, FALSE (boolean)

labelcolor

Color to highlight genes stated in `labelgenes`. *Default* "orange" *Values* Any R color or RGB color object.

title

Title of the plot. *Default* "Read Count" *Values* Title of the plot (character)

Value

Returns generic plots. See ?plot and ?pairs.

Details

none

Examples

data(caRpools)

data.wilcox = stat.wilcox(untreated.list = list(CONTROL1, CONTROL2),
  treated.list = list(TREAT1,TREAT2), namecolumn=1, fullmatchcolumn=2,
  normalize=TRUE, norm.fun=median, sorting=FALSE, controls="random",
  control.picks=NULL)
  
data.deseq = stat.DESeq(untreated.list = list(CONTROL1, CONTROL2),
  treated.list = list(TREAT1,TREAT2), namecolumn=1,
  fullmatchcolumn=2, extractpattern=expression("^(.+?)(_.+)"),
  sorting=FALSE, filename.deseq = "ANALYSIS-DESeq2-sgRNA.tab",
  fitType="parametric")
  
data.mageck = stat.mageck(untreated.list = list(CONTROL1, CONTROL2),
  treated.list = list(TREAT1,TREAT2), namecolumn=1, fullmatchcolumn=2,
  norm.fun="median", extractpattern=expression("^(.+?)(_.+)"),
  mageckfolder=NULL, sort.criteria="neg", adjust.method="fdr",
  filename = "TEST" , fdr.pval = 0.05)

# Single gene
plothitsscatter.enriched = carpools.hit.scatter(wilcox=data.wilcox,
  deseq=data.deseq, mageck=data.mageck, dataset=list(TREAT1, TREAT2, CONTROL1, CONTROL2),
  dataset.names = c(d.TREAT1, d.TREAT2, d.CONTROL1, d.CONTROL2),
  namecolumn=1, fullmatchcolumn=2, title="Title", labelgenes="CASP8",
  labelcolor="orange", extractpattern=expression("^(.+?)(_.+)"),
  normalize=TRUE, norm.function=median, offsetplot=1.2, center=FALSE,
  aggregated=FALSE, type="enriched", cutoff.deseq = 0.001,
  cutoff.wilcox = 0.05, cutoff.mageck = 0.05, cutoff.override=FALSE,
  cutoff.hits=NULL, pch=16)

# Overlapping candidate genes
plothitsscatter.enriched = carpools.hit.scatter(wilcox=data.wilcox,
  deseq=data.deseq, mageck=data.mageck, dataset=list(TREAT1, TREAT2, CONTROL1, CONTROL2),
  dataset.names = c(d.TREAT1, d.TREAT2, d.CONTROL1, d.CONTROL2), namecolumn=1,
  fullmatchcolumn=2, title="Title", labelgenes=NULL, labelcolor="orange",
  extractpattern=expression("^(.+?)(_.+)"), normalize=TRUE, norm.function=median,
  offsetplot=1.2, center=FALSE, aggregated=FALSE, type="enriched",
  cutoff.deseq = 0.001, cutoff.wilcox = 0.05, cutoff.mageck = 0.05,
  cutoff.override=FALSE, cutoff.hits=NULL, pch=16)
