carpools.read.distribution: QC: Plot Readcount Distribution

Description

A distribution for NGS data readcount can be created by `carpools.read.distribution` to visualize how the data set is distributed. This allows to check for data skewness and to estimate the overall assay quality. For further details see `?carpools.read.distribution`.

Usage

carpools.read.distribution(dataset,namecolumn=1, fullmatchcolumn=2, breaks="",
title="Title", xlab="X-Axis", ylab="Y-Axis",statistics=TRUE,
col=rgb(0, 0, 0, alpha = 0.65), extractpattern=expression("^(.+?)_.+"),
plotgene=NULL, type="distribution", logscale=TRUE)

Arguments

dataset

Data frame of read-count data as created by load.file(). *Default* none *Values* A data frame

namecolumn

In which column are the sgRNA identifiers? *Default* 1 *Values* column number (numeric)

fullmatchcolumn

In which column are the read counts? *Default* 2 *Values* column number (numeric)

breaks

Histogramm breaks see `?hist`. By default, will be calculated according to the dataset length. *Default* NULL *Values* (numeric)

title

Main title of plot *Default* "Title" *Values* "The title you want" (character)

xlab

Label of X-Axis *Default* "X-Axis" *Values* "Label of X-Axis" (character)

ylab

Label of Y-Axis *Default* "Y-Axis" *Values* "Label of Y-Axis" (character)

statistics

Whether basic stattistics will be shown in the plot. *Default* TRUE *Values* TRUE, FALSE (boolean)

col

The color of the plotted data. Can be any R color or RGB object. See ?rgb() for further information. *Default* rgb(0, 0, 0, alpha = 0.65) *Values* Any R color name or RGB color object (character OR color object)

extractpattern

PERL regular expression that is used to retrieve the gene identifier from the overall sgRNA identifier. e.g. in **AAK1_107_0** it will extract **AAK1**, since this is the gene identifier beloning to this sgRNA identifier. **Please see: Read-Count Data Files** *Default* expression("^(.+?)(_.+)"), will work for most available libraries. *Values* PERL regular expression with parenthesis indicating the gene identifier (expression)

plotgene

You can only plot the read count distribution of sgRNAs belonging to a certain gene, which is given to the function via plotgene. *Default* NULL *Value* NULL or gene identifier (character)

type

You can plot either the read count distribution either as a normal histogram, or a box-and-whisker plot. *Default* "distribution" *Values* "distribution" to plot a histogram, or "whisker" to plot a whisker plot (character)

logscale

Indicates whether the read-count is plotted in a logarithmic scale. *Default* TRUE *Values* TRUE, FALSE (boolean)

Value

plot.read.distribution return a generic plot, that can be passed on to any device.

Details

none

Examples

Run this code

data(caRpools)

carpools.read.distribution(CONTROL1, fullmatchcolumn=2,breaks=200,
  title=d.CONTROL1, xlab="log2 Readcount", ylab="# sgRNAs",statistics=TRUE) 
  
carpools.read.distribution(CONTROL1, fullmatchcolumn=2,breaks=200,
  title=d.CONTROL1, xlab="log2 Readcount", ylab="# sgRNAs",statistics=TRUE,
  type="whisker")

Run the code above in your browser using DataLab