caRpools (version 0.83)

stats.data: Calculating data set statistics

Description

General statistics for a given dataset can be obtained by `stats.data`.

Usage

stats.data(dataset, namecolumn = 1, fullmatchcolumn = 2, extractpattern=expression("^(.+?)_.+"), readcount.unmapped.total = NA, controls.target = NULL, controls.nontarget = "random", type="stats")

Arguments

dataset
Data frame holding the read counts. *Default* none *Values* data frame as created by `load.file()`
namecolumn
In which column are the sgRNA identifiers? *Default* 1 *Values* column number (numeric)
fullmatchcolumn
In which column are the read counts? *Default* 2 *Values* column number (numeric)
extractpattern
PERL regular expression used to retrieve the gene identifier from the full sgRNA identifier, e.g. from **AAK1_107_0** it extracts **AAK1**, the gene identifier belonging to this sgRNA identifier. **Please see: Read-Count Data Files** *Default* expression("^(.+?)_.+"), which will work for most available libraries. *Values* PERL regular expression with parentheses indicating the gene identifier (expression)
readcount.unmapped.total
Number of raw NGS reads, only used if `type="mapping"`. *Default* NA *Values* Number of raw reads (integer)
controls.target
If `type="controls"`, this is the gene identifier of the positive control. *Default* NULL *Values* Gene identifier (character)
controls.nontarget
If `type="controls"`, this is the gene identifier of the non-targeting control. *Default* "random" *Values* Gene identifier (character)
type
Which type of statistic will be generated. *Default* "stats" *Values* "stats" will generate short statistics like median and mean for the data set, "mapping" will generate an overview of how many reads are present, "dataset" is used to generate in-depth statistics for each gene of a dataset, "controls" is used for in-depth statistics of the controls.
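
The effect of `extractpattern` can be illustrated outside of caRpools. The following is a minimal sketch in base R, not the package's internal code: it applies the default pattern with `sub()` to a few made-up sgRNA identifiers (the vector `sgrna_ids` is hypothetical) and keeps only the first capture group.

```r
# Hypothetical sgRNA identifiers in the <gene>_<suffix> layout described above.
sgrna_ids <- c("AAK1_107_0", "AAK1_108_1", "random_12_0")

# Same pattern string as the stats.data() default.
pattern <- "^(.+?)_.+"

# sub() with perl = TRUE uses PERL-style matching; "\\1" keeps capture group 1,
# i.e. everything before the first underscore.
gene_ids <- sub(pattern, "\\1", sgrna_ids, perl = TRUE)

print(gene_ids)
```

Because the quantifier in the first group is lazy (`.+?`), the match stops at the first underscore, so identifiers whose suffix contains further underscores still yield only the gene part. Libraries whose gene identifiers themselves contain underscores would presumably need a different pattern.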

Value

Returns a table whose content depends on `type`.
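
To make the difference between a data-set-level summary and a per-gene table more concrete, here is a rough base-R sketch on a made-up read-count table. It is not caRpools code, the data frame `toy` and all column names are hypothetical, and the columns actually returned by `stats.data` may differ.

```r
# Hypothetical read-count table: column 1 = sgRNA identifier, column 2 = read count.
toy <- data.frame(
  sgRNA = c("AAK1_107_0", "AAK1_108_1", "AATK_1_0", "AATK_2_1", "random_1_0"),
  count = c(120, 80, 0, 40, 60),
  stringsAsFactors = FALSE
)

namecolumn <- 1       # column with sgRNA identifiers
fullmatchcolumn <- 2  # column with read counts

# Derive the gene identifier from each sgRNA identifier.
toy$gene <- sub("^(.+?)_.+", "\\1", toy[[namecolumn]], perl = TRUE)

# Data-set-level summary: one value per statistic.
overall <- c(mean = mean(toy[[fullmatchcolumn]]),
             median = median(toy[[fullmatchcolumn]]))

# Per-gene summary: one row per gene identifier.
per_gene <- aggregate(toy[[fullmatchcolumn]],
                      by = list(gene = toy$gene), FUN = mean)
names(per_gene)[2] <- "mean_count"

print(overall)
print(per_gene)
```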

Details

none

Examples

# Load the package and its example data (provides CONTROL1)
library(caRpools)
data(caRpools)

# Short summary statistics for the data set
U1.stats = stats.data(dataset=CONTROL1, namecolumn = 1, fullmatchcolumn = 2,
                      extractpattern=expression("^(.+?)_.+"), type="stats")

# Overview of mapped reads; requires the number of raw NGS reads
knitr::kable(stats.data(dataset=CONTROL1, namecolumn = 1, fullmatchcolumn = 2,
  extractpattern=expression("^(.+?)_.+"), readcount.unmapped.total = 1786217, type="mapping"))

# Summary statistics rendered as a table
knitr::kable(stats.data(dataset=CONTROL1, namecolumn = 1, fullmatchcolumn = 2,
  extractpattern=expression("^(.+?)_.+"), readcount.unmapped.total = 1786217,
  type="stats"))

# In-depth per-gene statistics; show the first 10 rows and 5 columns
knitr::kable(stats.data(dataset=CONTROL1, namecolumn = 1, fullmatchcolumn = 2,
  extractpattern=expression("^(.+?)_.+"), readcount.unmapped.total = 1786217,
  type="dataset")[1:10,1:5])