# CB2 v1.1

## CRISPR Pooled Screen Analysis using Beta-Binomial Test

Provides functions for hit gene identification and quantification of sgRNA (single-guide RNA) abundances for CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) pooled screen data analysis. Details are in Jeong et al. (2018) <doi:10.1101/309302> and Baggerly et al. (2003) <doi:10.1093/bioinformatics/btg173>.

## What is CB2

CB2 (CRISPRBetaBinomial) is a new algorithm for analyzing CRISPR data based on the beta-binomial distribution. We provide CB2 as an R package, and the internal algorithms of CB2 are also implemented in CRISPRCloud.

## How to install

CB2 is available on CRAN, and you can install it with the `install.packages` function.

    install.packages("CB2")


To install the GitHub version of CB2, run the following lines of code in your R terminal.

    install.packages("devtools")
    devtools::install_github("hyunhwaj/CB2")


Alternatively, here is a one-line shell command for the installation.

    Rscript -e "install.packages('devtools'); devtools::install_github('LiuzLab/CB2')"


## A simple example of how to use CB2 in R

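    library(CB2)

    # Path to the toy FASTA file shipped with the package.
    FASTA <- system.file("extdata", "toydata",
                         "small_sample.fasta",
                         package = "CB2")

    # Build the design table: one row per sample (two replicates per group).
    df_design <- data.frame()
    for (g in c("Low", "High", "Base")) {
      for (i in 1:2) {
        FASTQ <- system.file("extdata", "toydata",
                             sprintf("%s%d.fastq", g, i),
                             package = "CB2")
        df_design <- rbind(df_design,
                           data.frame(
                             group = g,
                             sample_name = sprintf("%s%d", g, i),
                             fastq_path = FASTQ,
                             stringsAsFactors = FALSE)
        )
      }
    }

    # Quantify sgRNA abundance from the FASTQ files.
    sgrna_count <- run_sgrna_quant(FASTA, df_design)
    # Run the sgRNA-level test for the "Base" and "Low" groups.
    sgrna_stat <- run_estimation(sgrna_count$count, df_design, "Base", "Low")
    # Run the gene-level test using the sgRNA-level statistics.
    gene_stat <- measure_gene_stats(sgrna_stat)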


## Functions in CB2

| Name | Description |
| --- | --- |
| measure_gene_stats | A function to perform a gene-level test using sgRNA-level statistics. |
| plot_PCA | A function to plot the first two principal components of samples. |
| run_estimation | A function to perform a statistical test at the sgRNA level. |
| plot_corr_heatmap | A function to show a heatmap of sgRNA-level correlations of the NGS samples. |
| plot_count_distribution | A function to plot read count distribution. |
| Sanson_CRISPRn_A375 | A benchmark CRISPRn pooled screen data from Sanson et al. |
| Evers_CRISPRn_RT112 | A benchmark CRISPRn pooled screen data from Evers et al. |
| plot_dotplot | A function to visualize dot plots for a gene. |
| quant | A C++ function to quantify sgRNA abundance from NGS samples. |
| run_sgrna_quant | A function to run a sgRNA quantification algorithm from NGS samples. |
| calc_mappability | A function to calculate the mappabilities of each NGS sample. |
| fit_ab | A C++ function to perform a parameter estimation for the sgRNA-level test. It estimates two parameters, phat and vhat, under the assumption that the input count data follow the beta-binomial distribution. Dr. Keith Baggerly initially implemented this code in Matlab, and it has been rewritten in C++ for speed. |
| join_count_and_design | A function to join a count table and a design table. |
| get_CPM | A function to normalize sgRNA read counts. |
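
As a rough illustration of what read-count normalization involves, the sketch below computes counts per million (CPM) in base R, which is what the name `get_CPM` suggests. It does not call `get_CPM` and is not taken from the CB2 source: the toy matrix `counts` and the helper `cpm_sketch` are hypothetical names used only here, and the package's own function may expect a different input format or differ in detail.

```r
# Hypothetical toy count matrix: rows are sgRNAs, columns are samples.
# matrix() fills column by column, so Base1 = (10, 30, 60) and Low1 = (200, 300, 500).
counts <- matrix(
  c(10, 30, 60,
    200, 300, 500),
  nrow = 3,
  dimnames = list(c("sg1", "sg2", "sg3"), c("Base1", "Low1"))
)

# Hypothetical helper (not part of CB2): divide each column by its total
# read count, then scale so that every sample sums to one million.
cpm_sketch <- function(m) {
  sweep(m, 2, colSums(m), "/") * 1e6
}

cpm <- cpm_sketch(counts)
print(cpm)
```

Scaling every sample to the same total makes read counts comparable across samples that were sequenced to different depths.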