SEAA(Splicing Efficiency Analysis and Annotation)

SEAA provides convenient and rapid splicing efficiency calculation and splicing sites annotation function using next generation sequencing data. Aligned .bam files and a processed splicing sites .saf file are needed. The task can be finished in several minutes multi-core computing with the assitance of 'Rsubread'. Plots of splicing status type and Cumulative Distribution Function (CDF) and annotated vaild splicing efficiency can be exported.

Installation

You can install the released version of SEAA from github with:

install.packages("devtools")
library(devtools)
install_github("PrinceWang2018/SEAA")

You can also install the released version of SEAA from CRAN with:

install.packages("SEAA")

Data preparation

Aligning your sequencing files with hisat2 is strongly recommended. Genome_trans index of hisat2 can be downloaded from http://daehwankimlab.github.io/hisat2/download/. It should be noticed that ensembl version reference files with chromosome number "1" instead of "chr1" were used. Or you will need to edit the saf file manually.

hisat2 -x /indexpath/hisat2/grch37_tran/genome_tran -1 /fastqpath/NC_1.fastq -2 /fastqpath/NC_2.fastq --min-intronlen 20 --max-intronlen 10000 --threads 12 --rna-strandness F | samtools sort -o /outputpath/NC.bam -

We have already prepared the saf files consisting the splicing sites of human (hg19) and mouse (mm10) which can be downloaded from our github repository https://github.com/PrinceWang2018/SEAA_reference.

Example

This is a basic workflow which shows you how to use this software:

library("SEAA")
#set you work path
setwd("/home/username/workpath/")

Step1: Caculating splicing efficiency.

#Please locate your aligned bam files.
BAM_files <- c("./NC_1.bam","./shUSP39_1.bam")
#Please locate your .saf annotation files.
Anno_SAF<- "/home/wzx/project3tB/SEAA_project/reference/hg19/hg19_NCBI_splicing_sites_20210705.saf"
SEresultlist<-SEcalculation(BAM_files,Anno_SAF,paired = TRUE ,thread = 8,strand = 1)

Step2: Caculating splicing efficiency with featureCounts.

SEtyperesult<-SEtypeplot(SEresultlist,"horizontal")

Step3: Filtering invaild splicing efficiency and reads counts less than min_counts.

efficiency_5ss_3ss_nona_inf_reduct<-SEfilter(SEresultlist,min_counts = 5)

Step4: Cumulative Distribution Function plotting based on filtered splicing efficiency.

CDFplot(SEresultlist,efficiency_5ss_3ss_nona_inf_reduct,zoom.x= c(5,6))

Step5: Annotation of Splicing Sites Acquiring filtered Splicing Efficiency.

SEannotaionresult<-SEsiteanno(SEresultlist, efficiency_5ss_3ss_nona_inf_reduct, species = "hs")

Step6: Labeling the target gene in filtered splicing efficiency list.

targetlabeling(SEresultlist,target_site = "27830321",target_label = "RPL21",xlim.max = 1000, ylim.max = 1000)

Citation

Wang et al., (2021). SEAA: Splicing Efficiency Analysis and Annotation (to be published)

Contact

If you have any question, please contact Zixiang Wang at wangzixiang@mail.sdu.edu.cn or wangzixiang@live.com.

SEAA(Splicing Efficiency Analysis and Annotation)

Installation

Data preparation

Example

Citation

Contact

Copy Link

Version

Install

Monthly Downloads

Version

License

Maintainer

Last Published

Functions in SEAA (0.9.6)