customProDB (version 1.12.0)

easyRun_mul: An integrated function to generate a consensus protein database from multiple samples

Description

Generate a consensus protein database for multiple samples with a single function call.

Usage

easyRun_mul(bamFile_path, RPKM_mtx = NULL, vcfFile_path, annotation_path,
            rpkm_cutoff, share_num = 2, var_shar_num = 2, outfile_path,
            outfile_name, INDEL = FALSE, lablersid = FALSE, COSMIC = FALSE,
            nov_junction = FALSE, bedFile_path = NULL, genome = NULL,
            junc_shar_num = 2, ...)

Arguments

bamFile_path
The path of the BAM files.
RPKM_mtx
An alternative to bamFile_path: a matrix of expression levels for the proteins in each sample (e.g. FPKMs from Cufflinks). Default is NULL.
vcfFile_path
The path of the VCF files.
annotation_path
The path of the previously saved annotation files that the function will use.
rpkm_cutoff
Cutoff for RPKM values. See 'cutoff' in the function OutputsharedPro for more information.
share_num
The minimum number of samples in which a protein must pass the cutoff.
var_shar_num
The minimum number of samples in which a variation must recur.
outfile_path
The path of the output FASTA files.
outfile_name
The name prefix of the output FASTA files.
INDEL
Whether the VCF files contain short insertions/deletions. Default is FALSE.
lablersid
Whether to include the dbSNP rsid in the header of each sequence. Default is FALSE. If TRUE, users should provide dbSNP information when running the function Positionincoding().
COSMIC
Whether to output the COSMIC IDs in the variation table. Default is FALSE. If TRUE, cosmic.RData must be present in the annotation folder.
nov_junction
Whether to add peptides that cover novel junctions to the database. Default is FALSE. If TRUE, splicemax.RData must be present in the annotation folder.
bedFile_path
The path of the BED files that contain the splice junctions identified by RNA-Seq.
genome
A BSgenome object (e.g. Hsapiens). Default is NULL. Required if nov_junction is TRUE.
junc_shar_num
The minimum number of samples in which a splice junction must recur.
...
Additional arguments
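
The interplay of RPKM_mtx, rpkm_cutoff and share_num can be illustrated with a small base R sketch. It is not the package's implementation: the matrix values, the protein and sample names, and the reading of "pass the cutoff" as "greater than or equal to" are all assumptions made for illustration. See OutputsharedPro for the actual rules.

```r
# Hypothetical RPKM matrix: one row per protein, one column per sample.
# All names and values below are made up for illustration.
rpkm <- matrix(c(5.0, 0.2, 3.1,
                 0.4, 0.1, 2.0,
                 1.5, 1.2, 0.0),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("NP_A", "NP_B", "NP_C"),
                               c("s1", "s2", "s3")))

rpkm_cutoff <- 1  # expression threshold
share_num   <- 2  # minimum number of samples that must pass the threshold

# Count, for each protein, the samples whose RPKM reaches the cutoff
# (assumption: "pass" means >=).
n_pass <- rowSums(rpkm >= rpkm_cutoff)

# Keep the proteins that pass in at least share_num samples.
kept <- names(n_pass)[n_pass >= share_num]
print(kept)
```

With these made-up values, NP_B reaches the cutoff in only one sample and would be dropped, while the other two proteins would be retained.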

Value

A table file containing detailed variation information, and several FASTA files.

Details

This function gives proteomics researchers a more convenient way to generate a customized database from multiple samples.

Examples

# Locate the example BAM, VCF and annotation folders shipped with customProDB
bampath <- system.file("extdata/bams", package="customProDB")
vcfFile_path <- system.file("extdata/vcfs", package="customProDB")
annotation_path <- system.file("extdata/refseq", package="customProDB")

# Write the output to a temporary directory, using 'mult' as the file name prefix
outfile_path <- tempdir()
outfile_name <- 'mult'

# COSMIC=TRUE requires cosmic.RData in the annotation folder
easyRun_mul(bampath, RPKM_mtx=NULL, vcfFile_path, annotation_path, rpkm_cutoff=1,
            share_num=2, var_shar_num=2, outfile_path, outfile_name, INDEL=TRUE,
            lablersid=TRUE, COSMIC=TRUE, nov_junction=FALSE)