PrepareAnnotationRefseq: Prepare annotation for RefSeq

Description

Prepares the annotation for RefSeq through the UCSC Table Browser.

Usage

PrepareAnnotationRefseq(genome = "hg19", CDSfasta, pepfasta, annotation_path, dbsnp = NULL, transcript_ids = NULL, splice_matrix = FALSE, COSMIC = FALSE, ...)

Arguments

genome

The UCSC genome database identifier (e.g. "hg19").

CDSfasta

Path to the FASTA file of coding sequences.

pepfasta

Path to the FASTA file of protein sequences; see 'introduction' for more detail.

annotation_path

Folder in which to store all the annotations.

dbsnp

The SNP dataset to use for the SNP annotation (e.g. "snp135"). Default is NULL.

transcript_ids

Optionally, retrieve transcript annotation data only for the specified set of transcript IDs. Default is NULL.

splice_matrix

Whether to generate a known exon splice matrix from the annotation. This is not necessary if you do not want to analyse junction results. Default is FALSE.

COSMIC

Whether to download COSMIC data. Default is FALSE.

...

Additional arguments.

Value

Several .RData files, written to annotation_path, containing the annotations needed for further analysis.
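
Because the results are written to disk rather than returned, a later session has to load them from the annotation folder. The following is a minimal sketch of one way to do that, not part of customProDB: load_annotations is a hypothetical helper, and the demo file and object names are stand-ins so that the sketch runs without the package or a network connection.

```r
# Hypothetical helper (not part of customProDB): load every .RData file
# found in an annotation folder into a single environment.
load_annotations <- function(annotation_path) {
  files <- list.files(annotation_path, pattern = "\\.RData$",
                      full.names = TRUE)
  env <- new.env()
  for (f in files) {
    load(f, envir = env)  # each file may restore one or more objects
  }
  env
}

# Demonstration with a stand-in file; the real folder would hold the
# files written by PrepareAnnotationRefseq.
demo_path <- file.path(tempdir(), "annotation_demo")
dir.create(demo_path, showWarnings = FALSE)
demo_table <- data.frame(tx_name = "NM_004448", stringsAsFactors = FALSE)
save(demo_table, file = file.path(demo_path, "demo.RData"))

anno <- load_annotations(demo_path)
ls(anno)  # names of the objects that were restored
```

Loading into a dedicated environment, rather than the global one, keeps the restored objects from silently overwriting variables of the same name.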

Examples

transcript_ids <- c("NM_001126112", "NM_033360", "NR_073499", "NM_004448",
        "NM_000179", "NR_029605", "NM_004333", "NM_001127511")
pepfasta <- system.file("extdata", "refseq_pro_seq.fasta",
            package="customProDB")
CDSfasta <- system.file("extdata", "refseq_coding_seq.fasta",
            package="customProDB")
annotation_path <- tempdir()
PrepareAnnotationRefseq(genome='hg19', CDSfasta, pepfasta, annotation_path,
            dbsnp=NULL, transcript_ids=transcript_ids,
            splice_matrix=FALSE, COSMIC=FALSE)
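
system.file() returns an empty string when the requested file cannot be found, so it can be worth checking the input paths before starting a run. A minimal sketch of such a guard follows; check_inputs is a hypothetical helper, not a customProDB function, and the temporary FASTA file is a stand-in used only to make the sketch runnable on its own.

```r
# Hypothetical helper (not part of customProDB): stop early if any
# named input path is empty or does not exist on disk.
check_inputs <- function(...) {
  paths <- c(...)
  bad <- paths[!nzchar(paths) | !file.exists(paths)]
  if (length(bad) > 0) {
    stop("missing input file(s): ", paste(names(bad), collapse = ", "))
  }
  invisible(TRUE)
}

# Stand-in FASTA file so the sketch runs without the package installed.
tmp_fasta <- tempfile(fileext = ".fasta")
writeLines(c(">demo", "ATGGCC"), tmp_fasta)

check_inputs(CDSfasta = tmp_fasta)  # passes silently

# An empty path, as system.file() gives for a missing file, is rejected.
result <- tryCatch(check_inputs(pepfasta = ""),
                   error = function(e) conditionMessage(e))
result
```

In the example above, the equivalent call would be check_inputs(CDSfasta = CDSfasta, pepfasta = pepfasta) placed just before PrepareAnnotationRefseq.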