customProDB (version 1.12.0)

PrepareAnnotationRefseq: prepare annotation for Refseq

Description

Prepares the RefSeq annotation files by retrieving data through the UCSC Table Browser.

Usage

PrepareAnnotationRefseq(genome = "hg19", CDSfasta, pepfasta, annotation_path, dbsnp = NULL, transcript_ids = NULL, splice_matrix = FALSE, COSMIC = FALSE, ...)

Arguments

genome
specify the UCSC DB identifier (e.g. "hg19")
CDSfasta
path to the fasta file of coding sequence.
pepfasta
path to the fasta file of protein sequence; see the package 'introduction' for more detail.
annotation_path
specify a folder to store all the annotations.
dbsnp
specify a SNP dataset to be used for the SNP annotation (e.g. "snp135"). Default is NULL.
transcript_ids
optionally, only retrieve transcript annotation data for the specified set of transcript ids. Default is NULL.
splice_matrix
whether to generate a known exon splice matrix from the annotation. This is only needed if you want to analyse junction results. Default is FALSE.
COSMIC
whether to download COSMIC data, default is FALSE.
...
additional arguments

Value

Several .RData files, written to annotation_path, containing the annotations needed for further analysis.

Examples

transcript_ids <- c("NM_001126112", "NM_033360", "NR_073499", "NM_004448",
        "NM_000179", "NR_029605", "NM_004333", "NM_001127511")
pepfasta <- system.file("extdata", "refseq_pro_seq.fasta",
            package="customProDB")
CDSfasta <- system.file("extdata", "refseq_coding_seq.fasta",
            package="customProDB")
annotation_path <- tempdir()
PrepareAnnotationRefseq(genome='hg19', CDSfasta, pepfasta, annotation_path,
            dbsnp=NULL, transcript_ids=transcript_ids,
            splice_matrix=FALSE, COSMIC=FALSE)
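
Since the function writes its results as .RData files into annotation_path, it can be useful to check what was produced. The following is a minimal sketch using base R only; load_annotation_files() is a hypothetical helper, not part of customProDB, and it makes no assumption about the names of the generated files.

```r
# Hypothetical helper (not part of customProDB): load every .RData file
# found in a directory into a fresh environment and return that environment.
load_annotation_files <- function(dir) {
  files <- list.files(dir, pattern = "\\.RData$", full.names = TRUE)
  env <- new.env()
  for (f in files) {
    load(f, envir = env)  # restores the saved objects into env
  }
  env
}

# tempdir() is the same folder as annotation_path in the example above
# when both are run in the same R session.
annotation_path <- tempdir()
anno <- load_annotation_files(annotation_path)
ls(anno)  # names of the objects restored from the .RData files
```

Loading into a separate environment keeps the restored objects from overwriting variables of the same name in the global workspace.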