Usage
runseq2gene(inputfile, search_radius=150000, promoter_radius=200, promoter_radius2=100, genome=c("hg38","hg19","mm10","mm9"), adjacent=FALSE, SNP=FALSE, PromoterStop=FALSE, NearestTwoDirection=TRUE, UTR3=FALSE)
Arguments
inputfile
An R object that records genomic region information (coordinates).
It can be a data frame with the following columns:
- column 1
the unique IDs of genomic regions of interest (peaks, mutations, or SNPs)
- column 2
the chromosome IDs (e.g., chr5 or 5)
- column 3
the start of genomic regions
- column 4
the end of genomic regions (for SNPs and point mutations,
the difference between start and end is 1 bp)
- column 5...
other user-defined information (optional)
Alternatively, the input can be a RangedData object (from the R package IRanges) with value columns:
- column 1: space
the chromosome IDs (e.g., chr5 or 5)
- column 2: ranges
the ranges of genomic regions
- column 3: name
the unique IDs of genomic regions of interest (peaks, mutations, or SNPs)
- more columns:
other user-defined information (optional)
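As an illustration of the data frame layout described above, the following base R sketch builds a small input object. The column names, region IDs, and coordinates are made up for the example; only the column order (ID, chromosome, start, end, then optional extras) comes from the description.

```r
# Toy input; IDs, coordinates, and column names are invented for illustration.
peaks <- data.frame(
  id    = c("peak1", "peak2", "snp1"),       # column 1: unique region IDs
  chrom = c("chr5", "chr5", "chr12"),        # column 2: chromosome IDs
  start = c(1200000L, 1350500L, 6534512L),   # column 3: region start
  end   = c(1200450L, 1351000L, 6534513L),   # column 4: region end (start + 1 for the SNP)
  score = c(12.5, 8.1, NA),                  # column 5: optional extra information
  stringsAsFactors = FALSE
)

# Basic sanity checks matching the column description.
stopifnot(!anyDuplicated(peaks$id))
stopifnot(all(peaks$end > peaks$start))
print(peaks)
```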
search_radius
A non-negative integer. For some genomic region types, input regions are
assigned not only to the matched or nearest gene but also to all genes
within this search radius.
This parameter takes effect only when the parameter "SNP" is FALSE. Default is 150000.
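The idea of a search radius can be sketched in base R as follows. This is not the function's internal code: the gene table, the helper genes_within_radius, and the choice to measure distance to the TSS on a single chromosome are all assumptions made for illustration.

```r
# Hypothetical gene table on one chromosome; names and coordinates are invented.
genes <- data.frame(
  gene = c("geneA", "geneB", "geneC"),
  tss  = c(1000000L, 1100000L, 1400000L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: genes whose TSS lies within `radius` of the region [start, end].
# Measuring to the TSS is an assumption of this sketch.
genes_within_radius <- function(start, end, genes, radius = 150000L) {
  # Distance is 0 when the TSS falls inside the region.
  d <- pmax(start - genes$tss, genes$tss - end, 0L)
  genes$distance <- d
  genes[d <= radius, , drop = FALSE]
}

hits <- genes_within_radius(1200000L, 1200450L, genes)
print(hits)
```

With the default radius of 150000, only geneB (100000 bp away) is retained here; geneA and geneC lie roughly 200000 bp from the region.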
promoter_radius
A non-negative integer. Default is 200. Here the promoter is defined
as the region upstream of the transcription start site (TSS).
Users can set the promoter radius; suggested values are between 200 and 2000.
promoter_radius2
A non-negative integer. Default is 100.
Here the promoter is defined as the region downstream of the transcription start site (TSS).
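One way to picture how the two radii combine is the sketch below, which computes a promoter interval around a TSS. The helper promoter_window is hypothetical, and the strand handling (upstream means lower coordinates on the "+" strand and higher coordinates on the "-" strand) is an assumption rather than a documented behavior of runseq2gene.

```r
# Hypothetical helper: promoter interval around a TSS.
# Assumes upstream = lower coordinates on "+" and higher coordinates on "-".
promoter_window <- function(tss, strand,
                            promoter_radius = 200L, promoter_radius2 = 100L) {
  plus <- strand == "+"
  data.frame(
    start = ifelse(plus, tss - promoter_radius,  tss - promoter_radius2),
    end   = ifelse(plus, tss + promoter_radius2, tss + promoter_radius)
  )
}

# Two toy TSS positions, one per strand.
w <- promoter_window(c(10000L, 50000L), c("+", "-"))
print(w)
```

With the defaults, the "+" strand TSS at 10000 gives the interval 9800 to 10100, and the "-" strand TSS at 50000 gives 49900 to 50200.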
genome
A character string specifying the genome build. Currently,
"hg38", "hg19", "mm10", and "mm9" are supported.
adjacent
A Boolean. Default is FALSE, which searches all genes within the search_radius.
Set to TRUE to find only the adjacent genes, ignoring the parameters "SNP" and "search_radius".
SNP
A Boolean specifying the input object type. FALSE by default, so that the search
continues for introns and neighboring genes. If TRUE, runseq2gene stops searching
when the input genomic region lies on an exon of a coding gene.
PromoterStop
A Boolean. FALSE by default, so that the search for neighboring genes continues using the parameter "search_radius".
If TRUE, runseq2gene stops searching for neighboring genes. This parameter takes effect
only if an input genomic region maps to the promoter of one or more coding genes.
NearestTwoDirection
A Boolean. TRUE by default, which outputs the closest left and closest right coding genes with directions.
If FALSE, only the nearest coding gene is output, regardless of direction.
UTR3
A Boolean. FALSE by default, so that distances are calculated from the genes' 5' UTR. If TRUE, distances are calculated from the genes' 3' UTR.
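A call using the arguments documented above might look like the following sketch. The input data are invented, and the package name seq2pathway is an assumption about where runseq2gene is installed; the call is guarded so that the example still runs when the package or its annotation data are unavailable.

```r
# Toy input in the data frame layout (IDs and coordinates are made up).
peaks <- data.frame(
  id    = c("peak1", "snp1"),
  chrom = c("chr5", "chr12"),
  start = c(1200000L, 6534512L),
  end   = c(1200450L, 6534513L),
  stringsAsFactors = FALSE
)

# Arguments spelled out explicitly; values other than genome are the documented defaults.
args <- list(
  inputfile = peaks,
  search_radius = 150000,
  promoter_radius = 200,
  promoter_radius2 = 100,
  genome = "hg19",
  adjacent = FALSE,
  SNP = FALSE,
  PromoterStop = FALSE,
  NearestTwoDirection = TRUE,
  UTR3 = FALSE
)

# Assumption: runseq2gene is provided by an installed package, taken here to be seq2pathway.
res <- NULL
if (requireNamespace("seq2pathway", quietly = TRUE)) {
  res <- tryCatch(
    do.call(seq2pathway::runseq2gene, args),
    error = function(e) {
      message("runseq2gene failed: ", conditionMessage(e))
      NULL
    }
  )
}
```

Passing genome as a single string selects one build explicitly instead of relying on the first element of the default vector.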