rtracklayer (version 1.32.1)

readGFF: Reads a file in GFF format

Description

Reads a file in GFF format and creates a data frame or DataFrame object from it.

Usage

readGFF(filepath, version=0, columns=NULL, tags=NULL, filter=NULL, nrows=-1, raw_data=FALSE)
GFFcolnames(GFF1=FALSE)

Arguments

filepath
A single string containing the path or URL to the file to read. Alternatively, it can be a connection.
version
readGFF should do a pretty decent job of detecting the GFF version. Use this argument only if it doesn't, or if you want to force it to parse and import the file as if its 9th column were in a different format than it really is (e.g. specify version=1 on a GTF or GFF3 file to interpret its 9th column as the "group" column of a GFF1 file). Supported versions are 1, 2, and 3.
columns
The standard GFF columns to load. All of them are loaded by default.
tags
The tags to load. All of them are loaded by default.
filter
nrows
-1 (the default, meaning no limit) or the maximum number of rows to read in (after filtering).
raw_data
GFF1

Value

Details

See Also

Examples

## Standard GFF columns.
GFFcolnames()
GFFcolnames(GFF1=TRUE)  # "group" instead of "attributes"

tests_dir <- system.file("tests", package="rtracklayer")
test_gff3 <- file.path(tests_dir, "genes.gff3")

## Load everything.
df0 <- readGFF(test_gff3)
head(df0)

## Load some tags only (in addition to the standard GFF columns).
my_tags <- c("ID", "Parent", "Name", "Dbxref", "geneID")
df1 <- readGFF(test_gff3, tags=my_tags)
head(df1)

## Load no tags (in that case, the "attributes" standard column
## is loaded).
df2 <- readGFF(test_gff3, tags=character(0))
head(df2)

## Load some standard GFF columns only (in addition to all tags).
my_columns <- c("seqid", "start", "end", "strand", "type")
df3 <- readGFF(test_gff3, columns=my_columns)
df3
table(df3$seqid, df3$type)
makeGRangesFromDataFrame(df3, keep.extra.columns=TRUE)

## Combine use of 'columns' and 'tags' arguments.
readGFF(test_gff3, columns=my_columns, tags=c("ID", "Parent", "Name"))
readGFF(test_gff3, columns=my_columns, tags=character(0))

## Use the 'filter' argument to load only features of type "gene"
## or "mRNA" located on chr10.
my_filter <- list(type=c("gene", "mRNA"), seqid="chr10")
readGFF(test_gff3, filter=my_filter)
readGFF(test_gff3, columns=my_columns, tags=character(0), filter=my_filter)
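
## Use the 'nrows' and 'version' arguments. This is a minimal sketch,
## not taken from the package examples: 'gff3_path', 'df_capped', and
## 'df_v1' are illustrative names, and the comments describe what the
## argument descriptions above suggest rather than verified output.
library(rtracklayer)
gff3_path <- system.file("tests", "genes.gff3", package="rtracklayer")

## Keep only "gene" features and read at most 5 of them. Per the
## 'nrows' description, the cap applies after filtering.
df_capped <- readGFF(gff3_path, filter=list(type="gene"), nrows=5)
nrow(df_capped)

## Force the file to be parsed as GFF1. Per the 'version' description,
## the 9th column should then be treated as the "group" column.
df_v1 <- readGFF(gff3_path, version=1)
colnames(df_v1)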
