locus: Create locus object for plotting

Description

Creates an object of class 'locus' for a genomic locus plot similar to locuszoom.

Usage

locus(
  data = NULL,
  gene = NULL,
  xrange = NULL,
  seqname = NULL,
  flank = NULL,
  fix_window = NULL,
  ens_db,
  chrom = NULL,
  pos = NULL,
  p = NULL,
  yvar = NULL,
  labs = NULL,
  index_snp = NULL,
  LD = NULL,
  std_filter = TRUE
)

Value

Returns a list object of class 'locus' ready for plotting, containing:

seqname: chromosome value
xrange: vector of genomic position range
gene: gene name
ens_db: Ensembl or AnnotationHub database
ens_version: Ensembl database version
organism: Ensembl database organism
genome: Ensembl data genome build
chrom: column name in data containing chromosome information
pos: column name in data containing position
p: column name in data containing p-value
yvar: column name in data to be plotted on y axis as alternative to p
labs: column name in data containing SNP IDs
index_snp: id of the most significant SNP
data: the subset of GWAS data to be plotted
TX: dataframe of transcript annotations
EX: GRanges object of exon annotations

If data is NULL when locus() is called then gene track information alone is returned.
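
As a minimal sketch of the gene-track-only case, the call below omits data and then inspects a few of the components listed above. It assumes the locuszoomr and EnsDb.Hsapiens.v75 packages are installed; the object name loc_genes is purely illustrative.

```r
## Sketch only: assumes locuszoomr and EnsDb.Hsapiens.v75 are installed
if (require(locuszoomr) && require(EnsDb.Hsapiens.v75)) {
  ## data is left as NULL, so only gene annotation for the region is returned
  loc_genes <- locus(gene = "UBE2L3", flank = 1e5,
                     ens_db = "EnsDb.Hsapiens.v75")
  print(class(loc_genes))    # should include "locus"
  print(loc_genes$seqname)   # chromosome of the locus
  print(loc_genes$xrange)    # genomic position range
  print(head(loc_genes$TX))  # transcript annotations
}
```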

Arguments

data: Dataset (data.frame or data.table) to use for plot. We recommend that tibbles are converted to a normal data.frame. If unspecified or NULL, gene track information alone is returned.
gene: Optional character value specifying which gene to view. Either gene, or xrange plus seqname, or index_snp must be specified.
xrange: Optional vector of genomic position range for the x axis.
seqname: Optional, specifies which chromosome to plot.
flank: Single value or vector with 2 values for how much flanking region left and right of the gene to show. Defaults to 100 kb.
fix_window: Optional alternative to flank, which allows users to specify a fixed genomic window centred on the specified gene. flank and fix_window cannot both be specified.
ens_db: Either a character string specifying which Ensembl database package (version 86 and earlier for Homo sapiens) to query for gene and exon positions (see the ensembldb Bioconductor package), or an ensembldb object, which can be obtained from the AnnotationHub database. See the vignette and the AnnotationHub Bioconductor package for how to create this object.
chrom: Determines which column in data contains chromosome information. If NULL, the function tries to autodetect the column.
pos: Determines which column in data contains position information. If NULL, the function tries to autodetect the column.
p: Determines which column in data contains SNP p-values. If NULL, the function tries to autodetect the column.
yvar: Specifies a column in data to plot on the y axis as an alternative to p-values. p and yvar cannot both be specified.
labs: Determines which column in data contains SNP rs IDs. If NULL, the function tries to autodetect the column.
index_snp: Specifies the index SNP. If not specified, the SNP with the lowest p-value is selected. Can also be used to define the locus region instead of specifying gene, or seqname and xrange.
LD: Optional character value to specify which column in data contains LD information.
std_filter: Logical, whether standard filters on chromosomes 1-22, X & Y, and filtering of genes to only those whose transcript ids start with "ENS" are applied. For users with novel genome assemblies, this probably needs to be set to FALSE.
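
The arguments above allow the region to be defined in several ways. The following is a hedged sketch of four of them, using the SLE_gwas_sub example data and the EnsDb.Hsapiens.v75 package; the object names (loc_a to loc_d) and the particular flank and window sizes are illustrative choices, not recommendations from the package authors.

```r
## Sketch only: assumes locuszoomr and EnsDb.Hsapiens.v75 are installed
if (require(locuszoomr) && require(EnsDb.Hsapiens.v75)) {
  data(SLE_gwas_sub)

  ## 1. Gene plus asymmetric flanks (left, right)
  loc_a <- locus(SLE_gwas_sub, gene = "UBE2L3", flank = c(5e4, 2e5),
                 ens_db = "EnsDb.Hsapiens.v75")

  ## 2. Gene plus a fixed window centred on the gene (flank must be left unset)
  loc_b <- locus(SLE_gwas_sub, gene = "UBE2L3", fix_window = 1e6,
                 ens_db = "EnsDb.Hsapiens.v75")

  ## 3. Chromosome plus explicit range, reusing values stored in loc_a
  loc_c <- locus(SLE_gwas_sub, seqname = loc_a$seqname, xrange = loc_a$xrange,
                 ens_db = "EnsDb.Hsapiens.v75")

  ## 4. Region defined around an index SNP (here the top SNP found in loc_a)
  loc_d <- locus(SLE_gwas_sub, index_snp = loc_a$index_snp, flank = 1e5,
                 ens_db = "EnsDb.Hsapiens.v75")
}
```

Only one way of defining the region is used per call; gene is omitted in the last two calls so that seqname and xrange, or index_snp, determine the region.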

Details

This is an R version of locuszoom (http://locuszoom.org) for generating publication-ready Manhattan plots of gene loci. It uses the ensembldb Bioconductor package framework to query Ensembl databases for annotating genes and exons in the locus.
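
For Ensembl versions not available as installable packages, an ensembldb object from AnnotationHub can be passed as ens_db. The sketch below assumes the AnnotationHub package is installed and that a network connection is available. Taking the last query hit as the most recent release is an assumption made here for illustration; in practice, pick the record whose genome build matches your GWAS data. No data is supplied, so only gene tracks are returned.

```r
## Sketch only: assumes locuszoomr and AnnotationHub are installed
## and that the hub can be reached over the network
if (require(locuszoomr) && require(AnnotationHub)) {
  ah <- AnnotationHub()
  hits <- query(ah, c("EnsDb", "Homo sapiens"))
  print(hits)                      # inspect the available records first
  edb <- hits[[length(hits)]]      # assumption: last hit is the newest release
  loc_ah <- locus(gene = "UBE2L3", flank = 1e5, ens_db = edb)
  print(loc_ah$ens_version)        # Ensembl version of the chosen database
  print(loc_ah$genome)             # genome build of the chosen database
}
```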

Examples

## Bioconductor package EnsDb.Hsapiens.v75 is needed for these examples
if (require(EnsDb.Hsapiens.v75)) {
  data(SLE_gwas_sub)
  ## Locus around UBE2L3 with 100 kb flanking regions
  loc <- locus(SLE_gwas_sub, gene = 'UBE2L3', flank = 1e5,
               ens_db = "EnsDb.Hsapiens.v75")
  summary(loc)
  locus_plot(loc)
  ## Second locus around STAT4
  loc2 <- locus(SLE_gwas_sub, gene = 'STAT4', flank = 1e5,
                ens_db = "EnsDb.Hsapiens.v75")
  locus_plot(loc2)
}