msigdbr: MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format

Overview

The msigdbr R package provides Molecular Signatures Database (MSigDB) gene sets typically used with the Gene Set Enrichment Analysis (GSEA) software:

  • in an R-friendly "tidy" format with one gene set and member gene pair per row
  • for multiple frequently studied model organisms, such as mouse, rat, pig, zebrafish, fly, and yeast, in addition to the original human genes
  • as gene symbols as well as NCBI Entrez and Ensembl IDs
  • without accessing external resources requiring an active internet connection

Installation

The package can be installed from CRAN.

install.packages("msigdbr")

Usage

The package data can be accessed using the msigdbr() function, which returns a data frame of gene sets and their member genes. For example, you can retrieve mouse genes from the C2 (curated) CGP (chemical and genetic perturbations) gene sets.

library(msigdbr)
genesets <- msigdbr(species = "mouse", collection = "C2", subcollection = "CGP")
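Many enrichment tools expect a named list of gene vectors rather than a long data frame. The sketch below shows one way to reshape the tidy format with base R. To keep it runnable without the package, it uses a small invented stand-in data frame; the column names gs_name and gene_symbol are assumed to match the msigdbr() output, and the set names and rows are made up for illustration.

```r
# Stand-in for the data frame returned by msigdbr(): invented rows,
# with only the two columns this sketch relies on (assumed names).
genesets <- data.frame(
  gs_name = c("SET_A", "SET_A", "SET_B", "SET_B", "SET_B"),
  gene_symbol = c("Trp53", "Mdm2", "Myc", "Max", "Trp53"),
  stringsAsFactors = FALSE
)

# One row per gene set/gene pair becomes one list element per gene set.
geneset_list <- split(x = genesets$gene_symbol, f = genesets$gs_name)

# Number of member genes in each gene set.
set_sizes <- lengths(geneset_list)
print(set_sizes)
```

With the real package, the same split() call would be applied to the genesets object created above instead of the stand-in.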

Check the documentation website for more information.

Monthly Downloads

17,341

Version

10.0.2

License

MIT + file LICENSE

Maintainer

Igor Dolgalev

Last Published

April 14th, 2025

Functions in msigdbr (10.0.2)

msigdbr_species
List the species available in the msigdbr package

msigdbr
Retrieve the gene sets data frame

msigdbr-package
msigdbr: MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format

msigdbr_check_data
Check that the data package is installed

msigdbr_collections
List the collections available in the msigdbr package
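
The two listing helpers can be used to find valid values for the species, collection, and subcollection arguments before calling msigdbr(). The following is a minimal sketch, assuming both functions can be called without arguments. It is guarded so that it does nothing if the package is missing, and it returns NULL for a call that fails (for example, if the required data is not installed).

```r
# Only run the lookups if msigdbr is available on this machine.
have_msigdbr <- requireNamespace("msigdbr", quietly = TRUE)

if (have_msigdbr) {
  # Each call is wrapped so a failure yields NULL instead of stopping the script.
  species <- tryCatch(msigdbr::msigdbr_species(), error = function(e) NULL)
  collections <- tryCatch(msigdbr::msigdbr_collections(), error = function(e) NULL)

  # Show the first few rows of each table, if it could be retrieved.
  if (!is.null(species)) print(head(species))
  if (!is.null(collections)) print(head(collections))
}
```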