biomartr
Biological Data Retrieval with R
The biomartr package is designed for life scientists and provides a powerful sequence retrieval and functional annotation framework for the R programming language that aims to facilitate reproducibility and large-scale handling of biological data.
In detail, biomartr aims to provide users with an easy-to-use framework to obtain genome, proteome, and CDS data, as well as an interface to the BioMart database to retrieve functional annotation for genomic loci.
The package is therefore designed to support reproducible research across the whole workflow, from data retrieval to data visualization.
Additionally, the biomartr package allows users to retrieve entire NCBI databases using only one command (see Database Retrieval Vignette).
Tutorials
Getting Started with biomartr:
- Introduction
- NCBI Database Retrieval
- Sequence and Database Retrieval
- Functional Annotation
- Evolutionary Transcriptomics using myTAI, orthologr, and biomartr
Installation
Before users can download and install biomartr they need to install the following packages from Bioconductor:
# install Bioconductor base packages
source("http://bioconductor.org/biocLite.R")
biocLite()
# install the biomaRt package
biocLite("biomaRt")
# install the Biostrings package
biocLite("Biostrings")
During the installation of Biostrings and biomaRt, users might be asked whether they would like to update all package dependencies of the corresponding packages.
Please type a to specify that all package dependencies shall be updated. This is important for biomartr to function properly.
Now users can download biomartr from CRAN:
# install biomartr 0.0.3 from CRAN
install.packages("biomartr",
                 repos = "https://cran.rstudio.com/",
                 dependencies = TRUE)
NEWS
The current status of the package as well as a detailed history of the functionality of each version of biomartr can be found in the NEWS section.
Download Developer Version
The developer version of biomartr might include more functionality than the stable version on CRAN.
On Unix Based Systems
Now you can use the devtools package to install biomartr from GitHub.
# install.packages("devtools")
# install the current version of biomartr on your system
library(devtools)
install_github("HajkD/biomartr", build_vignettes = TRUE, dependencies = TRUE)
On Windows Systems
# On Windows, the following call won't work on its own - see ?build_github_devtools
install_github("HajkD/biomartr", build_vignettes = TRUE, dependencies = TRUE)
# When working with Windows, first you need to install Rtools.
# Note: Rtools is a separate toolchain installer distributed via CRAN,
# not an R package, so install.packages() will not install it.
# Afterwards you can install devtools -> install.packages("devtools")
# and then you can run:
devtools::install_github("HajkD/biomartr", build_vignettes = TRUE, dependencies = TRUE)
# and then call it from the library
library("biomartr", lib.loc = "C:/Program Files/R/R-3.1.1/library")
Troubleshooting on Windows Machines
- Install biomartr on a Win 8 laptop: solution (thanks to Andres Romanowski)
BioMart Queries
- biomart(): Main function to query the BioMart database
- getMarts(): Retrieve All Available BioMart Databases
- getDatasets(): Retrieve All Available Datasets for a BioMart Database
- getAttributes(): Retrieve All Available Attributes for a Specific Dataset
- getFilters(): Retrieve All Available Filters for a Specific Dataset
- organismBM(): Function for organism specific retrieval of available BioMart marts and datasets
- organismAttributes(): Function for organism specific retrieval of available BioMart attributes
- organismFilters(): Function for organism specific retrieval of available BioMart filters
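As a rough illustration of how these functions fit together, the sketch below wraps a typical discover-then-query workflow in two helper functions. The helper names (explore_mart, lookup_gene_positions), the mart and dataset names, and the example gene ID are assumptions chosen for illustration, not part of biomartr. Mart and dataset names change between BioMart releases, so confirm them against the output of getMarts() and getDatasets() before querying.

```r
# Hypothetical defaults; confirm them against getMarts() / getDatasets().
mart_name    <- "ENSEMBL_MART_ENSEMBL"
dataset_name <- "hsapiens_gene_ensembl"

# Collect what a mart offers before building a query
# (requires biomartr and an internet connection when called).
explore_mart <- function(mart = mart_name, dataset = dataset_name) {
  list(
    marts      = biomartr::getMarts(),
    datasets   = biomartr::getDatasets(mart = mart),
    attributes = biomartr::getAttributes(mart = mart, dataset = dataset),
    filters    = biomartr::getFilters(mart = mart, dataset = dataset)
  )
}

# Query start and end positions for a set of gene IDs.
# The attribute and filter names are assumptions; pick valid ones
# from the getAttributes() / getFilters() output.
lookup_gene_positions <- function(genes, mart = mart_name, dataset = dataset_name) {
  biomartr::biomart(
    genes      = genes,
    mart       = mart,
    dataset    = dataset,
    attributes = c("start_position", "end_position"),
    filters    = "ensembl_gene_id"
  )
}

# Example call (illustrative gene ID, needs network access):
# positions <- lookup_gene_positions("ENSG00000139618")
```

Keeping the mart and dataset names in variables makes it easy to swap them once the discovery step shows which names are currently valid.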
Biological Data Retrieval
Genome Retrieval
- getGenome(): Download a specific genome stored on the NCBI ftp:// server
- listGenomes(): List all genomes available on the NCBI ftp:// server
- is.genome.available(): Check Genome Availability
- getProteome(): Download a specific proteome stored on the NCBI ftp:// server
- getCDS(): Download a specific CDS file (genome) stored on the NCBI ftp:// server
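The following is a minimal sketch of a check-then-download pattern using is.genome.available() and getGenome(). The helper name fetch_genome, the organism, and the output folder are assumptions for illustration. The argument names follow recent biomartr releases; other releases may require further arguments (for example a kingdom), so consult ?getGenome for the installed version.

```r
# Hypothetical helper: download a RefSeq genome only if NCBI lists it.
# Requires biomartr and an internet connection when called.
fetch_genome <- function(organism,
                         out_dir = file.path("_ncbi_downloads", "genomes")) {
  available <- biomartr::is.genome.available(organism = organism)
  if (!isTRUE(available)) {
    stop("No genome found for organism: ", organism)
  }
  # Argument names are an assumption based on recent releases; see ?getGenome.
  biomartr::getGenome(db = "refseq", organism = organism, path = out_dir)
}

# Example call (needs network access):
# genome_file <- fetch_genome("Arabidopsis thaliana")
```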
Database Retrieval
- listDatabases(): Retrieve a List of Available NCBI Databases for Download
- download_database(): Download a NCBI Database to Your Local Hard Drive
Meta-Genome Retrieval
meta.retrieval(): Perform Meta-Genome Retrieval from NCBI
Performing Gene Ontology queries
Gene Ontology
getGO(): Function to retrieve GO terms for a given set of genes
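A short sketch of a GO lookup might look as follows. The wrapper name fetch_go_terms, the gene symbol, and the filter value are assumptions for illustration; valid filter names depend on the BioMart dataset behind the chosen organism and can be listed with organismFilters().

```r
# Hypothetical wrapper around getGO(); requires biomartr and an
# internet connection when called.
fetch_go_terms <- function(genes,
                           organism = "Homo sapiens",
                           id_type  = "hgnc_symbol") {
  biomartr::getGO(
    organism = organism,
    genes    = genes,
    filters  = id_type  # the kind of ID given in `genes`
  )
}

# Example call (illustrative gene symbol, needs network access):
# go_tbl <- fetch_go_terms("GUCA2A")
```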
Discussions and Bug Reports
I would be very happy to learn more about potential improvements of the concepts and functions provided in this package.
Furthermore, if you find bugs or need additional (more flexible) functionality in parts of this package, please let me know:
For bug reports: please open an issue.