biomartr
Functional Annotation and Biological Data Retrieval with R
The biomartr package aims to provide users with an easy-to-use framework to obtain genome, proteome, and CDS data, as well as an interface to BioMart to retrieve functional annotation. Furthermore, it is specifically designed to serve as an additional module to the myTAI and orthologr frameworks, enabling a high degree of reproducibility in evolutionary transcriptomics research from data retrieval to data visualization.
Additionally, the biomartr package allows users to retrieve entire NCBI databases using only one command (see Database Retrieval Vignette).
Installation
Before users can download and install biomartr, they need to install the following packages from Bioconductor:
# install Bioconductor base packages
# (biocLite.R only needs to be sourced once per R session)
source("http://bioconductor.org/biocLite.R")
biocLite()
# install the biomaRt package
biocLite("biomaRt")
# install the Biostrings package
biocLite("Biostrings")
Users might be asked during the installation of Biostrings and biomaRt whether or not they would like to update all package dependencies of the corresponding packages.
Please type a to specify that all package dependencies shall be updated. This is important for biomartr to function properly.
Now users can download biomartr from CRAN:
# install biomartr 0.0.3 from CRAN
install.packages("biomartr",
repos = "https://cran.rstudio.com/",
dependencies = TRUE)
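Before loading biomartr, it can help to confirm that the Bioconductor dependencies named above are actually installed. The following is a minimal sketch using only base R; the variable names required_pkgs, is_installed, and missing_pkgs are illustrative and not part of biomartr.

```r
# Packages this README asks users to install before biomartr
required_pkgs <- c("biomaRt", "Biostrings")

# requireNamespace() returns FALSE (rather than raising an error)
# when a package cannot be loaded
is_installed <- vapply(required_pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Still missing: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All Bioconductor dependencies are installed.")
}
```

If any package is reported as missing, repeat the corresponding Bioconductor installation step before installing biomartr.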
NEWS
The current status of the package as well as a detailed history of the functionality of each version of biomartr can be found in the NEWS section.
Download Developer Version
The developer version of biomartr might include more functionality than the stable version on CRAN.
On Unix Based Systems
Now you can use the devtools package to install biomartr from GitHub.
# install.packages("devtools")
# install the current version of biomartr on your system
library(devtools)
install_github("HajkD/biomartr", build_vignettes = TRUE, dependencies = TRUE)
On Windows Systems
# On Windows, calling install_github() directly may not work - see ?build_github_devtools
# First install Rtools (a separate toolchain installer provided via CRAN,
# not an R package that install.packages() can fetch).
# Afterwards you can install devtools -> install.packages("devtools")
# and then you can run:
devtools::install_github("HajkD/biomartr", build_vignettes = TRUE, dependencies = TRUE)
# and then load it from the library (adjust lib.loc to your R installation)
library("biomartr", lib.loc = "C:/Program Files/R/R-3.1.1/library")
Troubleshooting on Windows Machines
- Install biomartr on a Win 8 laptop: solution (thanks to Andres Romanowski)
Tutorials
Getting Started with biomartr:
- Introduction
- NCBI Database Retrieval
- Sequence and Database Retrieval
- Functional Annotation
- Phylotranscriptomics using myTAI, orthologr, and biomartr
BioMart Queries
- biomart(): Main function to query the BioMart database
- getMarts(): Retrieve All Available BioMart Databases
- getDatasets(): Retrieve All Available Datasets for a BioMart Database
- getAttributes(): Retrieve All Available Attributes for a Specific Dataset
- getFilters(): Retrieve All Available Filters for a Specific Dataset
- organismBM(): Function for organism specific retrieval of available BioMart marts and datasets
- organismAttributes(): Function for organism specific retrieval of available BioMart attributes
- organismFilters(): Function for organism specific retrieval of available BioMart filters
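The query functions listed above are typically used in sequence: discover a mart, then a dataset, then its attributes and filters, and finally run the query. A minimal sketch of that workflow follows. It assumes the argument names mart, dataset, genes, attributes, and filters; the wrapper explore_and_query() is hypothetical and not part of biomartr, and the mart, dataset, and gene identifiers in the commented example are placeholders that depend on the current BioMart release.

```r
# Hypothetical helper (not part of biomartr): walk from discovery to a query.
# Nothing is sent to BioMart until the function is called; calling it
# requires an internet connection and an installed biomartr.
explore_and_query <- function(genes, mart, dataset, attributes, filters) {
  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Please install biomartr first.")
  }
  # List what is available so that names can be verified before querying
  available <- list(
    marts      = biomartr::getMarts(),
    datasets   = biomartr::getDatasets(mart = mart),
    attributes = biomartr::getAttributes(mart = mart, dataset = dataset),
    filters    = biomartr::getFilters(mart = mart, dataset = dataset)
  )
  # Run the actual query for the given gene identifiers
  result <- biomartr::biomart(genes = genes, mart = mart, dataset = dataset,
                              attributes = attributes, filters = filters)
  list(available = available, result = result)
}

# Example call with placeholder values (commented out: needs network access):
# res <- explore_and_query(genes      = c("AT1G06090"),
#                          mart       = "plants_mart",
#                          dataset    = "athaliana_eg_gene",
#                          attributes = c("start_position", "end_position"),
#                          filters    = "ensembl_gene_id")
```

Inspecting the available element first is a cheap way to catch a misspelled mart or dataset name before running a larger query.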
Biological Data Retrieval
Genome Retrieval
- getGenome(): Download a specific genome stored on the NCBI ftp:// server
- listGenomes(): List all genomes available on the NCBI ftp:// server
- is.genome.available(): Check Genome Availability
- getProteome(): Download a specific proteome stored on the NCBI ftp:// server
- getCDS(): Download a specific CDS file (genome) stored on the NCBI ftp:// server
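A common pattern with these functions is to check availability before starting a download. The sketch below illustrates this under stated assumptions: the argument names organism, db, and path are assumed and may differ between biomartr releases (some releases may expect further arguments, which can be passed through ...), and the wrapper fetch_genome_if_available() is hypothetical, not part of the package.

```r
# Hypothetical helper (not part of biomartr): download a genome only if
# is.genome.available() reports it as available.
# Assumption: getGenome() accepts `db`, `organism`, and `path`; any extra
# arguments supplied via `...` are forwarded unchanged to getGenome().
fetch_genome_if_available <- function(organism, path = "genomes", ...) {
  if (!requireNamespace("biomartr", quietly = TRUE)) {
    stop("Please install biomartr first.")
  }
  if (!isTRUE(biomartr::is.genome.available(organism = organism))) {
    message("No genome found for: ", organism)
    return(invisible(NULL))
  }
  biomartr::getGenome(db = "refseq", organism = organism, path = path, ...)
}

# Example call (commented out: needs network access):
# fetch_genome_if_available("Arabidopsis thaliana", path = "genomes")
```

Checking availability first avoids creating an empty download folder for an organism name that the server does not list.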
Database Retrieval
- listDatabases(): Retrieve a List of Available Databases for Download
- download_database(): Download a Database to Your Local Hard Drive
Performing Gene Ontology queries
Gene Ontology
- getGO(): Function to retrieve GO terms for a given set of genes
Discussions and Bug Reports
I would be very happy to learn more about potential improvements of the concepts and functions provided in this package.
Furthermore, in case you find some bugs or need additional (more flexible) functionality of parts of this package, please let me know:
For bug reports: please send me an issue.