SeqArray: Big Data Management of Genome-Wide Sequence Variants
GNU General Public License, GPLv3
Features
Big data management of genome-wide sequence variants with thousands of individuals: genotypic data (e.g., SNP, indel and structural variation calls) and annotations in GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
Bioconductor:
Release Version: v1.10.1
http://www.bioconductor.org/packages/release/bioc/html/SeqArray.html
- Tutorials: Data Management, Data Analytics
Development Version: v1.11.0
http://www.bioconductor.org/packages/devel/bioc/html/SeqArray.html
- Tutorials: Data Management, Data Analytics
Installation
- Bioconductor repository:
source("http://bioconductor.org/biocLite.R")
biocLite("SeqArray")
- Development version from GitHub:
library("devtools")
install_github("zhengxwen/gdsfmt")
install_github("zhengxwen/SeqArray")
The install_github() approach requires that you build from source, i.e., make and compilers must be installed on your system (see the R FAQ for your operating system); you may also need to install dependencies manually.
- Install the package from the source code:
wget --no-check-certificate https://github.com/zhengxwen/gdsfmt/tarball/master -O gdsfmt_latest.tar.gz
wget --no-check-certificate https://github.com/zhengxwen/SeqArray/tarball/master -O SeqArray_latest.tar.gz
R CMD INSTALL gdsfmt_latest.tar.gz
R CMD INSTALL SeqArray_latest.tar.gz
## Or
curl -L https://github.com/zhengxwen/gdsfmt/tarball/master/ -o gdsfmt_latest.tar.gz
curl -L https://github.com/zhengxwen/SeqArray/tarball/master/ -o SeqArray_latest.tar.gz
R CMD INSTALL gdsfmt_latest.tar.gz
R CMD INSTALL SeqArray_latest.tar.gz
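Both install_github() and R CMD INSTALL build the packages from source, so it can help to confirm that the build tools are on your PATH before starting. The following is a minimal sketch, not part of SeqArray: the helper name missing_tools is hypothetical, and the tool list (R, make, gcc, g++) is an assumption that may differ on your platform (for example, clang instead of gcc on macOS).

```shell
#!/bin/sh
# Hypothetical helper (not part of SeqArray or gdsfmt):
# prints the names of the given commands that are NOT found on PATH,
# separated by spaces. Prints an empty line if all are present.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing
    echo "${missing# }"
}

# Assumed tool list for a source build; adjust for your compiler toolchain.
result=$(missing_tools R make gcc g++)
if [ -n "$result" ]; then
    echo "Missing build tools: $result" >&2
else
    echo "All build tools found."
fi
```

If any tool is reported missing, install it as described in the R FAQ for your operating system before running the R CMD INSTALL commands above.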