SeqArray: Big Data Management of Genome-Wide Sequence Variants

GNU General Public License, GPLv3

Features

Big data management of genome-wide sequence variants with thousands of individuals: genotypic data (e.g., SNP, indel and structural variation calls) and annotations in GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
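As a minimal sketch of the array-oriented access described above, the following opens the example GDS file bundled with the package and reads a few variables. It assumes SeqArray is already installed. The variable names ("sample.id", "variant.id", "genotype") follow the usual SeqArray GDS layout, and the dimensions you see depend on the example file.

```r
library(SeqArray)

# Path to the example GDS file shipped with the package
gds.fn <- seqExampleFileName("gds")

# Open the file read-only; the result is a SeqVarGDSClass object
f <- seqOpen(gds.fn)

samp <- seqGetData(f, "sample.id")    # sample identifiers
var  <- seqGetData(f, "variant.id")   # variant identifiers
geno <- seqGetData(f, "genotype")     # 3-d array: allele x sample x variant

dim(geno)

seqClose(f)
```

The genotype array is indexed by allele copy, then sample, then variant, so its second and third dimensions should match the lengths of the sample and variant ID vectors.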

Bioconductor:

Release Version: v1.10.1

http://www.bioconductor.org/packages/release/bioc/html/SeqArray.html

Development Version: v1.11.0

http://www.bioconductor.org/packages/devel/bioc/html/SeqArray.html

Installation

  • Bioconductor repository:
source("http://bioconductor.org/biocLite.R")
biocLite("SeqArray")
  • Development version from Github:
library("devtools")
install_github("zhengxwen/gdsfmt")
install_github("zhengxwen/SeqArray")

The install_github() approach builds the packages from source, so make and a compiler toolchain must be installed on your system (see the R FAQ for your operating system). You may also need to install dependencies manually.

  • Install the package from the source code:

gdsfmt, SeqArray

wget --no-check-certificate https://github.com/zhengxwen/gdsfmt/tarball/master -O gdsfmt_latest.tar.gz
wget --no-check-certificate https://github.com/zhengxwen/SeqArray/tarball/master -O SeqArray_latest.tar.gz
R CMD INSTALL gdsfmt_latest.tar.gz
R CMD INSTALL SeqArray_latest.tar.gz

## Or
curl -L https://github.com/zhengxwen/gdsfmt/tarball/master/ -o gdsfmt_latest.tar.gz
curl -L https://github.com/zhengxwen/SeqArray/tarball/master/ -o SeqArray_latest.tar.gz
R CMD INSTALL gdsfmt_latest.tar.gz
R CMD INSTALL SeqArray_latest.tar.gz

Version: 1.10.1

License: GPL-3

Maintainer: Xiuwen Zheng

Last Published: February 15th, 2017

Functions in SeqArray (1.10.1)

  • seqAlleleCount: Get Allele Counts
  • seqGDS2SNP: Convert to a SNP GDS File
  • seqApply: Apply Functions Over Array Margins
  • seqExport: Export to a GDS File
  • seqExampleFileName: Example files
  • seqClose-methods: Close the SeqArray GDS File
  • seqDelete: Delete GDS Variables
  • seqAlleleFreq: Get Allele Frequencies
  • SeqArray-package: Big Data Management of Genome-wide Sequence Variants
  • seqBED2GDS: Convert PLINK BED Format to SeqArray Format
  • seqGetFilter: Get the Filter of GDS File
  • seqMissing: Missing genotype percentage
  • seqNumAllele: Number of alleles
  • seqOpen: Open a Sequence GDS File
  • seqGDS2VCF: Convert to a VCF File
  • seqParallel: Apply Functions in Parallel
  • seqParallelSetup: Setup a Parallel Environment
  • seqOptimize: Optimize the Storage of Data Array
  • seqMerge: Merge Multiple Sequence GDS Files
  • seqGetData: Get Data
  • seqVCF.SampID: Get the Sample IDs
  • seqStorage.Option: Storage and Compression Options for Importing VCF File(s)
  • seqVCF2GDS: Reformat VCF Files
  • seqTranspose: Transpose Data Array
  • seqSummary: Summarize the Sequence GDS File
  • seqSNP2GDS: Convert SNPRelate Format to SeqArray Format
  • SeqVarGDSClass: SeqVarGDSClass
  • seqVCF.Header: Parse the Header of a VCF File
  • seqSetFilter-methods: Set a Filter to Sample or Variant
  • seqSetFilterChrom: Chromosome Selection
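
Several of the functions listed above are typically used together. The sketch below, which assumes SeqArray is installed and uses only the example VCF bundled with the package, converts a VCF file to GDS, restricts the data to a subset of samples and variants, and computes per-variant summaries. The subset sizes (10 samples, 100 variants) and the temporary output path are arbitrary choices for illustration.

```r
library(SeqArray)

# Convert the bundled example VCF into a temporary GDS file
vcf.fn <- seqExampleFileName("vcf")
gds.fn <- tempfile(fileext = ".gds")
seqVCF2GDS(vcf.fn, gds.fn, verbose = FALSE)

f <- seqOpen(gds.fn)

# Restrict later reads to the first 10 samples and the first 100 variants
samp.id <- seqGetData(f, "sample.id")
var.id  <- seqGetData(f, "variant.id")
seqSetFilter(f, sample.id = samp.id[1:10], variant.id = var.id[1:100],
             verbose = FALSE)

af   <- seqAlleleFreq(f)   # reference allele frequency per selected variant
miss <- seqMissing(f)      # missing genotype rate per selected variant

seqClose(f)
unlink(gds.fn)             # remove the temporary GDS file
```

Because the filter is stored on the open file object, seqAlleleFreq() and seqMissing() only see the selected samples and variants, so both results have one entry per selected variant.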