
HLA Genotype Imputation with Attribute Bagging

Kernel Version: v1.3

GNU General Public License, GPLv3

Features

HIBAG is a state-of-the-art software package for imputing HLA types from SNP data. It relies on a training set of HLA and SNP genotypes, but researchers can also run it with published parameter estimates, so they do not need access to large training datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging improves the accuracy and stability of classifier ensembles by using bootstrap aggregating and random variable selection.
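The ensemble idea can be illustrated with a small base-R sketch. This is not HIBAG's algorithm (which adds haplotype inference); it is a generic toy version of attribute bagging, where each classifier is trained on a bootstrap sample of the individuals and a random subset of the SNPs, and predictions are combined by majority vote. The simulated data, the nearest-centroid base learner, and every function name below are assumptions made for illustration only.

```r
# Toy attribute bagging in base R; hypothetical names, not part of HIBAG.
set.seed(1)

# Simulated data: 200 samples, 20 SNPs coded 0/1/2, and a two-class label
# that depends on the first three SNPs only.
n <- 200; p <- 20
geno <- matrix(sample(0:2, n * p, replace = TRUE), nrow = n)
label <- ifelse(rowSums(geno[, 1:3]) >= 3, "A", "B")

# Base learner: one centroid per class on the selected SNP columns.
fit_centroid <- function(x, y) {
  classes <- sort(unique(y))
  centers <- do.call(rbind, lapply(classes, function(k) {
    colMeans(x[y == k, , drop = FALSE])
  }))
  list(classes = classes, centers = centers)
}

# Assign each sample to the class with the nearest centroid.
predict_centroid <- function(model, x) {
  d <- sapply(seq_along(model$classes), function(k) {
    rowSums(sweep(x, 2, model$centers[k, ], "-")^2)
  })
  d <- matrix(d, nrow = nrow(x))
  model$classes[max.col(-d, ties.method = "first")]
}

# Ensemble: bootstrap the rows and draw a random subset of columns
# for every individual classifier.
attribute_bagging <- function(x, y, n_classifier = 25, n_var = 5) {
  lapply(seq_len(n_classifier), function(i) {
    rows <- sample(nrow(x), replace = TRUE)  # bootstrap aggregating
    cols <- sample(ncol(x), n_var)           # random variable selection
    list(cols = cols,
         model = fit_centroid(x[rows, cols, drop = FALSE], y[rows]))
  })
}

# Majority vote across the individual classifiers.
predict_ensemble <- function(ensemble, x) {
  votes <- sapply(ensemble, function(m) {
    predict_centroid(m$model, x[, m$cols, drop = FALSE])
  })
  votes <- matrix(votes, nrow = nrow(x))
  apply(votes, 1, function(v) names(which.max(table(v))))
}

train <- 1:150; test <- 151:200
ens <- attribute_bagging(geno[train, ], label[train])
pred <- predict_ensemble(ens, geno[test, ])
accuracy <- mean(pred == label[test])
print(accuracy)
```

Because each classifier sees different individuals and different SNPs, the errors of the individual classifiers are less correlated, which is what makes the vote more stable than any single classifier.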

Bioconductor Package

Release Version: 1.8.3

http://www.bioconductor.org/packages/release/bioc/html/HIBAG.html

Development Version: 1.9.3

http://www.bioconductor.org/packages/devel/bioc/html/HIBAG.html

Changes in Bioconductor Version (since v1.3.0):

  • optimize the calculation of Hamming distance using SSE2 and hardware POPCNT instructions if available
  • hardware POPCNT: 2.4x speedup for large-scale data, compared to the implementation in v1.2.4
  • SSE2 popcount implementation without hardware POPCNT: 1.5x speedup for large-scale data, compared to the implementation in v1.2.4
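To show why a population count speeds up this calculation, here is a minimal base-R sketch: when two binary vectors are packed into integers, their Hamming distance is the number of set bits in the XOR of the two integers. The helper names `pack_bits`, `popcount` and `hamming` are hypothetical and are not part of HIBAG, whose kernel does this in compiled code with SSE2/POPCNT instructions.

```r
# Pack a 0/1 vector (at most 31 entries) into one integer, lowest bit first.
pack_bits <- function(bits) {
  stopifnot(length(bits) <= 31, all(bits %in% c(0, 1)))
  as.integer(sum(bits * 2^(seq_along(bits) - 1)))
}

# Count the set bits of each integer; intToBits() expands one integer
# into its 32 bits.
popcount <- function(x) {
  vapply(x, function(v) sum(as.integer(intToBits(v))), integer(1))
}

# Hamming distance between two packed vectors: popcount of the XOR.
hamming <- function(a, b) sum(popcount(bitwXor(a, b)))

a <- c(1, 0, 1, 1, 0, 0, 1, 0)
b <- c(1, 1, 1, 0, 0, 0, 1, 1)
d <- hamming(pack_bits(a), pack_bits(b))

# Same answer as the element-wise comparison (positions 2, 4 and 8 differ)
stopifnot(d == sum(a != b))
print(d)
```

A hardware POPCNT instruction performs the bit-counting step on a whole machine word at once, instead of looping over the bits as this sketch does.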

Package Maintainer

Dr. Xiuwen Zheng (zhengx@u.washington.edu)

Pre-fit Model Download

http://www.biostat.washington.edu/~bsweir/HIBAG/

Citation

Zheng, X. et al. HIBAG-HLA genotype imputation with attribute bagging. Pharmacogenomics Journal 14, 192–200 (2014). http://dx.doi.org/10.1038/tpj.2013.18

Installation

  • Bioconductor repository:
source("http://bioconductor.org/biocLite.R")
biocLite("HIBAG")
  • Development version from Github:
library("devtools")
install_github("zhengxwen/HIBAG")

The install_github() approach requires that you build from source, i.e. make and compilers must be installed on your system -- see the R FAQ for your operating system; you may also need to install dependencies manually.

  • Install the package from the source code:

Download the source code:

wget --no-check-certificate https://github.com/zhengxwen/HIBAG/tarball/master -O HIBAG_latest.tar.gz
## or ##
curl -L https://github.com/zhengxwen/HIBAG/tarball/master/ -o HIBAG_latest.tar.gz

## Install ##
R CMD INSTALL HIBAG_latest.tar.gz
  • Install the package from the source code with the support of hardware POPCNT (requiring SSE4.2):

You have to customize the package compilation; see the "Customizing package compilation" section of the R Installation and Administration manual on CRAN.

If your machine supports SSE4.2 or higher and the GNU compilers (gcc/g++) or the Clang compiler (clang++) are installed, set the following in ~/.R/Makevars:

## for C code
CFLAGS=-g -O3 -march=native -mtune=native
## for C++ code
CXXFLAGS=-g -O3 -march=native -mtune=native

Or force the compiler to generate hardware POPCNT code:

## for C code
CFLAGS=-g -O3 -msse4.2 -mpopcnt
## for C++ code
CXXFLAGS=-g -O3 -msse4.2 -mpopcnt
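Whether a machine supports these instructions can be checked before compiling. The following is a rough sketch that works only on Linux, where the kernel lists CPU capabilities in /proc/cpuinfo (SSE4.2 appears there as "sse4_2" and hardware POPCNT as "popcnt"). The helper `has_cpu_flag` is a hypothetical name and is not part of HIBAG.

```r
# Return TRUE/FALSE if the flag is listed, or NA when the file is missing
# (for example on macOS or Windows, which have no /proc/cpuinfo).
has_cpu_flag <- function(flag, cpuinfo = "/proc/cpuinfo") {
  if (!file.exists(cpuinfo)) return(NA)
  lines <- readLines(cpuinfo, warn = FALSE)
  # Keep only the "flags : ..." lines and split them into single flags
  flag_lines <- grep("^flags[[:space:]]*:", lines, value = TRUE)
  flags <- unlist(strsplit(sub("^flags[[:space:]]*:[[:space:]]*", "", flag_lines),
                           "[[:space:]]+"))
  flag %in% flags
}

print(has_cpu_flag("sse4_2"))
print(has_cpu_flag("popcnt"))
```

If both calls return TRUE, either Makevars variant above should produce hardware POPCNT code; on other operating systems, consult the CPU documentation instead.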

If the package compilation succeeds with hardware POPCNT instructions, you should see a welcome message after loading the package:

HIBAG (HLA Genotype Imputation with Attribute Bagging)
Kernel Version: v1.3
Supported by Streaming SIMD Extensions (SSE4.2 + hardware POPCNT)

Archive

https://github.com/zhengxwen/Archive/tree/master/HIBAG

Version: 1.8.3

License: GPL-3

Maintainer: Xiuwen Zheng

Last Published: February 15th, 2017

Monthly Downloads: 80

Functions in HIBAG (1.8.3)

  • hlaAttrBagClass: The class of HIBAG model
  • hlaMakeSNPGeno: Make a SNP genotype object
  • hlaBED2Geno: Convert from PLINK BED format
  • HapMap_CEU_Geno: SNP genotypes of a study simulated from HapMap CEU genotypic data
  • hlaCheckSNPs: Check the SNP predictors in a HIBAG model
  • HLA_Type_Table: Four-digit HLA types of a study simulated from HapMap CEU
  • HIBAG-package: HLA Genotype Imputation with Attribute Bagging
  • hlaPublish: Finalize a HIBAG model
  • hlaAttrBagging: Build a HIBAG model
  • hlaAssocTest: Statistical Association Tests
  • hlaCompareAllele: Evaluate prediction accuracies
  • hlaGenoLD: Composite Linkage Disequilibrium
  • hlaOutOfBag: Out-of-bag estimation of overall accuracy, per-allele sensitivity, etc.
  • hlaErrMsg: The last error message
  • print.hlaAttrBagClass: Summarize a "hlaAttrBagClass" or "hlaAttrBagObj" object
  • hlaGenoMFreq: Minor Allele Frequency
  • hlaSampleAllele: Get sample IDs from HLA types with a filter
  • hlaUniqueAllele: Get unique HLA alleles
  • hlaAttrBagObj: The class of HIBAG object
  • hlaConvSequence: Conversion From HLA Alleles to Amino Acid Sequences
  • predict.hlaAttrBagClass: HIBAG model prediction (in parallel)
  • hlaGenoMRate: Missing Rates Per SNP
  • summary.hlaAlleleClass: Summarize a "hlaAlleleClass" or "hlaAASeqClass" object
  • hlaFlankingSNP: SNP IDs in Flanking Region
  • hlaGenoAFreq: Allele Frequency
  • hlaGenoCombine: Combine two genotypic data sets into one
  • hlaModelFiles: Load a model object from files
  • hlaAlleleDigit: Trim HLA alleles
  • hlaGenoSwitchStrand: Allele switching
  • hlaAASeqClass: Class of HLA Amino Acid Sequence Type
  • hlaCheckAllele: Check SNP alleles
  • hlaSplitAllele: Divide the samples randomly
  • hlaCombineAllele: Combine two datasets of HLA types
  • hlaGeno2PED: Convert to PLINK PED format
  • hlaLociInfo: HLA Locus Information
  • hlaClose: Dispose a model object
  • hlaParallelAttrBagging: Build a HIBAG model via parallel computation
  • hlaSubModelObj: Get a subset of individual classifiers
  • summary.hlaSNPGenoClass: Summarize a SNP dataset
  • hlaModelFromObj: Conversion between the in-memory model and the object that can be saved in a file
  • hlaSNPID: Get SNP IDs and positions
  • hlaAllele: A list of HLA types
  • hlaCombineModelObj: Combine two HIBAG models together
  • hlaGenoSubset: Get a subset of genotypes
  • hlaAlleleClass: Class of HLA Type
  • hlaAlleleSubset: Get a subset of HLA types
  • hlaSNPGenoClass: The class of SNP genotypes
  • hlaGenoMRate_Samp: Missing Rates Per Sample
  • hlaPredMerge: Merge prediction results from multiple HIBAG models
  • plot.hlaAttrBagObj: Plot a HIBAG model
  • hlaReport: Format a report
  • hlaGDS2Geno: Convert from SNP GDS format