HIBAG (version 1.8.3)

predict.hlaAttrBagClass: HIBAG model prediction (in parallel)

Description

Predicts HLA types from SNP genotypes using a HIBAG model, optionally in parallel.

Usage

hlaPredict(object, snp, cl=NULL,
    type=c("response", "prob", "response+prob"), vote=c("prob", "majority"),
    allele.check=TRUE, match.type=c("RefSNP+Position", "RefSNP", "Position"),
    same.strand=FALSE, verbose=TRUE)

## S3 method for class 'hlaAttrBagClass'
predict(object, snp, cl,
    type=c("response", "prob", "response+prob"), vote=c("prob", "majority"),
    allele.check=TRUE, match.type=c("RefSNP+Position", "RefSNP", "Position"),
    same.strand=FALSE, verbose=TRUE, ...)

Arguments

object
a model of hlaAttrBagClass
snp
a genotypic object of hlaSNPGenoClass
cl
a cluster object created by the package parallel or snow; if NULL, prediction runs on a single processor
type
"response": return the best-guess type plus its posterior probability; "prob": return all posterior probabilities; "response+prob": return the best-guess and all posterior probabilities
vote
"prob" (default behavior) -- make a prediction based on the averaged posterior probabilities from all individual classifiers; "majority" -- majority voting from all individual classifiers, where each classifier votes for an HLA type
allele.check
if TRUE, check and then switch allele pairs if needed
match.type
"RefSNP+Position" (by default) -- using both of RefSNP IDs and positions; "RefSNP" -- using RefSNP IDs only; "Position" -- using positions only
same.strand
if TRUE, assume that alleles are on the same strand (e.g., forward strand); if FALSE, make no assumption about the strand
verbose
if TRUE, show information
...
further arguments passed to or from other methods
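
The difference between the two vote options can be illustrated with a toy calculation in base R. This is only a sketch of the idea described for the vote argument, not HIBAG's internal code; the matrix post, its allele-pair labels, and its values are made up for illustration.

```r
# Toy posterior probabilities (hypothetical values): rows are candidate
# HLA genotypes (allele pairs), columns are individual classifiers.
post <- matrix(
    c(0.40, 0.45, 0.05,
      0.35, 0.30, 0.90,
      0.25, 0.25, 0.05),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("01:01/02:01", "02:01/02:01", "01:01/03:01"),
                    c("clf1", "clf2", "clf3")))

# vote = "prob": average the posterior probabilities over classifiers,
# then pick the genotype with the highest average
avg_prob <- rowMeans(post)
best_by_prob <- names(which.max(avg_prob))

# vote = "majority": each classifier votes for its own best genotype,
# then the genotype with the most votes wins
votes <- rownames(post)[apply(post, 2, which.max)]
vote_tab <- table(votes)
best_by_majority <- names(vote_tab)[which.max(vote_tab)]

best_by_prob
best_by_majority
```

In this toy case the two rules disagree: one very confident classifier dominates the average, while the other two classifiers outvote it under majority voting.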

Value

Returns a hlaAlleleClass object with the posterior probabilities of the predicted HLA types (type = "response"), or a matrix of all possible pairs of HLA alleles with their posterior probabilities (type = "prob"). If type = "response+prob", returns a hlaAlleleClass object that also contains a matrix, postprob, with the probabilities of all pairs of alleles. When a probability matrix is returned, its column names are the sample IDs (sample.id) and its row names are unordered pairs of HLA alleles.
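
As a rough illustration of that matrix layout, the base R sketch below derives a best-guess genotype per sample from a probability matrix and applies a call threshold. The matrix prob, its values, the "/" separator in the row names, and the 0.65 threshold are assumptions made for the example; it does not use or reproduce HIBAG's own functions.

```r
# Hypothetical posterior probability matrix in the layout described above:
# row names are unordered allele pairs, column names are sample IDs.
prob <- matrix(
    c(0.70, 0.10,
      0.20, 0.30,
      0.10, 0.60),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("01:01/02:01", "01:01/24:02", "02:01/24:02"),
                    c("S1", "S2")))

# index of the most probable allele pair for each sample (column)
best_idx <- apply(prob, 2, which.max)

# split the "allele1/allele2" labels (the separator is an assumption)
pairs <- strsplit(rownames(prob)[best_idx], "/", fixed = TRUE)

best <- data.frame(
    sample.id = colnames(prob),
    allele1 = vapply(pairs, `[`, character(1), 1),
    allele2 = vapply(pairs, `[`, character(1), 2),
    prob = prob[cbind(best_idx, seq_len(ncol(prob)))],
    stringsAsFactors = FALSE)

# keep only calls whose posterior probability reaches the threshold
called <- best[best$prob >= 0.65, ]
best
called
```

Raising the threshold trades call rate for confidence: fewer samples receive a call, but the retained calls have higher posterior probability.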

Details

If more than 50% of SNP predictors are missing, a warning will be given.

When match.type="RefSNP+Position", SNPs are matched only if both their RefSNP IDs and their positions agree. A lower missing fraction may be obtained by matching on RefSNP IDs only or on positions only: call predict(..., match.type="RefSNP") or predict(..., match.type="Position") for this purpose. It might be safe to assume that SNPs with the same position on the same genome reference (e.g., hg19) are the same variant even if their RefSNP IDs differ. Any concern about SNP mismatching should be directed to the genotyping platform provider.
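
The effect of the matching rule on the missing fraction can be sketched with a small base R example. The SNP tables and the helper missing_fraction() below are hypothetical and only mimic the idea; they are not the matching code used by HIBAG.

```r
# Hypothetical SNP annotations; IDs and positions are made up.
model_snps <- data.frame(
    snp.id = c("rs1", "rs2", "rs3", "rs4"),
    position = c(1000L, 2000L, 3000L, 4000L),
    stringsAsFactors = FALSE)
geno_snps <- data.frame(
    snp.id = c("rs1", "rs2b", "rs3", "rs9"),   # rs2 carries another ID here
    position = c(1000L, 2000L, 3500L, 9000L),  # rs3 has another position here
    stringsAsFactors = FALSE)

# Fraction of model SNPs that have no match in the genotype data
# under the given matching rule (illustrative helper, not a HIBAG function)
missing_fraction <- function(model, geno, match.type) {
    key <- function(d) switch(match.type,
        "RefSNP+Position" = paste(d$snp.id, d$position, sep = "-"),
        "RefSNP" = d$snp.id,
        "Position" = as.character(d$position))
    mean(!(key(model) %in% key(geno)))
}

miss <- vapply(c("RefSNP+Position", "RefSNP", "Position"),
    function(mt) missing_fraction(model_snps, geno_snps, mt), numeric(1))
miss
```

In this toy data, requiring both the ID and the position leaves three of the four model SNPs unmatched, which is above the 50% level mentioned above, while either relaxed rule matches half of them.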

See Also

hlaAttrBagging, hlaAllele, hlaCompareAllele, hlaParallelAttrBagging

Examples

# load the package, which provides HLA_Type_Table and HapMap_CEU_Geno
library(HIBAG)

# make a "hlaAlleleClass" object
hla.id <- "A"
hla <- hlaAllele(HLA_Type_Table$sample.id,
    H1 = HLA_Type_Table[, paste(hla.id, ".1", sep="")],
    H2 = HLA_Type_Table[, paste(hla.id, ".2", sep="")],
    locus=hla.id, assembly="hg19")

# divide HLA types randomly
set.seed(100)
hlatab <- hlaSplitAllele(hla, train.prop=0.5)
names(hlatab)
# "training"   "validation"
summary(hlatab$training)
summary(hlatab$validation)

# SNP predictors within the flanking region on each side
region <- 500   # kb
snpid <- hlaFlankingSNP(HapMap_CEU_Geno$snp.id, HapMap_CEU_Geno$snp.position,
    hla.id, region*1000, assembly="hg19")
length(snpid)  # 275

# training and validation genotypes
train.geno <- hlaGenoSubset(HapMap_CEU_Geno,
    snp.sel=match(snpid, HapMap_CEU_Geno$snp.id),
    samp.sel=match(hlatab$training$value$sample.id,
    HapMap_CEU_Geno$sample.id))
test.geno <- hlaGenoSubset(HapMap_CEU_Geno,
    samp.sel=match(hlatab$validation$value$sample.id,
    HapMap_CEU_Geno$sample.id))

# train a HIBAG model
set.seed(100)
model <- hlaAttrBagging(hlatab$training, train.geno, nclassifier=4,
    verbose.detail=TRUE)
summary(model)

# validation
pred <- predict(model, test.geno)
# compare
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
    call.threshold=0))
(comp <- hlaCompareAllele(hlatab$validation, pred, allele.limit=model,
    call.threshold=0.5))