
GRAB (version 0.2.4)

GRAB.POLMM.Region: Instructions for the POLMM-GENE method

Description

POLMM-GENE implements region-based association tests for ordinal categorical phenotypes, adjusting for sample relatedness. It is well-suited for analyzing rare variants in large-scale biobank data, and effectively controls type I error rates while maintaining statistical power.

Usage

GRAB.POLMM.Region()

Arguments

Details

For single-variant tests, see GRAB.POLMM.

See GRAB.POLMM for details on step 1.

Additional Control Parameters for GRAB.Region() with POLMM:

  • showInfo (logical, default: FALSE): Whether to print PCG iteration information for debugging.

  • tolPCG (numeric, default: 0.001): Tolerance for PCG in region testing.

  • maxiterPCG (integer, default: 100): Maximum PCG iterations in region testing.
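As a rough sketch of how these parameters might be supplied, the snippet below collects them in a named list, mirroring the control = list(...) style used for step 1 in the Examples. Passing the list through a control argument of GRAB.Region() is an assumption here, so that call is left commented out. The object name polmm_region_control is hypothetical, and the values are the documented defaults.

```r
# Documented defaults for the POLMM region-test control parameters.
polmm_region_control <- list(
  showInfo   = FALSE,  # set to TRUE to print PCG iteration information
  tolPCG     = 0.001,  # tolerance for PCG in region testing
  maxiterPCG = 100L    # maximum number of PCG iterations
)

# Basic sanity checks before the list is handed to an analysis.
stopifnot(
  is.logical(polmm_region_control$showInfo),
  polmm_region_control$tolPCG > 0,
  polmm_region_control$maxiterPCG >= 1
)

# Assumed usage (not run): the `control` argument name is an assumption.
# GRAB.Region(obj.POLMM, GenoFileStep2, OutputFile,
#   GroupFile = GroupFile,
#   SparseGRMFile = SparseGRMFile,
#   control = polmm_region_control
# )
```

A smaller tolPCG or a larger maxiterPCG would normally trade run time for tighter PCG convergence.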

Results are saved to four files:

  1. OutputFile: Region-based test results (SKAT-O, SKAT, Burden p-values).

  2. paste0(OutputFile, ".markerInfo"): Marker-level results for rare variants (MAC >= min_mac_region) included in region tests.

  3. paste0(OutputFile, ".otherMarkerInfo"): Information for excluded markers (ultra-rare variants or failed QC).

  4. paste0(OutputFile, ".infoBurdenNoWeight"): Summary statistics for burden tests without weights.
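The four paths can be derived from OutputFile with paste0(), as in this minimal base R sketch. The path under tempdir() and the vector names (region, marker, other, burden) are illustrative choices, not part of GRAB.

```r
# Hypothetical output location; any writable path works.
OutputFile <- file.path(tempdir(), "resultPOLMMregion.txt")

# Suffixes listed above; the region-level file carries no suffix.
suffixes <- c(
  region = "",
  marker = ".markerInfo",
  other  = ".otherMarkerInfo",
  burden = ".infoBurdenNoWeight"
)

# paste0() drops names, so restore them afterwards.
output_files <- paste0(OutputFile, suffixes)
names(output_files) <- names(suffixes)

# After a region analysis has run, this shows which files are present.
file.exists(output_files)
```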

Region-level results (OutputFile) columns:

Region

Region identifier from GroupFile.

nMarkers

Number of rare variants with MAF < cutoff and MAC >= min_mac_region.

nMarkersURV

Number of ultra-rare variants with MAC < min_mac_region.

Anno.Type

Annotation type from GroupFile.

MaxMAF.Cutoff

Maximum MAF cutoff used for variant selection.

pval.SKATO

SKAT-O test p-value.

pval.SKAT

SKAT test p-value.

pval.Burden

Burden test p-value.
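As an illustration of working with these columns, the sketch below applies a Bonferroni threshold to pval.SKATO. The data frame is a toy stand-in with invented values, including the region names and annotation labels. In practice the table would be read from OutputFile. The sketch assumes each row is one test, and the correction is a generic choice, not something GRAB prescribes.

```r
# Toy stand-in for the region-level table; all values are invented.
region_res <- data.frame(
  Region        = c("GeneA", "GeneA", "GeneB", "GeneB"),
  Anno.Type     = c("lof", "lof:missense", "lof", "lof:missense"),
  MaxMAF.Cutoff = c(0.01, 0.01, 0.01, 0.01),
  pval.SKATO    = c(1e-7, 0.03, 0.40, 0.25),
  stringsAsFactors = FALSE
)

# Bonferroni threshold across all rows, treating each row as one test.
alpha <- 0.05
threshold <- alpha / nrow(region_res)

# Keep rows below the threshold, most significant first.
hits <- region_res[region_res$pval.SKATO < threshold, ]
hits <- hits[order(hits$pval.SKATO), ]
hits
```

The same pattern applies to pval.SKAT and pval.Burden by swapping the column name.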

Marker-level results (paste0(OutputFile, ".markerInfo")) columns:

Region

Region identifier.

ID

Marker identifier.

Info

Marker information in format CHR:POS:REF:ALT.

Anno

Annotation from GroupFile.

AltFreq

Alternative allele frequency.

MAC

Minor allele count.

MAF

Minor allele frequency.

MissingRate

Proportion of missing genotypes.

IndicatorVec

Marker status indicator (1 = rare variant included, 3 = ultra-rare variant included).

StatVec

Score test statistic.

altBetaVec

Effect size estimate.

seBetaVec

Standard error of effect size estimate.

pval0Vec

Unadjusted p-value.

pval1Vec

SPA-adjusted p-value.

posRow

Position row index.
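The relationship between AltFreq and MAF, and the meaning of the IndicatorVec codes, can be sketched on a toy table. The rows, marker IDs, and the helper columns MAF_check and Status are invented for illustration. Real data would come from the ".markerInfo" file.

```r
# Toy marker-level rows; all values are invented.
marker_res <- data.frame(
  ID           = c("rs1", "rs2", "rs3"),
  AltFreq      = c(0.004, 0.999, 0.0002),
  IndicatorVec = c(1, 1, 3),
  stringsAsFactors = FALSE
)

# The minor allele frequency is the smaller of the two allele frequencies.
marker_res$MAF_check <- pmin(marker_res$AltFreq, 1 - marker_res$AltFreq)

# Translate the documented status codes into readable labels.
status_labels <- c("1" = "rare", "3" = "ultra-rare")
marker_res$Status <- unname(status_labels[as.character(marker_res$IndicatorVec)])

# Count markers per status.
table(marker_res$Status)
```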

Other marker info (paste0(OutputFile, ".otherMarkerInfo")) columns:

ID

Marker identifier.

Annos

Annotation from GroupFile.

Region

Region identifier.

Info

Marker information in format CHR:POS:REF:ALT.

Anno

Annotation category.

AltFreq

Alternative allele frequency.

MAC

Minor allele count.

MAF

Minor allele frequency.

MissingRate

Proportion of missing genotypes.

IndicatorVec

Status indicator (0 or 2 for excluded markers).

Burden test summary (paste0(OutputFile, ".infoBurdenNoWeight")) columns:

region

Region identifier.

anno

Annotation type.

max_maf

Maximum MAF cutoff.

sum

Sum of genotypes.

Stat

Score test statistic.

beta

Effect size estimate.

se.beta

Standard error of effect size estimate.

pvalue

P-value for burden test.
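One way to read beta and se.beta together is a normal-approximation 95% interval, sketched below on invented values. This Wald-style interval is a generic technique added for illustration. It is not necessarily how GRAB derives the pvalue column, and the column names ci.lower, ci.upper, and excludes_zero are hypothetical.

```r
# Toy burden summary rows; all values are invented.
burden_res <- data.frame(
  region  = c("GeneA", "GeneB"),
  beta    = c(0.80, -0.10),
  se.beta = c(0.20, 0.25),
  stringsAsFactors = FALSE
)

# Normal-approximation 95% interval: beta +/- z * se.beta.
z <- qnorm(0.975)
burden_res$ci.lower <- burden_res$beta - z * burden_res$se.beta
burden_res$ci.upper <- burden_res$beta + z * burden_res$se.beta

# Flag intervals that do not contain zero.
burden_res$excludes_zero <- burden_res$ci.lower > 0 | burden_res$ci.upper < 0
burden_res
```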

References

Bi et al. (2023). Scalable mixed model methods for set-based association studies on large-scale categorical data analysis and its application to exome-sequencing data in UK Biobank. doi:10.1016/j.ajhg.2023.03.010

Examples

GenoFileStep1 <- system.file("extdata", "simuPLINK.bed", package = "GRAB")
GenoFileStep2 <- system.file("extdata", "simuPLINK_RV.bed", package = "GRAB")
SparseGRMFile <- system.file("extdata", "SparseGRM.txt", package = "GRAB")
GroupFile <- system.file("extdata", "simuPLINK_RV.group", package = "GRAB")
OutputFile <- file.path(tempdir(), "resultPOLMMregion.txt")

# Load the phenotype data and encode the outcome as a factor with levels 0, 1, 2
PhenoFile <- system.file("extdata", "simuPHENO.txt", package = "GRAB")
PhenoData <- data.table::fread(PhenoFile, header = TRUE)
PhenoData$OrdinalPheno <- factor(PhenoData$OrdinalPheno, levels = c(0, 1, 2))

# Step 1: fit the null model
obj.POLMM <- GRAB.NullModel(
  OrdinalPheno ~ AGE + GENDER,
  data = PhenoData,
  subjIDcol = "IID",
  method = "POLMM",
  traitType = "ordinal",
  GenoFile = GenoFileStep1,
  SparseGRMFile = SparseGRMFile,
  control = list(tolTau = 0.2, tolBeta = 0.1)
)

# Step 2: run the region-based tests
GRAB.Region(obj.POLMM, GenoFileStep2, OutputFile,
  GroupFile = GroupFile,
  SparseGRMFile = SparseGRMFile,
  MaxMAFVec = "0.01,0.005"
)

# Inspect the four output files
head(data.table::fread(OutputFile))
head(data.table::fread(paste0(OutputFile, ".markerInfo")))
head(data.table::fread(paste0(OutputFile, ".otherMarkerInfo")))
head(data.table::fread(paste0(OutputFile, ".infoBurdenNoWeight")))
