GRAB (version 0.2.3)

GRAB.Region: Conduct region-level genetic association testing

Description

Test for association between a phenotype of interest and regions containing multiple genetic markers (mostly low-frequency or rare variants).

Usage

GRAB.Region(
  objNull,
  GenoFile,
  GenoFileIndex = NULL,
  OutputFile,
  OutputFileIndex = NULL,
  GroupFile,
  SparseGRMFile = NULL,
  SampleFile = NULL,
  MaxMAFVec = "0.01,0.001,0.0005",
  annoVec = "lof,lof:missense,lof:missense:synonymous",
  chrom = "LOCO=F",
  control = NULL
)

Value

Region-based analysis results are saved into two files: OutputFile and OutputMarkerFile = paste0(OutputFile, ".markerInfo").

OutputMarkerFile has the same format as the output of GRAB.Marker. OutputFile contains the columns below.

Region

Region IDs from GroupFile.

Anno.Type

Annotation type from GroupFile.

maxMAF

The maximum MAF cutoff used to select low-frequency/rare variants for analysis.

nSamples

Number of samples in analysis.

nMarkers

Number of markers whose MAF < control$MaxMAFCutoff and MAC > control$MinMACCutoff. Markers with annotation value <= 0 will be excluded from analysis.

nMarkersURV

Number of Ultra-Rare Variants (URV) whose MAC < control$MinMACCutoff. Markers with annotation value <= 0 will be excluded from analysis.

pval.SKATO

p-values based on SKAT-O method

pval.SKAT

p-values based on SKAT method

pval.Burden

p-values based on Burden test
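
As a rough illustration of how the columns above might be post-processed, the sketch below builds a small synthetic table with the documented column names and keeps, for each region, the row with the smallest SKAT-O p-value. All values and region names are made-up placeholders, not GRAB output; in practice the table would be read from OutputFile.

```r
# Synthetic placeholder table using the documented column names.
# Every value here is invented for illustration only.
res <- data.frame(
  Region      = c("Gene1", "Gene1", "Gene2", "Gene2"),
  Anno.Type   = c("lof", "lof:missense", "lof", "lof:missense"),
  maxMAF      = c(0.01, 0.01, 0.01, 0.01),
  nSamples    = 1000,
  nMarkers    = c(3, 8, 2, 5),
  nMarkersURV = c(4, 9, 1, 6),
  pval.SKATO  = c(0.20, 0.003, 0.65, 0.40),
  pval.SKAT   = c(0.25, 0.004, 0.70, 0.35),
  pval.Burden = c(0.18, 0.010, 0.60, 0.55),
  stringsAsFactors = FALSE
)

# For each region, keep the row with the smallest SKAT-O p-value.
best <- do.call(rbind, lapply(split(res, res$Region), function(d) {
  d[which.min(d$pval.SKATO), ]
}))

print(best[, c("Region", "Anno.Type", "maxMAF", "pval.SKATO")])
```

Note that taking the minimum p-value across several annotation groups and MAF cutoffs is not corrected for multiple testing; the selection here is only a convenient way to browse results.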

Arguments

objNull

the output object of function GRAB.NullModel.

GenoFile

a character string giving the genotype file. Two genotype formats are currently supported: PLINK and BGEN. See GRAB.ReadGeno for more details.

GenoFileIndex

additional index files corresponding to the GenoFile. If NULL (default), the prefix is the same as GenoFile. Check GRAB.ReadGeno for more details.

OutputFile

a character string giving the output file in which to save the analysis results.

OutputFileIndex

a character string giving the output index file that records the end point. If the program ends unexpectedly, the end point lets GRAB determine where to restart the analysis. If NULL (default), OutputFileIndex = paste0(OutputFile, ".index").

GroupFile

a character string giving the region file that specifies the region-marker mapping with annotation information. Each region occupies two or three rows. Only letters, digits, and the symbols :,_+- are supported. Columns are separated by tabs.

SparseGRMFile

a character string giving the sparse GRM file. An example is system.file("SparseGRM","SparseGRM.txt",package="GRAB").

SampleFile

a character string giving a file, with a header, that contains sample information.

MaxMAFVec

a character string of multiple maximum MAF cutoffs (comma separated) used to include markers in region-level analysis. Default value is "0.01,0.001,0.0005".

annoVec

a character string of multiple annotation groups (comma separated) used to include markers in region-level analysis. Default value is "lof,lof:missense,lof:missense:synonymous".

chrom

to be continued

control

a list of parameters for controlling function GRAB.Region. See the Details section for more information.
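
The string-valued arguments above can be checked before a long run. The following is a minimal sketch in base R, not part of GRAB: it splits MaxMAFVec and annoVec on commas, enumerates the combinations of annotation group and MAF cutoff, and checks that tab-separated fields use only the characters that GroupFile is documented to support. Reading a colon-separated group such as "lof:missense" as a set of annotation labels, and assuming that every combination is tested, are interpretations rather than documented behaviour. The helper isValidGroupLine and the sample lines are hypothetical and do not reflect the actual row layout of a group file.

```r
MaxMAFVec <- "0.01,0.001,0.0005"
annoVec   <- "lof,lof:missense,lof:missense:synonymous"

# Split the comma-separated strings into individual cutoffs and groups.
maxMAFs    <- as.numeric(strsplit(MaxMAFVec, ",", fixed = TRUE)[[1]])
annoGroups <- strsplit(annoVec, ",", fixed = TRUE)[[1]]

# Assumption: each group is a ':'-separated set of annotation labels.
annoLabels <- strsplit(annoGroups, ":", fixed = TRUE)

# All (annotation group, MAF cutoff) combinations.
testGrid <- expand.grid(Anno.Type = annoGroups, maxMAF = maxMAFs,
                        stringsAsFactors = FALSE)
print(testGrid)

# Hypothetical helper: TRUE if every tab-separated field in a line uses
# only letters, digits, and the symbols : , _ + -
isValidGroupLine <- function(line) {
  fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
  all(grepl("^[A-Za-z0-9:,_+-]+$", fields))
}

isValidGroupLine("Region1\t1:100:A:G")   # made-up line with allowed characters
isValidGroupLine("Region 1\t1:100:A:G")  # made-up line containing a space
```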

Details

The GRAB package supports the POLMM, SPACox, SPAGRM, SPAmix, and WtCoxG methods. Detailed information about the analysis methods is given in the Details section of GRAB.NullModel. Users do not need to specify the method, because GRAB.Marker and GRAB.Region determine it from class(objNull).

The following details concern the control argument.

For PLINK files, the default control$AlleleOrder = "alt-first"; for BGEN files, the default control$AlleleOrder = "ref-first".

  • AlleleOrder: please refer to the Details section of GRAB.ReadGeno.

The following options customize the quality-control (QC) process.

  • omp_num_threads: (To be added later) a numeric value (default: value from data.table::getDTthreads()) to specify the number of threads in OpenMP for parallel computation.

  • ImputeMethod: a character, "mean", "bestguess" (default), or "drop" (to be added later). Please refer to the Details section of GRAB.ReadGeno.

  • MissingRateCutoff: a numeric value (default=0.15). Markers with missing rate > this value will be excluded from analysis.

  • MinMACCutoff: a numeric value (default=5). Markers with MAC < this value will be treated as Ultra-Rare Variants (URV) and collapsed into a single value.

  • nRegionsEachChunk: number of regions (default=1) in one chunk to output.
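
To make the QC thresholds concrete, here is a small base-R sketch that applies the missing-rate and MAC rules described above to a made-up genotype matrix. It is an illustration only, not GRAB's implementation: the genotype values are invented, the variable names are hypothetical, and MAC is computed from non-missing genotypes, whereas GRAB imputes missing genotypes according to ImputeMethod.

```r
# Toy genotype matrix: rows are samples, columns are markers, entries are
# alternate-allele counts (0, 1, 2) or NA for missing. Values are made up.
geno <- cbind(
  m1 = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0),
  m2 = c(1, 1, 0, 1, 0, 1, 1, 0, 1, 0),
  m3 = c(0, NA, 0, 0, 2, 0, NA, 0, 0, 0)
)

MissingRateCutoff <- 0.15  # documented default
MinMACCutoff      <- 5     # documented default

missingRate <- colMeans(is.na(geno))           # fraction of missing genotypes
altCounts   <- colSums(geno, na.rm = TRUE)     # alternate-allele counts
nAlleles    <- 2 * colSums(!is.na(geno))       # observed alleles per marker
mac         <- pmin(altCounts, nAlleles - altCounts)  # minor allele count

# Markers above the missing-rate cutoff are excluded; markers with
# MAC below MinMACCutoff are treated as URVs; the rest are kept as is.
status <- ifelse(missingRate > MissingRateCutoff, "excluded",
                 ifelse(mac < MinMACCutoff, "URV", "kept"))

print(data.frame(missingRate, mac, status))
```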

The following options apply to kernel-based approaches, including SKAT and SKAT-O. For more details, please refer to the SKAT package.

  • kernel: a type of kernel (default="linear.weighted").

  • weights_beta: a numeric vector of parameters for the beta weights for the weighted kernels (default=c(1, 25)). To supply your own weights, use control$weights instead; weights_beta is ignored if control$weights is not NULL.

  • weights: a numeric vector of weights for the weighted kernels. If it is NULL (default), beta weights with the control$weights_beta parameters are used.

  • r.corr: the rho parameter for the compound symmetric correlation structure kernels (default=c(0, 0.1^2, 0.2^2, 0.3^2, 0.4^2, 0.5^2, 0.5, 1)). If you give a vector value, SKAT will conduct the optimal test. It will be ignored if method="optimal" or method="optimal.adj".
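
For intuition about the default beta weights, the sketch below evaluates the Beta(1, 25) density at a few MAF values, which is the convention used by the SKAT package for its weights.beta argument. That GRAB applies weights_beta in exactly the same way is an assumption here, and the MAF values are arbitrary examples.

```r
weights_beta <- c(1, 25)                 # documented default
maf <- c(0.0005, 0.001, 0.005, 0.01)     # arbitrary example MAFs

# Beta-density weights: rarer variants receive larger weights.
w <- dbeta(maf, weights_beta[1], weights_beta[2])
print(data.frame(maf, weight = w))

# The documented default grid of rho values for r.corr.
r.corr <- c(0, 0.1^2, 0.2^2, 0.3^2, 0.4^2, 0.5^2, 0.5, 1)
print(r.corr)
```

With shape parameters (1, 25) the density is strictly decreasing on (0, 1), so the weight falls as MAF rises. In the SKAT framework, rho = 0 corresponds to SKAT and rho = 1 to the Burden test.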

The following option customizes the columns in the OutputMarkerFile. The columns Marker, Info, AltFreq, AltCounts, MissingRate, and Pvalue are included for all methods.

  • outputColumns: For example, for the POLMM method, users can set control$outputColumns = c("beta", "seBeta", "AltFreqInGroup"):

    • POLMM: Default: beta, seBeta; Optional: zScore, AltFreqInGroup, nSamplesInGroup, AltCountsInGroup

    • SPACox: Optional: zScore
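
Putting the options above together, a control list might look like the following sketch. The element names are taken from the bullets in this section; the chosen values are arbitrary examples rather than recommendations, and whether every element is accepted by a given GRAB version should be checked against the installed package.

```r
# Example control list assembled from the options documented above.
# Values are illustrative, not recommended settings.
control <- list(
  ImputeMethod      = "mean",
  MissingRateCutoff = 0.10,
  MinMACCutoff      = 10,
  nRegionsEachChunk = 1,
  kernel            = "linear.weighted",
  weights_beta      = c(1, 25),
  outputColumns     = c("beta", "seBeta", "AltFreqInGroup")
)

str(control)
```

The list would then be passed as control = control in the call to GRAB.Region; options left out of the list keep their defaults.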

Examples

# Load a precomputed example object to perform step 2 without repeating step 1
objNullFile <- system.file("extdata", "objPOLMMnull.RData", package = "GRAB")
load(objNullFile)
class(obj.POLMM) # "POLMM_NULL_Model" is an object from POLMM method.

OutputDir <- tempdir()
OutputFile <- file.path(OutputDir, "simuRegionOutput.txt")
GenoFile <- system.file("extdata", "simuPLINK_RV.bed", package = "GRAB")
GroupFile <- system.file("extdata", "simuPLINK_RV.group", package = "GRAB")
SparseGRMFile <- system.file("extdata", "SparseGRM.txt", package = "GRAB")

GRAB.Region(
  objNull = obj.POLMM,
  GenoFile = GenoFile,
  OutputFile = OutputFile,
  GroupFile = GroupFile,
  SparseGRMFile = SparseGRMFile,
  MaxMAFVec = "0.01,0.005"
)

data.table::fread(OutputFile)
data.table::fread(paste0(OutputFile, ".markerInfo"))
data.table::fread(paste0(OutputFile, ".otherMarkerInfo"))
data.table::fread(paste0(OutputFile, ".index"), sep = "\t", header = FALSE)