GRAB (version 0.2.3)

GRAB.Marker: Conduct marker-level genetic association testing

Description

Performs a genome-wide association study (GWAS) by testing the association between a trait and each genetic marker individually.

Usage

GRAB.Marker(
  objNull,
  GenoFile,
  GenoFileIndex = NULL,
  OutputFile,
  OutputFileIndex = NULL,
  control = NULL
)

Value

The analysis results are written to OutputFile, which includes the following columns:

Marker

Marker IDs extracted from GenoFile and GenoFileIndex.

Info

Marker information in format "CHR:POS:REF:ALT". The order of REF/ALT depends on control$AlleleOrder: "ref-first" or "alt-first".

AltFreq

Alternative allele frequency (before genotype imputation, might be > 0.5). If most markers have AltFreq > 0.5, consider resetting control$AlleleOrder.

AltCounts

Alternative allele counts (before genotype imputation).

MissingRate

Missing rate for each marker.

Pvalue

Association test p-value.
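
As an illustration of working with these columns after a run, the sketch below splits the Info field into its components and checks whether most markers have AltFreq > 0.5. The data frame res is invented for the example and only mimics the documented column layout; real results would be read from OutputFile. The column names assigned to the split fields assume the "CHR:POS:REF:ALT" format described above.

```r
# Mock results mimicking the documented columns (values are invented for illustration)
res <- data.frame(
  Marker  = c("rs1", "rs2", "rs3"),
  Info    = c("1:1000:A:G", "1:2000:C:T", "2:3000:G:A"),
  AltFreq = c(0.12, 0.81, 0.40),
  stringsAsFactors = FALSE
)

# Split "CHR:POS:REF:ALT" into separate columns
# (assumes the documented field order; REF/ALT order follows control$AlleleOrder)
parts <- do.call(rbind, strsplit(res$Info, ":", fixed = TRUE))
colnames(parts) <- c("CHR", "POS", "REF", "ALT")
res <- cbind(res, as.data.frame(parts, stringsAsFactors = FALSE))
res$POS <- as.integer(res$POS)

# Share of markers whose alternative allele is the more common allele
propAltMajor <- mean(res$AltFreq > 0.5)
if (propAltMajor > 0.5) {
  message("Most markers have AltFreq > 0.5; control$AlleleOrder may need to be reset.")
}
```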

The following columns can be customized using control$outputColumns. See makeGroup for details about phenotype grouping, which is used for nSamplesInGroup, AltCountsInGroup, and AltFreqInGroup.

beta

Estimated effect size of the ALT allele.

seBeta

Estimated standard error of the effect size.

zScore

Standardized score statistic, which usually follows a standard normal distribution.

nSamplesInGroup

Number of subjects in different phenotype groups. This may differ slightly from the original distribution due to missing genotypes.

AltCountsInGroup

Alternative allele counts (before genotype imputation) in different phenotype groups.

AltFreqInGroup

Alternative allele frequency (before genotype imputation) in different phenotype groups.
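
Because zScore is described as approximately standard normal, a two-sided p-value under that reference distribution can be sketched as follows. The z-scores are invented, and this normal approximation is only an illustration: it is not claimed to reproduce the Pvalue column, which may be computed with a different approximation depending on the method.

```r
# Invented z-scores for illustration
zScore <- c(-1.96, 0, 3.5)

# Two-sided p-value under a standard normal reference distribution
pNormal <- 2 * pnorm(-abs(zScore))
```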

Arguments

objNull

The output object from function GRAB.NullModel.

GenoFile

A character string specifying the genotype file path. Currently, two genotype formats are supported: PLINK and BGEN. See GRAB.ReadGeno for details.

GenoFileIndex

Additional index files corresponding to GenoFile. If NULL (default), the same prefix as GenoFile is used. See GRAB.ReadGeno for details.

OutputFile

A character string specifying the output file path to save analysis results.

OutputFileIndex

A character string specifying the output index file to record the progress. If the program terminates unexpectedly, this helps GRAB understand where to restart the analysis. If NULL (default), OutputFileIndex = paste0(OutputFile, ".index").

control

A list of parameters for controlling GRAB.Marker function behavior. See the Details section for more information.
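
The default for OutputFileIndex described above can be reproduced in a few lines. This is a sketch of the documented rule only, not GRAB's internal code; the file name is the one used in the Examples section.

```r
# Output path as in the Examples section
OutputFile <- file.path(tempdir(), "simuOUTPUT.txt")

# When OutputFileIndex is NULL, the documented default appends ".index" to OutputFile
OutputFileIndex <- NULL
if (is.null(OutputFileIndex)) {
  OutputFileIndex <- paste0(OutputFile, ".index")
}

basename(OutputFileIndex)
```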

Details

The GRAB package supports multiple statistical methods: POLMM, SPACox, SPAGRM, SPAmix, and WtCoxG. Detailed information about these analysis methods is provided in the Details section of GRAB.NullModel. Users do not need to specify the method explicitly since GRAB.Marker and GRAB.Region automatically detect it from class(objNull).
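
A minimal sketch of how a method could be inferred from the class of the null-model object. The class name POLMM_NULL_Model is taken from the Examples section; the assumption that the other methods follow the same "<METHOD>_NULL_Model" naming pattern, and the helper detectMethod itself, are hypothetical and not GRAB's actual implementation.

```r
# Hypothetical helper: map the class of a null-model object to a method name.
# Assumes classes are named "<METHOD>_NULL_Model", which is only confirmed for POLMM.
detectMethod <- function(objNull) {
  supported <- c("POLMM", "SPACox", "SPAGRM", "SPAmix", "WtCoxG")
  method <- sub("_NULL_Model$", "", class(objNull)[1])
  if (!method %in% supported) {
    stop("Unsupported null-model class: ", class(objNull)[1])
  }
  method
}

# Mock object standing in for the output of GRAB.NullModel
mockNull <- structure(list(), class = "POLMM_NULL_Model")
detectMethod(mockNull)
```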

Control Parameters

The following parameters allow users to customize which markers to include in the analysis. If these parameters are not specified, GRAB will analyze all markers in the file. For PLINK files, the default is control$AlleleOrder = "alt-first"; for BGEN files, the default is control$AlleleOrder = "ref-first".

  • IDsToIncludeFile: See the Details section of GRAB.ReadGeno.

  • IDsToExcludeFile: See the Details section of GRAB.ReadGeno.

  • RangesToIncludeFile: See the Details section of GRAB.ReadGeno.

  • RangesToExcludeFile: See the Details section of GRAB.ReadGeno.

  • AlleleOrder: See the Details section of GRAB.ReadGeno.

The following parameters customize the quality control (QC) process:

  • ImputeMethod: A character string specifying imputation method: "mean" (default), "bestguess", or "drop". See the Details section of GRAB.ReadGeno.

  • MissingRateCutoff: A numeric value (default=0.15). Markers with missing rate exceeding this value will be excluded from analysis.

  • MinMAFCutoff: A numeric value (default=0.001). Markers with minor allele frequency (MAF) below this value will be excluded from analysis.

  • MinMACCutoff: A numeric value (default=20). Markers with minor allele count (MAC) below this value will be excluded from analysis.

  • nMarkersEachChunk: Number of markers (default=10000) processed in each output chunk.
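
The three QC cutoffs above can be illustrated on a toy genotype matrix in base R. This is a sketch under stated assumptions: the genotypes are invented, the cutoffs are chosen to suit eight subjects rather than being the GRAB defaults, and the statistics are computed on observed (non-missing) genotypes, which may not match GRAB's internal computation exactly.

```r
# Invented genotype matrix: rows are subjects, columns are markers; NA is missing
geno <- cbind(
  m1 = c(0, 1, 2, 0, NA, 1, 0, 0),
  m2 = c(0, 0, 0, 0, 0, 0, 0, 0),
  m3 = c(NA, NA, 1, 2, 2, 2, 1, 2)
)

# Cutoffs chosen for this toy data set (not the GRAB defaults)
MissingRateCutoff <- 0.15
MinMAFCutoff <- 0.001
MinMACCutoff <- 2

nAlleles    <- 2 * colSums(!is.na(geno))      # observed alleles per marker
missingRate <- colMeans(is.na(geno))          # fraction of missing genotypes
altCounts   <- colSums(geno, na.rm = TRUE)    # alternative allele counts
altFreq     <- altCounts / nAlleles           # alternative allele frequency
maf         <- pmin(altFreq, 1 - altFreq)     # minor allele frequency
mac         <- pmin(altCounts, nAlleles - altCounts)  # minor allele count

# A marker is kept only if it passes all three filters
keep <- missingRate <= MissingRateCutoff &
  maf >= MinMAFCutoff &
  mac >= MinMACCutoff
```

In this toy data set, m2 is monomorphic and m3 has a missing rate of 0.25, so only m1 passes.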

The following parameters customize the columns in the OutputFile. The columns Marker, Info, AltFreq, AltCounts, MissingRate, and Pvalue are included for all methods.

  • outputColumns: A character vector specifying additional columns to include in the output. For example, for the POLMM method, users can set control$outputColumns = c("beta", "seBeta", "AltFreqInGroup"). The columns available for each method are:

    • POLMM: Default: beta, seBeta; Optional: zScore, AltFreqInGroup, nSamplesInGroup, AltCountsInGroup

    • SPACox: Optional: zScore
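
Putting the parameters above together, a control list for a POLMM analysis might look like the sketch below. The parameter names and default values are those documented in this section; the check of outputColumns against the POLMM column list is an illustrative addition, not something GRAB is claimed to do in this way.

```r
# Control list spelling out the documented defaults, plus extra output columns
control <- list(
  ImputeMethod      = "mean",
  MissingRateCutoff = 0.15,
  MinMAFCutoff      = 0.001,
  MinMACCutoff      = 20,
  nMarkersEachChunk = 10000,
  outputColumns     = c("beta", "seBeta", "zScore", "AltFreqInGroup")
)

# Columns documented for POLMM (default plus optional)
polmmColumns <- c(
  "beta", "seBeta", "zScore",
  "AltFreqInGroup", "nSamplesInGroup", "AltCountsInGroup"
)

# Illustrative sanity check before starting a long run
unknownColumns <- setdiff(control$outputColumns, polmmColumns)
if (length(unknownColumns) > 0) {
  stop("Columns not documented for POLMM: ", paste(unknownColumns, collapse = ", "))
}
```

The list would then be passed to GRAB.Marker through its control argument.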

Examples

# Load a precomputed POLMM_NULL_Model object to perform step 2 without repeating step 1
objNullFile <- system.file("extdata", "objPOLMMnull.RData", package = "GRAB")
load(objNullFile)
class(obj.POLMM)

GenoFile <- system.file("extdata", "simuPLINK.bed", package = "GRAB")
OutputFile <- file.path(tempdir(), "simuOUTPUT.txt")
# Request the default and all optional POLMM output columns
outputColumns <- c(
  "beta", "seBeta", "zScore",
  "nSamplesInGroup", "AltCountsInGroup", "AltFreqInGroup"
)

GRAB.Marker(
  obj.POLMM,
  GenoFile = GenoFile,
  OutputFile = OutputFile,
  control = list(outputColumns = outputColumns)
)

# Read the results back for inspection
data.table::fread(OutputFile)