GRAB (version 0.2.4)

GRAB.SimuGMat: Simulate genotype data matrix for related and unrelated subjects

Description

Generates genotype data for association studies, supporting both unrelated subjects and family-based designs with various pedigree structures.

Usage

GRAB.SimuGMat(
  nSub,
  nFam,
  FamMode,
  nSNP,
  MaxMAF = 0.5,
  MinMAF = 0.05,
  MAF = NULL
)

Value

List containing:

GenoMat

Numeric genotype matrix (subjects × markers) with values 0, 1, 2.

markerInfo

Data frame with marker IDs and MAF values.

Arguments

nSub

Number of unrelated subjects. If 0, all subjects are family members.

nFam

Number of families. If 0, all subjects are unrelated.

FamMode

Family structure: "4-members", "10-members", or "20-members". See Details for pedigree structures.

nSNP

Number of genetic markers to simulate.

MaxMAF

Maximum minor allele frequency for simulation (default: 0.5).

MinMAF

Minimum minor allele frequency for simulation (default: 0.05).

MAF

Optional vector of specific MAF values for each marker. If provided, MaxMAF and MinMAF are ignored.

Details

Genotypes are simulated under Hardy-Weinberg equilibrium. Unless MAF is supplied, each marker's MAF is drawn from Uniform(MinMAF, MaxMAF).
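
The sampling model described above can be sketched in a few lines of base R. This is an illustration only, not GRAB's internal code: the helper name simulate_hwe_genotypes and its arguments are hypothetical, and it covers unrelated subjects only. Under Hardy-Weinberg equilibrium the minor-allele count at a marker is Binomial(2, MAF).

```r
# Hypothetical helper (not part of GRAB): genotypes of unrelated subjects under HWE.
simulate_hwe_genotypes <- function(n_sub, n_snp, min_maf = 0.05, max_maf = 0.5) {
  # One MAF per marker, drawn from Uniform(min_maf, max_maf).
  maf <- runif(n_snp, min = min_maf, max = max_maf)
  # Under HWE the minor-allele count is Binomial(2, maf).
  # rep(maf, each = n_sub) lines each marker's MAF up with one matrix column.
  geno <- matrix(rbinom(n_sub * n_snp, size = 2, prob = rep(maf, each = n_sub)),
                 nrow = n_sub, ncol = n_snp)
  list(GenoMat = geno, MAF = maf)
}

set.seed(1)
sim <- simulate_hwe_genotypes(n_sub = 500, n_snp = 20)
dim(sim$GenoMat)    # subjects x markers
table(sim$GenoMat)  # counts of genotypes 0, 1, 2
```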

Family Structures:

  • 4-members: 1+2→3+4 (parents 1,2 → offspring 3,4)

  • 10-members: 1+2→5+6, 3+5→7+8, 4+6→9+10

  • 20-members: Complex multi-generational pedigree with 20 members
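
As an illustration of how genotypes can be passed down a pedigree, the following base R sketch simulates one 4-member family by gene dropping: founders receive two independent alleles per marker, and each offspring inherits one randomly chosen allele from each parent. It is a sketch under stated assumptions (unlinked markers, hypothetical helper name simulate_family4), not the method GRAB necessarily uses.

```r
# Hypothetical sketch (not GRAB's internal code): one 4-member family,
# parents 1 and 2, offspring 3 and 4.
simulate_family4 <- function(maf) {
  n_snp <- length(maf)
  # A founder carries two independent alleles per marker: a 2 x n_snp 0/1 matrix.
  draw_founder <- function() {
    matrix(rbinom(2 * n_snp, size = 1, prob = rep(maf, each = 2)), nrow = 2)
  }
  # Pick one of the parent's two alleles at random at every marker
  # (markers are assumed unlinked).
  transmit <- function(parent) {
    parent[cbind(sample(1:2, n_snp, replace = TRUE), seq_len(n_snp))]
  }
  draw_child <- function(p1, p2) rbind(transmit(p1), transmit(p2))

  p1 <- draw_founder()
  p2 <- draw_founder()
  members <- list(p1, p2, draw_child(p1, p2), draw_child(p1, p2))
  # Genotype = number of minor alleles; rows are family members 1 to 4.
  do.call(rbind, lapply(members, colSums))
}

set.seed(2)
fam <- simulate_family4(rep(0.3, 1000))
fam[, 1:8]
```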

Total subjects: nSub + nFam × family_size, where family_size is 4, 10, or 20 according to FamMode.
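
The row count implied by this formula can be checked before running a large simulation. The helper below is hypothetical (it is not exported by GRAB) and simply maps each FamMode label to its family size.

```r
# Hypothetical helper (not part of GRAB): expected number of subjects.
expected_n_subjects <- function(nSub, nFam, FamMode) {
  # [[ ]] raises an error for an unknown FamMode label.
  family_size <- c("4-members" = 4, "10-members" = 10, "20-members" = 20)[[FamMode]]
  nSub + nFam * family_size
}

expected_n_subjects(100, 10, "10-members")  # 100 + 10 * 10 = 200
```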

See Also

GRAB.makePlink for converting to PLINK format.

Examples

nSub <- 100
nFam <- 10
FamMode <- "10-members"
nSNP <- 10000
OutList <- GRAB.SimuGMat(nSub, nFam, FamMode, nSNP)
GenoMat <- OutList$GenoMat
markerInfo <- OutList$markerInfo
GenoMat[1:10, 1:10]
head(markerInfo)

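## Compute the genetic relationship matrix (GRM) from the simulated genotypes
MAF <- colMeans(GenoMat) / 2  # sample allele frequency of each marker
## Standardize each marker: center by 2 * MAF, scale by sqrt(2 * MAF * (1 - MAF))
GenoMatSD <- t((t(GenoMat) - 2 * MAF) / sqrt(2 * MAF * (1 - MAF)))
GRM <- tcrossprod(GenoMatSD) / ncol(GenoMat)
## Inspect two 10 x 10 blocks of the GRM (subjects 1-10 and subjects 101-110)
GRM1 <- GRM[1:10, 1:10]
GRM2 <- GRM[100 + 1:10, 100 + 1:10]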
GRM1
GRM2
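
For comparison with an empirical GRM block, the expected relatedness within one 10-member family can be derived from the pedigree in Details (1+2→5+6, 3+5→7+8, 4+6→9+10) using the standard recursive kinship calculation. This is a sketch, not part of GRAB; the vectors parent1 and parent2 are illustrative names, and which rows of GenoMat belong to one family depends on the row order returned by GRAB.SimuGMat, which is not stated here.

```r
# Parents of each member in the 10-member pedigree (0 marks a founder).
# Members must be listed so that parents come before their offspring.
parent1 <- c(0, 0, 0, 0, 1, 1, 3, 3, 4, 4)
parent2 <- c(0, 0, 0, 0, 2, 2, 5, 5, 6, 6)

n <- length(parent1)
kin <- matrix(0, n, n)  # kinship coefficients
for (i in seq_len(n)) {
  if (parent1[i] == 0) {
    # Founder: non-inbred and unrelated to every member listed before it.
    kin[i, i] <- 0.5
  } else {
    a <- parent1[i]
    b <- parent2[i]
    # Kinship with an earlier member j is the average over the two parents.
    for (j in seq_len(i - 1)) {
      kin[i, j] <- kin[j, i] <- 0.5 * (kin[a, j] + kin[b, j])
    }
    kin[i, i] <- 0.5 * (1 + kin[a, b])
  }
}

# Expected GRM entries are twice the kinship coefficients.
expected_grm <- 2 * kin
expected_grm
```

In this matrix, parent-offspring and full-sibling pairs have expected value 0.5, grandparent-grandchild pairs 0.25, and first cousins (for example members 7 and 9) 0.125. Empirical GRM entries computed from a finite number of markers scatter around these values.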
