plinkQC (version 1.0.0)

Genotype Quality Control with 'PLINK'

Description

Genotyping arrays enable the direct measurement of an individual's genotype at thousands of markers. 'plinkQC' facilitates genotype quality control for genetic association studies as described by Anderson and colleagues (2010). It makes basic 'PLINK' statistics (e.g. missing genotyping rates per individual, allele frequencies per genetic marker) and relationship functions accessible from 'R' and generates a per-individual and per-marker quality control report. Individuals and markers that fail the quality control can subsequently be removed to generate a new, clean dataset. Removal of individuals based on relationship status is optimised to retain as many individuals as possible in the study. Additionally, there is a trained classifier to predict genomic ancestry of human samples.
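To make the basic statistics mentioned above concrete, the following is a minimal base-R sketch on a toy genotype matrix. It does not use 'plinkQC' or 'PLINK' (the package delegates these calculations to 'PLINK'); the matrix, the object names and the thresholds are all hypothetical and chosen only for illustration.

```r
# Toy genotype matrix (hypothetical data): rows = individuals, columns = markers.
# Entries count copies of the alternate allele (0, 1, 2); NA = missing call.
geno <- matrix(
  c(0,  1, 2, NA,
    1,  1, 0,  0,
    2, NA, 0,  1,
    0,  0, 0, NA),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), paste0("snp", 1:4))
)

# Missing genotype rate per individual: fraction of markers without a call.
miss_per_individual <- rowMeans(is.na(geno))

# Missing genotype rate per marker: fraction of individuals without a call.
miss_per_marker <- colMeans(is.na(geno))

# Alternate allele frequency per marker, ignoring missing calls.
# The minor allele frequency is the smaller of p and 1 - p.
alt_freq <- colMeans(geno, na.rm = TRUE) / 2
maf <- pmin(alt_freq, 1 - alt_freq)

# Flag markers against illustrative thresholds (assumed values, not package defaults).
fail_marker <- miss_per_marker > 0.25 | maf < 0.05
```

In this toy example only `snp4` is flagged, because half of its genotype calls are missing.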

Install

install.packages('plinkQC')

Monthly Downloads

381

Version

1.0.0

License

MIT + file LICENSE

Maintainer

Hannah Meyer

Last Published

November 25th, 2025

Functions in plinkQC (1.0.0)

overviewPerIndividualQC
Overview of per sample QC

evaluate_check_sex
Evaluate results from PLINK sex check.

cleanData
Create plink dataset with individuals and markers passing quality control

overviewPerMarkerQC
Overview of per marker QC

evaluate_check_het_and_miss
Evaluate results from PLINK missing genotype and heterozygosity rate check.

check_sex
Identification of individuals with discordant sex information

evaluate_check_relatedness
Evaluate results from PLINK IBD estimation.

convert_to_plink2
Converting PLINK v1.9 data files into PLINK v2.0 data files

evaluate_ancestry_prediction
Predicting sample superpopulation ancestry

check_snp_missingness
Identification of SNPs with high missingness rate

relatednessFilter
Remove related individuals while keeping maximum number of individuals

run_check_heterozygosity
Run PLINK heterozygosity rate calculation

run_ancestry_prediction
Projecting the study data set onto the PC space of the reference dataset

run_ancestry_format
Running functions to format data for ancestry prediction

run_check_missingness
Run PLINK missingness rate calculation

plinkQC-package
plinkQC: Genotype Quality Control with 'PLINK'

rename_variant_identifiers
Renaming variants

pruning_ld
Pruning of SNPs in Linkage Disequilibrium

perMarkerQC
Quality control for all markers in plink-dataset

perIndividualQC
Quality control for all individuals in plink-dataset

testNumerics
Test lists for different properties of numerics

run_check_relatedness
Run PLINK IBD estimation

run_check_sex
Run PLINK sexcheck

checkFiltering
Check and construct PLINK sample and marker filters

checkRemoveIDs
Check and construct individual IDs to be removed

checkPlink2
Check PLINK2 software access

check_maf
Identification of SNPs with low minor allele frequency

check_het_and_miss
Identification of individuals with outlying missing genotype or heterozygosity rates

ancestry_prediction
Predicting sample superpopulation ancestry

checkPlink
Check PLINK software access

checkLoadingMat
Checking the path of the loading matrix

check_hwe
Identification of SNPs showing a significant deviation from Hardy-Weinberg equilibrium (HWE)

check_relatedness
Identification of related individuals
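
The idea behind removing related individuals while retaining as many samples as possible (see relatednessFilter in the list above) can be illustrated with a simple greedy heuristic. This base-R sketch is not the package's implementation, which may use different rules and additional inputs; the function name `greedy_relatedness_filter`, the column names and the example pairs are hypothetical.

```r
# Hypothetical pairs of individuals whose estimated relatedness exceeds
# some chosen threshold (for example, pairs reported by an IBD estimation).
related_pairs <- data.frame(
  id1 = c("A", "A", "A", "D"),
  id2 = c("B", "C", "D", "E"),
  stringsAsFactors = FALSE
)

# Greedy heuristic: repeatedly remove the individual that appears in the
# largest number of remaining related pairs, until no related pair is left.
greedy_relatedness_filter <- function(pairs) {
  removed <- character(0)
  while (nrow(pairs) > 0) {
    # Count how many remaining pairs each individual is part of.
    counts <- table(c(pairs$id1, pairs$id2))
    # Ties are broken by taking the first ID in sorted order.
    worst <- names(counts)[which.max(counts)]
    removed <- c(removed, worst)
    # Drop every pair that involved the removed individual.
    pairs <- pairs[pairs$id1 != worst & pairs$id2 != worst, , drop = FALSE]
  }
  removed
}

to_remove <- greedy_relatedness_filter(related_pairs)
# to_remove is c("A", "D"): B, C and E are retained.
```

Removing "A" first resolves three related pairs at once, so only two of the five individuals are dropped. Removing "B", "C" and "D" instead would also break every pair, but at the cost of one more sample. A greedy choice like this is not guaranteed to be optimal for every relatedness structure.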