BIGr (version 0.6.2)

imputation_concordance: Calculate Concordance between Imputed and Reference Genotypes

Description

This function calculates the concordance between imputed and reference genotypes. It assumes that samples are rows and markers are columns. Allele dosages (0, 1, 2) are recommended, but other formats also work. If the missing_code argument is supplied, genotypes with that value in either the reference or the imputed data are excluded from the concordance calculation. To exclude a specific subset of markers, pass their IDs through the snps_2_exclude argument.

Usage

imputation_concordance(
  reference_genos,
  imputed_genos,
  missing_code = NULL,
  snps_2_exclude = NULL,
  verbose = FALSE
)

Value

A list with two elements:

  • result_df: A data frame with sample IDs and their concordance percentages.

  • summary_concordance: A summary of concordance percentages, including minimum, maximum, mean, and quartiles.

Arguments

reference_genos

A data frame containing reference genotype data, with rows as samples and columns as markers. Dosage format (0, 1, 2) is recommended.

imputed_genos

A data frame containing imputed genotype data, with rows as samples and columns as markers. Dosage format (0, 1, 2) is recommended.

missing_code

An optional value to specify missing data. If provided, loci with this value in either dataset will be excluded from the concordance calculation.

snps_2_exclude

An optional vector of marker IDs to exclude from the concordance calculation.

verbose

A logical value indicating whether to print a summary of the concordance results. Default is FALSE.

Details

The function identifies common samples and markers between the reference and imputed genotype datasets. It calculates the percentage of matching genotypes for each sample, excluding missing data and specified markers. The concordance is reported as a percentage for each sample, along with a summary of the overall concordance distribution.
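
The steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not BIGr's actual implementation: the helper name concordance_sketch, its exclude argument, the use of row names as sample IDs, and the output column names ID and Concordance are all hypothetical choices made for this example.

```r
# Hypothetical helper, NOT BIGr's implementation: per-sample concordance
# between two genotype tables (samples as rows, markers as columns).
# Assumption: sample IDs are stored as row names in both tables.
concordance_sketch <- function(ref, imp, missing_code = NULL, exclude = NULL) {
  # Restrict both tables to shared samples and shared, non-excluded markers.
  samples <- intersect(rownames(ref), rownames(imp))
  markers <- setdiff(intersect(colnames(ref), colnames(imp)), exclude)
  ref_m <- as.matrix(ref[samples, markers, drop = FALSE])
  imp_m <- as.matrix(imp[samples, markers, drop = FALSE])

  # A locus is comparable when neither table is NA or holds the missing code.
  usable <- !is.na(ref_m) & !is.na(imp_m)
  if (!is.null(missing_code)) {
    usable <- usable & ref_m != missing_code & imp_m != missing_code
  }

  # Percentage of comparable loci with identical genotype calls, per sample.
  matches <- (ref_m == imp_m) & usable
  pct <- 100 * rowSums(matches, na.rm = TRUE) / rowSums(usable)

  list(
    result_df = data.frame(ID = samples, Concordance = pct, row.names = NULL),
    summary_concordance = summary(pct)
  )
}

# Toy data (made up for illustration): 3 samples, 3 markers, 5 = missing.
ref <- data.frame(m1 = c(0, 1, 2), m2 = c(2, 5, 0), m3 = c(1, 1, 1),
                  row.names = c("s1", "s2", "s3"))
imp <- data.frame(m1 = c(0, 2, 2), m2 = c(2, 1, 1), m3 = c(1, 0, 1),
                  row.names = c("s1", "s2", "s3"))

out <- concordance_sketch(ref, imp, missing_code = 5, exclude = "m3")
print(out$result_df)
print(out$summary_concordance)
```

In this sketch, a locus flagged as missing in either table is dropped from both the numerator and the denominator, so each sample's percentage is taken only over the loci that could actually be compared.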

Examples

# Load the package that provides imputation_concordance()
library(BIGr)

# Paths to the example files shipped with BIGr
ignore_file <- system.file("imputation_ignore.txt", package = "BIGr")
ref_file <- system.file("imputation_reference.txt", package = "BIGr")
test_file <- system.file("imputation_test.txt", package = "BIGr")

# Import files
snps <- read.table(ignore_file, header = TRUE)
ref <- read.table(ref_file, header = TRUE)
test <- read.table(test_file, header = TRUE)

# Calculate concordance, treating 5 as the missing-data code
result <- imputation_concordance(reference_genos = ref,
                                 imputed_genos = test,
                                 snps_2_exclude = snps,
                                 missing_code = 5,
                                 verbose = FALSE)
