BIGr (version 0.6.2)

check_homozygous_trios: Check Homozygous Loci in Trios

Description

This function analyzes the segregation of homozygous loci in trios (two parents and one progeny) using genotype data from a VCF file. For each trio, it calculates the percentage of loci, among those where both tested parents are homozygous, at which the progeny genotype matches the expected segregation pattern.

Usage

check_homozygous_trios(
  path.vcf,
  ploidy = 4,
  parents_candidates = NULL,
  progeny_candidates = NULL,
  verbose = TRUE
)

Value

A data frame with the following columns:

  • parent1: The name of the first parent in the pair.

  • parent2: The name of the second parent in the pair.

  • progeny: The name of the progeny sample.

  • homoRef_x_homoRef_n: Number of loci where both parents are homozygous reference.

  • homoRef_x_homoRef_match: Percentage of matching loci in the progeny for homozygous reference parents.

  • homoAlt_x_homoAlt_n: Number of loci where both parents are homozygous alternate.

  • homoAlt_x_homoAlt_match: Percentage of matching loci in the progeny for homozygous alternate parents.

  • homoRef_x_homoAlt_n: Number of loci where one parent is homozygous reference and the other is homozygous alternate.

  • homoRef_x_homoAlt_match: Percentage of matching loci in the progeny for mixed homozygous parents.

  • homoalt_x_homoRef_n: Number of loci where one parent is homozygous alternate and the other is homozygous reference.

  • homoalt_x_homoRef_match: Percentage of matching loci in the progeny for mixed homozygous parents (alternate-reference).

  • missing: The number of loci with missing genotype data in the comparison.
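The `_n` and `_match` columns above can be illustrated with a toy calculation in base R. This is only a sketch under stated assumptions, not the package's implementation: genotypes are assumed to be alternate-allele dosages (0 = homozygous reference, `ploidy` = homozygous alternate), the expected progeny dosage is assumed to be the mean of the two parental dosages, and a locus is counted as missing if any member of the trio is `NA`. The helpers `expected_dosage` and `match_pct` are hypothetical names, and the dosage vectors are made up.

```r
# Toy trio; ASSUMPTION: values are alt-allele dosages, NA = missing genotype
ploidy  <- 2
p1      <- c(0, 0, 2, 2, 0, 1, NA)
p2      <- c(0, 0, 2, 0, 2, 0, 0)
progeny <- c(0, 1, 2, 1, 1, 0, 0)

# Hypothetical helper: each homozygous parent passes on half of its dosage,
# so the expected progeny dosage is the mean of the parental dosages
expected_dosage <- function(d1, d2) (d1 + d2) / 2

# Hypothetical helper: count loci where parent 1 has dosage d1 and parent 2
# has dosage d2, and report the percentage of progeny calls that match
match_pct <- function(p1, p2, progeny, d1, d2) {
  keep <- !is.na(p1) & !is.na(p2) & !is.na(progeny) & p1 == d1 & p2 == d2
  n <- sum(keep)
  pct <- if (n > 0) 100 * mean(progeny[keep] == expected_dosage(d1, d2)) else NA_real_
  c(n = n, match = pct)
}

ref_ref <- match_pct(p1, p2, progeny, 0, 0)            # both homozygous reference
alt_alt <- match_pct(p1, p2, progeny, ploidy, ploidy)  # both homozygous alternate
ref_alt <- match_pct(p1, p2, progeny, 0, ploidy)       # reference x alternate
alt_ref <- match_pct(p1, p2, progeny, ploidy, 0)       # alternate x reference

# ASSUMPTION: a locus is "missing" if any of the three genotypes is NA
n_missing <- sum(is.na(p1) | is.na(p2) | is.na(progeny))
```

Loci where either parent is heterozygous (the sixth locus here) fall into none of the four categories, which is why only homozygous parental loci contribute to the percentages.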

Arguments

path.vcf

A string specifying the path to the VCF file containing genotype data.

ploidy

An integer specifying the ploidy level of the samples. Default is 4.

parents_candidates

A character vector of parent sample names to be tested. Although the default is NULL, this argument must be provided.

progeny_candidates

A character vector of progeny sample names to be tested. Although the default is NULL, this argument must be provided.

verbose

A logical value indicating whether to print the number of combinations tested. Default is TRUE.

Details

This function validates the segregation of homozygous loci in trios by checking that progeny genotypes agree with the patterns expected from the parental genotypes. Both parent and progeny candidates must be specified. Before running the comparisons, the function checks the ploidy level and confirms that all specified samples are present in the VCF file. Results are reported for each combination of two parents and one progeny. Reciprocal parent comparisons (e.g., A vs. B and B vs. A) and self-comparisons (e.g., A vs. A) are removed to avoid redundancy. Loci with missing genotype data are counted and reported in the results.
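The set of tested combinations described above can be sketched in base R. This is an illustration of the idea rather than the package's own code, and the sample names are placeholders: `combn()` returns each unordered parent pair exactly once, which drops both self-comparisons and reciprocal pairs, and each pair is then combined with every progeny candidate.

```r
# Placeholder sample names, for illustration only
parents <- c("A", "B", "C")
progeny <- c("P1", "P2")

# Each unordered pair appears once: no A-A, and B-A is not repeated after A-B
pairs <- t(combn(parents, 2))

# Cross every parent pair with every progeny candidate
trios <- data.frame(
  parent1 = rep(pairs[, 1], times = length(progeny)),
  parent2 = rep(pairs[, 2], times = length(progeny)),
  progeny = rep(progeny, each = nrow(pairs)),
  stringsAsFactors = FALSE
)

nrow(trios)  # choose(3, 2) parent pairs x 2 progeny
```

With k parent candidates and m progeny candidates this gives choose(k, 2) * m trios, so the number of comparisons grows quadratically with the number of parent candidates.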

Examples

# Example VCF file
example_vcf <- system.file("iris_DArT_VCF.vcf.gz", package = "BIGr")

# Candidate sample names must match sample names in the VCF file
parents_candidates <- paste0("Sample_", 1:10)
progeny_candidates <- paste0("Sample_", 11:20)

# Check homozygous loci in trios
check_tab <- check_homozygous_trios(path.vcf = example_vcf,
                                   ploidy = 2,
                                   parents_candidates = parents_candidates,
                                   progeny_candidates = progeny_candidates)
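
One possible way to use the returned table is to rank the candidate parent pairs for each progeny by one of the match columns. The sketch below does this on a small made-up data frame that only mimics a few of the documented columns; `mock_tab` and its values are invented for illustration and are not output of `check_homozygous_trios`. The choice of `homoRef_x_homoAlt_match` as the ranking column is likewise an assumption.

```r
# Made-up rows that mimic some documented columns; NOT real function output
mock_tab <- data.frame(
  parent1 = c("Sample_1", "Sample_1", "Sample_2"),
  parent2 = c("Sample_2", "Sample_3", "Sample_3"),
  progeny = "Sample_11",
  homoRef_x_homoAlt_n = c(120, 95, 110),
  homoRef_x_homoAlt_match = c(98.3, 41.1, 37.3),
  stringsAsFactors = FALSE
)

# Sort by progeny, then by descending match percentage
ranked <- mock_tab[order(mock_tab$progeny, -mock_tab$homoRef_x_homoAlt_match), ]

# Keep the top-ranked parent pair for each progeny
best <- ranked[!duplicated(ranked$progeny), ]
```

The `_n` column is worth reading alongside the percentage, since a high match based on very few loci is weak evidence.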
