polyRAD (version 1.3)

ExpectedHindHe: Simulate Data to Get Expected Distribution of Hind/He

Description

These functions were created to help users determine an appropriate cutoff for filtering loci based on \(H_{ind}/H_E\) after running HindHe and InbreedingFromHindHe. ExpectedHindHe takes allele frequencies, sample size, and read depths from a RADdata object, simulates genotypes and allelic read depths from these assuming Mendelian inheritance, and then estimates \(H_{ind}/H_E\) for each simulated locus. SimGenotypes and SimAlleleDepth are internal functions used by ExpectedHindHe but are provided at the user level since they may be more broadly useful.
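
As a point of reference for reading the simulated distribution, the expected value of \(H_{ind}/H_E\) at a locus that follows Mendelian inheritance can be written as below. This is stated here from the polyRAD documentation for HindHe and InbreedingFromHindHe rather than from this page; \(k\) is a symbol introduced here for the ploidy, and \(F\) is the inbreeding coefficient.

\[
E\left[\frac{H_{ind}}{H_E}\right] = \frac{k - 1}{k}\,(1 - F)
\]

For example, a diploid (\(k = 2\)) with \(F = 0\) gives an expectation of 0.5. The simulation shows how far individual loci can be expected to scatter around this value given the sample size and read depths in the dataset.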

Usage

ExpectedHindHe(object, ploidy = object$possiblePloidies[[1]], inbreeding = 0,
               overdispersion = 20, reps = ceiling(5000/nLoci(object)),
               quiet = FALSE, plot = TRUE)

SimGenotypes(alleleFreq, alleles2loc, nsam, inbreeding, ploidy)

SimAlleleDepth(locDepth, genotypes, alleles2loc, overdispersion = 20)

Arguments

object

A RADdata object.

ploidy

A single integer indicating the ploidy to use for genotype simulation.

inbreeding

A number ranging from 0 to 1 indicating the amount of inbreeding (\(F\)). This represents inbreeding from all sources (population structure, self-fertilization, etc.) and can be estimated with InbreedingFromHindHe.

overdispersion

Overdispersion parameter as described in AddGenotypeLikelihood. Lower values will cause allelic read depth distributions to deviate further from expectations based on allele copy number.

reps

The number of times to simulate the data and estimate \(H_{ind}/H_E\). This can generally be left at the default, but set it higher than 1 if you want to see within-locus variance in the estimate.

quiet

Boolean indicating whether to suppress messages and results printed to console.

plot

Boolean indicating whether to plot a histogram of \(H_{ind}/H_E\) values.

alleleFreq

A vector of allele frequencies, as can be found in the $alleleFreq slot of a RADdata object after running AddAlleleFreqHWE.

alleles2loc

An integer vector assigning alleles to loci, as can be found in the $alleles2loc slot of a RADdata object.

nsam

An integer indicating the number of samples (number of taxa) to simulate.

locDepth

An integer matrix indicating read depth at each taxon and locus. Formatted as the $locDepth slot of a RADdata object, notably with columns named by locus number rather than locus name.

genotypes

A numeric matrix, formatted as the output of GetProbableGenotypes or SimGenotypes, indicating genotypes as allele copy number.
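
The effect of the overdispersion argument described above can be illustrated with a small base-R sketch. It assumes a beta-binomial model with shape parameters overdispersion * p and overdispersion * (1 - p), which follows the description in AddGenotypeLikelihood; it is not the code used by SimAlleleDepth, and the helper name sim_allele_reads is hypothetical.

```r
# Base-R illustration only; NOT the polyRAD implementation.
set.seed(42)

# Simulate read counts for one allele across n samples, where p is the
# expected proportion of reads for that allele (0.5 for a diploid
# heterozygote). Assumes a beta-binomial model.
sim_allele_reads <- function(n, depth, p, overdispersion) {
  # Per-sample probability of drawing a read from this allele
  prob <- rbeta(n, overdispersion * p, overdispersion * (1 - p))
  rbinom(n, size = depth, prob = prob)
}

low  <- sim_allele_reads(5000, depth = 40, p = 0.5, overdispersion = 5)
high <- sim_allele_reads(5000, depth = 40, p = 0.5, overdispersion = 100)

# Lower overdispersion values give a wider spread around depth * p = 20
c(sd_overdispersion_5 = sd(low), sd_overdispersion_100 = sd(high))
```

Under this assumed model, both simulations are centered on the same expected read count, and only the sample-to-sample spread changes with the parameter.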

Value

ExpectedHindHe invisibly returns a matrix, with loci in rows and reps in columns, containing \(H_{ind}/H_E\) from the simulated loci.

SimGenotypes returns a numeric matrix of allele copy number, with samples in rows and alleles in columns, similar to that produced by GetProbableGenotypes.

SimAlleleDepth returns an integer matrix of allelic read depth, with samples in rows and alleles in columns, similar to the $alleleDepth slot of a RADdata object.

Examples

library(polyRAD)

# Load dataset for the example
data(exampleRAD)
exampleRAD <- AddAlleleFreqHWE(exampleRAD)

# Simulate genotypes for 10 samples (diploid, inbreeding F = 0.2)
simgeno <- SimGenotypes(exampleRAD$alleleFreq, exampleRAD$alleles2loc,
                        nsam = 10, inbreeding = 0.2, ploidy = 2)

# Simulate reads, using locus depths from the first 10 taxa to match nsam
simreads <- SimAlleleDepth(exampleRAD$locDepth[1:10,], simgeno,
                           exampleRAD$alleles2loc)

# Get expected Hind/He distribution if all loci in exampleRAD were well-behaved
ExpectedHindHe(exampleRAD, reps = 10)
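
Because ExpectedHindHe returns its matrix invisibly, one way to turn the simulation into a filtering cutoff is to capture the result and take quantiles of it. The following is a minimal sketch of that idea; the 0.025 and 0.975 probabilities and the seed are arbitrary choices made for illustration, not recommendations from the package.

```r
library(polyRAD)

data(exampleRAD)
exampleRAD <- AddAlleleFreqHWE(exampleRAD)

# Capture the invisibly returned matrix (loci in rows, reps in columns),
# suppressing the console output and the histogram
set.seed(2024)
simHH <- ExpectedHindHe(exampleRAD, reps = 10, quiet = TRUE, plot = FALSE)

# Central 95% of the simulated Hind/He values across all loci and reps;
# the probabilities are an arbitrary choice for this sketch
cutoffs <- quantile(simHH, probs = c(0.025, 0.975), na.rm = TRUE)
cutoffs
```

Observed loci whose \(H_{ind}/H_E\) falls outside such an interval could then be flagged for closer inspection or filtering, depending on how strict the analysis needs to be.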