inbreedR (version 0.3.0)

simulate_r2_hf: Calculates the expected squared correlation between heterozygosity and inbreeding for simulated marker sets

Description

This function can be used to simulate genotype data, draw random subsamples and calculate the expected squared correlation between heterozygosity and inbreeding ($r^2(h, f)$). Every subset of markers is drawn independently to give insights into the variation and precision of $r^2(h, f)$ calculated from a given number of markers and individuals.

Usage

simulate_r2_hf(n_ind = NULL, H_nonInb = 0.5, meanF = 0.2, varF = 0.03,
  subsets = NULL, reps = 100, type = c("msats", "snps"), CI = 0.95)

Arguments

n_ind
number of individuals to sample from the population
H_nonInb
true genome-wide heterozygosity of a non-inbred individual
meanF
mean realized inbreeding f
varF
variance in realized inbreeding f
subsets
a vector specifying the sizes of marker-subsets to draw. Specifying subsets = c(2, 5, 10, 15, 20) would draw marker sets of 2 to 20 markers. The minimum number of markers is 2.
reps
number of resampling repetitions
type
specifies which g2 formula to use: "snps" for large datasets, "msats" for smaller datasets.
CI
confidence level of the intervals to calculate (defaults to 0.95)

Value

  • simulate_r2_hf returns an object of class "inbreed". The functions `print` and `plot` print a summary and plot the r2(h, f) values with means and confidence intervals.

    An `inbreed` object from simulate_r2_hf is a list containing the following components:

  • call: function call.
  • estMat: matrix with all r2(h, f) estimates. Each row contains the values for a given subset of markers.
  • n_ind: specified number of individuals
  • subsets: vector specifying the marker sets
  • reps: repetitions per subset
  • H_nonInb: true genome-wide heterozygosity of a non-inbred individual
  • meanF: mean realized inbreeding f
  • varF: variance in realized inbreeding f
  • min_val: minimum r2(h, f) value
  • max_val: maximum r2(h, f) value
  • all_CI: confidence intervals for all subsets
  • all_sd: standard deviations for all subsets

Details

The simulate_r2_hf function simulates genotypes from which subsets of loci can be sampled independently. These simulations can be used to evaluate the effects of the number of individuals and loci on the precision and magnitude of the expected squared correlation between heterozygosity and inbreeding ($r^2(h, f)$). The user specifies the number of simulated individuals (n_ind), the subsets of loci (subsets) to be drawn, the heterozygosity of non-inbred individuals (H_nonInb) and the distribution of f among the simulated individuals. The f values of the simulated individuals are sampled randomly from a beta distribution with mean (meanF) and variance (varF) specified by the user (e.g. as in Wang 2011). This enables the simulation to mimic populations with known inbreeding characteristics, or to simulate hypothetical scenarios of interest.

For computational simplicity, allele frequencies are assumed to be constant across all loci and the simulated loci are unlinked. Genotypes (i.e. the heterozygosity/homozygosity status at each locus) are assigned stochastically based on the f values of the simulated individuals. Specifically, the probability of an individual being heterozygous at any given locus ($H$) is expressed as $H = H_0(1-f)$, where $H_0$ is the user-specified heterozygosity of a non-inbred individual and f is an individual's inbreeding coefficient drawn from the beta distribution.
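The simulation scheme described above can be illustrated with a minimal base-R sketch. This is not the inbreedR implementation: the variable names (n_loci, het, r2_for_subset), the moment-matching step for the beta parameters, and the use of the plain sample correlation between per-individual heterozygosity and the known f values are all assumptions made for illustration. The package may estimate r2(h, f) differently and may simulate fresh genotypes for each subset instead of subsampling one matrix.

```r
# Illustrative sketch only; not the code used by simulate_r2_hf.
set.seed(1)

n_ind    <- 200   # number of simulated individuals
n_loci   <- 50    # loci simulated per individual (hypothetical choice)
H_nonInb <- 0.5   # heterozygosity of a non-inbred individual (H0)
meanF    <- 0.2   # mean of f
varF     <- 0.03  # variance of f

# Moment matching: convert mean and variance into beta shape parameters.
# A beta distribution with this mean exists only if
# varF < meanF * (1 - meanF).
stopifnot(varF < meanF * (1 - meanF))
nu     <- meanF * (1 - meanF) / varF - 1
shape1 <- meanF * nu
shape2 <- (1 - meanF) * nu

# One inbreeding coefficient per individual.
f <- rbeta(n_ind, shape1, shape2)

# Probability of being heterozygous at any locus: H = H0 * (1 - f).
H <- H_nonInb * (1 - f)

# Heterozygosity status (1 = heterozygous, 0 = homozygous).
# Rows are individuals, columns are unlinked loci; the matrix is filled
# column by column, so each individual's H is reused for every locus.
het <- matrix(rbinom(n_ind * n_loci, size = 1, prob = rep(H, times = n_loci)),
              nrow = n_ind, ncol = n_loci)

# Squared correlation between heterozygosity and f for a random
# subset of k loci.
r2_for_subset <- function(k) {
  loci <- sample(n_loci, k)
  h <- rowMeans(het[, loci, drop = FALSE])
  cor(h, f)^2
}

r2_by_subset <- sapply(c(5, 10, 25, 50), r2_for_subset)
r2_by_subset
```

Repeating the last step many times per subset size (as reps does in simulate_r2_hf) would give a distribution of values from which means and confidence intervals can be computed.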

Examples

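library(inbreedR)

# Example data shipped with the package; the simulation below
# does not use these genotypes.
data(mouse_msats)
genotypes <- convert_raw(mouse_msats)

# Simulate r2(h, f) for marker subsets of 4, 6, 8 and 10 loci
sim_r2 <- simulate_r2_hf(n_ind = 10, H_nonInb = 0.5, meanF = 0.2, varF = 0.03,
                      subsets = c(4,6,8,10), reps = 100,
                      type = "msats")
plot(sim_r2)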
