
dartR.sim (version 0.71)

gl.report.nall: Report allelic retention and simulate a rarefaction curve

Description

This function reports per-population allele counts and simulates a rarefaction-style curve showing the proportion of the dataset’s total allelic diversity captured as progressively more individuals are sampled.

Usage

gl.report.nall(
  x,
  simlevels = seq(1, nInd(x), 5),
  reps = 10,
  plot.colors.pop = gl.colors("dis"),
  ncores = 2,
  plot.display = TRUE,
  plot.theme = theme_dartR(),
  plot.dir = NULL,
  plot.file = NULL,
  verbose = NULL
)

Value

A list with three elements:

  • `sim`: `data.frame` with columns `Npop` (sample size), `mnall` (mean proportion retained), `low` (minimum), and `high` (maximum) across replicates.

  • `points`: `data.frame` with observed per-population values at their actual sample sizes (columns include `popname`, `Npop`, and scaled `N.all`).

  • `p1`: a `ggplot` object showing the rarefaction curve, uncertainty ribbon, and per-population points.

Arguments

x

Name of the genlight/dartR object containing the SNP data. The object must contain no missing data, because subsampling from missing data is not possible; we therefore recommend filtering by call rate with a threshold of 1 [required].

simlevels

A vector of sample sizes at which the pooled set of individuals is subsampled [default seq(1,nInd(x),5)].

reps

Number of replicate subsamples per sample size [default 10].

plot.colors.pop

A color palette for population plots or a list with as many colors as there are populations in the dataset [default gl.colors("dis")].

ncores

Number of cores to be used for parallel processing [default 2].

plot.display

Specify if plot is to be produced [default TRUE].

plot.theme

A `ggplot2` theme object for styling the plot [default theme_dartR()].

plot.dir

Directory to save the plot RDS files [default as specified by the global working directory or tempdir()].

plot.file

Filename (minus extension) for the RDS plot file [Required for plot save].

verbose

Verbosity: 0, silent or fatal errors; 1, begin and end; 2, progress log; 3, progress and results summary; 5, full report [default 2, unless specified using gl.set.verbosity].

Author

Custodian: Bernd Gruber -- Post to https://groups.google.com/d/forum/dartr

Details

The function estimates how sampling effort affects observed allelic diversity by repeatedly subsampling individuals from the pooled set of all individuals at user-defined sample sizes (`simlevels`), drawing `reps` replicate subsamples at each size. The maximum attainable allele count is first determined by pooling all individuals into a single group; all simulation outputs and per-population observations are then normalized to this pooled maximum and expressed as a proportion of alleles retained.

For each target sample size, replicated subsamples are aggregated to yield the mean, minimum, and maximum proportions of alleles retained. A plot is produced showing (i) the mean rarefaction curve with an uncertainty ribbon (min–max across replicates) and (ii) points for each empirical population at its observed sample size and retained proportion.
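The procedure described above can be sketched in base R on simulated data. This is an illustration under stated assumptions, not the package's implementation: the genotype matrix `geno`, the helper `count_alleles()`, and the simulated allele frequencies are all hypothetical, and only the column names of `sim` follow the Value section.

```r
# Illustrative sketch only: base R with simulated data, not the dartR.sim source.
set.seed(42)

n_ind <- 30   # individuals in the pooled dataset (assumed)
n_loc <- 200  # biallelic SNP loci (assumed)

# Simulated genotypes coded 0/1/2 (copies of the alternate allele), no missing data.
freqs <- runif(n_loc, 0.02, 0.5)
geno  <- sapply(freqs, function(p) rbinom(n_ind, size = 2, prob = p))

# Hypothetical helper: number of alleles present across loci. At each locus the
# reference allele is present if any genotype is < 2, the alternate if any is > 0.
count_alleles <- function(g) {
  sum(apply(g, 2, function(x) any(x < 2) + any(x > 0)))
}

# Pooled maximum used to normalise every subsample.
max_alleles <- count_alleles(geno)

simlevels <- seq(1, n_ind, 5)
reps      <- 10

# For each sample size, draw `reps` subsamples and summarise the retained proportion.
sim <- do.call(rbind, lapply(simlevels, function(n) {
  prop <- replicate(reps, {
    idx <- sample(n_ind, n)  # n individuals without replacement
    count_alleles(geno[idx, , drop = FALSE]) / max_alleles
  })
  data.frame(Npop = n, mnall = mean(prop), low = min(prop), high = max(prop))
}))

print(sim)
```

Note that `seq(1, n_ind, 5)` stops at the last grid value not exceeding `n_ind`, so the full pooled sample size is simulated only if it happens to fall on the grid.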

How to use the output

- Assess genetic diversity and sampling sufficiency. The curve indicates how quickly allelic diversity accumulates with additional individuals, and where diminishing returns begin.

- Interpret population points relative to the curve.

  • Above the curve: population retains more allelic diversity than expected for its sample size (e.g., unusually high diversity or more private/low-frequency alleles).

  • On/within the ribbon: diversity consistent with random sampling from the pooled dataset at that size.

  • Below the curve: population retains fewer alleles than expected, suggesting reduced diversity (e.g., drift, bottleneck), uneven missingness, or data-quality issues.
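The interpretation rules above can be applied programmatically to the returned list. The sketch below uses a mock `res` object whose column names follow the Value section; the numbers are invented for illustration, and it assumes that `N.all` holds the retained proportion on the same scale as `low` and `high`. Linear interpolation of the ribbon with `approx()` is a choice made here, not something the function is documented to do.

```r
# Mock result with the documented column names; all values are invented.
res <- list(
  sim = data.frame(
    Npop  = c(1, 6, 11, 16, 21, 26),
    mnall = c(0.60, 0.85, 0.92, 0.96, 0.98, 0.99),
    low   = c(0.55, 0.82, 0.90, 0.95, 0.97, 0.985),
    high  = c(0.65, 0.88, 0.94, 0.97, 0.99, 0.995)
  ),
  points = data.frame(
    popname = c("A", "B", "C"),
    Npop    = c(6, 11, 16),
    N.all   = c(0.90, 0.91, 0.80)  # assumed to be a proportion of the pooled maximum
  )
)

pts <- res$points

# Interpolate the ribbon bounds at each population's observed sample size.
lo <- approx(res$sim$Npop, res$sim$low,  xout = pts$Npop)$y
hi <- approx(res$sim$Npop, res$sim$high, xout = pts$Npop)$y

# Classify each population relative to the min-max ribbon.
pts$position <- ifelse(pts$N.all > hi, "above",
                ifelse(pts$N.all < lo, "below", "within"))
print(pts)

# Smallest simulated sample size whose mean retention reaches 95%.
n95 <- min(res$sim$Npop[res$sim$mnall >= 0.95])
print(n95)
```

With these mock values, population A falls above the ribbon, B within it, and C below it, and the mean curve first reaches 95% retention at 16 individuals.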

Examples

# \donttest{
dummy <- gl.report.nall(possums.gl[c(1:5, 31:35), ], simlevels = seq(1, 10, 3),
                        reps = 5, ncores = 2)
# }
