kindisperse (version 0.10.2)

sample_kindist: Subsample and filter a KinPairSimulation or KinPairData Object

Description

This function takes a pre-existing KinPairSimulation or KinPairData Object with distance and coordinate data and filters it to simulate various in-field sampling schemes.

Usage

sample_kindist(
  kindist,
  upper = NULL,
  lower = NULL,
  spacing = NULL,
  n = NULL,
  dims = NULL
)

Arguments

kindist

KinPairSimulation or KinPairData class object

upper

numeric - upper cutoff for kin pair distances

lower

numeric - lower cutoff for kin pair distances

spacing

numeric - spacing between traps (location-independent)

n

numeric - number of kin pairs to keep after filtering (if possible)

dims

dimensions to sample within (works with the KinPairSimulation spatial & dimension information). Either num (defining a square) or c(num1, num2) (defining a rectangle).

Value

returns an object of class KinPairData or KinPairSimulation containing simulation and filtering details and a filtered dataset of dispersed individuals.

Details

This function enables the testing of the impact of some basic sampling constraints that might be encountered in study design or implementation on the effectiveness of the kindisperse estimation of intergenerational dispersal. It is typically paired with a simulation function such as simulate_kindist_composite to generate a 'pure' dataset, then an estimation function such as axpermute to examine the impact of filter settings on the 'detected' value of dispersal sigma. The filter parameters upper, lower, & spacing all work on the vector of (direction-independent) distances, & the parameter n enables the random subsampling of n kin dyads. The parameter dims requires 2D location information for each individual, meaning it can ordinarily only be used with the KinPairSimulation object (not KinPairData). All filter parameters are stackable.

The upper parameter implements a cutoff for the maximum distance allowable in the dataset. If set to e.g. 100m, all kin dyads separated by a distance greater than 100m will be excluded from the filtered dataset. Note that this is a geometry-independent metric; it is naive to the edge effects of an actual sample site. The lower parameter implements a cutoff for the minimum distance allowable in the dataset. It operates in the same manner as the upper parameter (in this case, removing dyads separated by less than the distance threshold).
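The two cutoffs can be pictured with a minimal base R sketch. It does not use the package's objects: the distance vector and threshold values below are made up for illustration, and treating a distance exactly equal to a cutoff as retained is an assumption of this sketch.

```r
# Hypothetical vector of kin-pair distances in metres (not package output).
distances <- c(12, 48, 50, 75, 130, 200, 240)

upper <- 200  # maximum distance allowed
lower <- 50   # minimum distance allowed

# Drop dyads further apart than `upper` or closer together than `lower`.
# Keeping values exactly on a cutoff is an assumption of this sketch.
kept <- distances[distances >= lower & distances <= upper]
kept
```

Because the filter only looks at the distance vector, it carries no information about where in a site the dyads were found.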

The spacing parameter as currently implemented takes all distances & alters them to lie at the midpoint of a bin with width set by this parameter. So if spacing is set to 10m, all kin pairs with distances between 0 & 10m will have their distances reset to 5m, all between 10 & 20m will be set to 15m, etc. (quantizing the data). Note that once again this is a geometry-independent action: these bin widths & 'trap spacing' are not spatially related to each other as they would be in a sample site, and there is no simulated dropout of kin pairs too far from a trap. There is also no geometry-dependent profiling of the possible frequency of recaptures across each distance category (this will be implemented in a future version). This parameter leaves 2D spatial information intact.
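The binning described above can be sketched in base R as follows. This is an illustration of midpoint quantization rather than the package's own code; the distance values are made up, and how a distance lying exactly on a bin boundary is assigned is an assumption of the sketch.

```r
# Hypothetical kin-pair distances in metres (not package output).
distances <- c(0.5, 4, 9.9, 12, 19, 27.3)
spacing <- 10  # bin width, i.e. the notional trap spacing

# Find the bin each distance falls in, then move it to that bin's midpoint.
# A distance exactly on a boundary goes to the upper bin here (an assumption).
binned <- (floor(distances / spacing) + 0.5) * spacing
binned
```

Distances between 0 and 10 become 5, those between 10 and 20 become 15, and so on, matching the worked example in the text.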

The dims parameter defines the dimensions of a rectangle within which both individuals of a kin dyad will need to lie to be included in the filtered dataset. This measure (which excludes e.g. long-distance dispersal into & out of the study site) is geometry-dependent, unlike the upper parameter. This enables the testing of (rectangular) site geometries potentially corresponding to an actual site (two-dimensional estimates of dispersal such as those made by kindisperse become unreliable as edge effects significantly reduce the size of either one or both dimensions with respect to the real underlying dispersal sigma). These site geometries can be entered in a few ways: (a) a single numeric value, which will be interpreted as the length of the side of a square; (b) a numeric vector of length two, which will be interpreted as the length & width of the sample site; (c) either of the above passed to the elongate function, which takes the rectangular site dimensions and alters their aspect ratio (ratio of length to width) while preserving the underlying area the study site covers. The implementation of this filtering step permutes the absolute positions of all dyads so that at least one member of the dyad is in the initial site rectangle, while preserving their relative positions (and angles) with respect to each other. This means that following this step, the xy coordinate positions of each individual will not match those contained in the previous round. It also means that the repeated calling of this function will result in a steady reduction in retained kin dyads due to edge effects.
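The three ways of describing a site, and the rule that both members of a dyad must lie inside it, can be sketched in base R. The helpers site_dims, reshape_site and both_inside are hypothetical names introduced only for this illustration; in particular reshape_site is not the package's elongate function, whose arguments are not documented here, and only shows the idea of changing the aspect ratio at constant area. The coordinates are made up.

```r
# Hypothetical helper: a single number is read as the side of a square.
site_dims <- function(dims) {
  if (length(dims) == 1) c(dims, dims) else dims
}

# Hypothetical helper: change the length-to-width ratio, preserving area.
# (Same idea as the elongate function in the text, not its implementation.)
reshape_site <- function(dims, aspect) {
  area <- prod(site_dims(dims))
  c(sqrt(area * aspect), sqrt(area / aspect))
}

# A dyad is retained only if both of its members lie inside the rectangle.
both_inside <- function(x1, y1, x2, y2, dims) {
  d <- site_dims(dims)
  in_rect <- function(x, y) x >= 0 & x <= d[1] & y >= 0 & y <= d[2]
  in_rect(x1, y1) & in_rect(x2, y2)
}

square    <- site_dims(100)                # 100 x 100 site
long_site <- reshape_site(100, aspect = 4) # 200 x 50 site, same area

# Three made-up dyads whose first member is already inside the square.
x1 <- c(10, 50, 90);  y1 <- c(10, 50, 90)
x2 <- c(40, 50, 130); y2 <- c(20, 120, 95)
keep <- both_inside(x1, y1, x2, y2, dims = 100)
keep
```

Only the first dyad survives: the second member of the other two falls outside the square, which is the edge effect the dims filter is meant to expose.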

The n parameter randomly samples n pairs from the dataset. It is implemented after all other filtering has taken place, so will only sample surviving pairs. A typical strategy for the use of this function in simulations would be to simulate an extremely large (e.g. one million pairs) dataset, then pass it repeatedly to this filter function, with a final sub-sampling step of 1,000 included. This enables comparisons across sampling conditions (in most cases) regardless of the amount of data filtered prior to this step.
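The final sub-sampling step amounts to drawing n pairs at random from whatever survived the earlier filters. A minimal base R sketch, using a made-up vector of surviving distances and assuming that all pairs are kept when fewer than n remain (the "if possible" case):

```r
set.seed(42)

# Hypothetical distances of the dyads that survived earlier filtering.
surviving <- rexp(5000, rate = 1 / 100)
n <- 1000

# Sample without replacement; if fewer than n pairs survive, keep them all
# (an assumption about the "if possible" behaviour).
take <- min(n, length(surviving))
subsample <- surviving[sample.int(length(surviving), take)]
length(subsample)
```

Fixing the final sample size in this way keeps estimates comparable across filter settings that discard very different amounts of data.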

As this function returns a KinPairData or KinPairSimulation object, the returned object can be passed back for filtering an arbitrary number of times, or alternatively passed to an estimation strategy.

This function can be used to test for bias in the results of a close-kin dispersal study that has been conducted. After the field sampling, kin identification, & sigma calculation steps, use the estimated sigmas as inputs into simulation functions that are then filtered for size & geometry of the actual study site (via the dims method). Then pass this filtered dataset back to the sigma-determining functions. If filtering has resulted in a substantial drop in sigma, the estimate of sigma from the study site has likely been biased by the site geometry. Note that the impact of this is dependent on the shape of the dispersal kernel: the more leptokurtic it is (dominated by long-distance dispersal), the more severe the bias will be for a particular sigma and site geometry.
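The logic of this bias check can be illustrated without the package. The base R sketch below is a stand-in, not the kindisperse workflow: it assumes a Gaussian kernel, uses a made-up sigma and site size, and replaces the package's estimators with the simple relation that, for an isotropic 2D Gaussian displacement, the mean squared distance equals twice the squared axial sigma.

```r
set.seed(123)

sigma   <- 100        # made-up "estimated" axial sigma from a study
site    <- c(200, 200) # made-up site length & width
n_pairs <- 20000

# Simulated displacement between the two members of each dyad.
dx <- rnorm(n_pairs, sd = sigma)
dy <- rnorm(n_pairs, sd = sigma)

# Stand-in estimator: for a 2D Gaussian, E[d^2] = 2 * sigma^2.
axial_sigma <- function(dx, dy) sqrt(mean(dx^2 + dy^2) / 2)

# Place the first member at random within the site, keep relative offsets,
# and retain dyads whose second member also lies inside the site.
x1 <- runif(n_pairs, 0, site[1])
y1 <- runif(n_pairs, 0, site[2])
x2 <- x1 + dx
y2 <- y1 + dy
inside <- x2 >= 0 & x2 <= site[1] & y2 >= 0 & y2 <= site[2]

sigma_full <- axial_sigma(dx, dy)                 # unfiltered estimate
sigma_site <- axial_sigma(dx[inside], dy[inside]) # estimate after filtering
c(full = sigma_full, site = sigma_site)
```

A filtered value well below the unfiltered one is the signature of geometry-induced bias described above; with a heavier-tailed kernel the gap would be expected to widen.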

Examples

# NOT RUN {
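# Simulate 100,000 parent-offspring pairs with a dispersal sigma of 100
simobject <- simulate_kindist_simple(nsims = 100000, sigma = 100, kinship = "PO")

# Keep distances from 50 to 200, bin them at a spacing of 15, then subsample 100 pairs
sample_kindist(simobject, upper = 200, lower = 50, spacing = 15, n = 100)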
# }