poolABC (version 1.0.0)

pickWindows: Randomly select blocks of a given size from several contigs

Description

Selects one random block of a given size from each of several larger contigs and obtains the indices of the SNPs contained within that block.

Usage

pickWindows(freqs, positions, range, rMajor, rMinor, coverage, window, nLoci)

Value

a list with the following elements:

freqs

a list with the allele frequencies, computed by dividing the number of minor-allele reads by the total coverage. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population.

positions

a list with the positions of each SNP. Each entry of this list is a vector corresponding to a different contig.

rMajor

a list with the number of major-allele reads. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population.

rMinor

a list with the number of minor-allele reads. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population.

coverage

a list with the total coverage. Each entry of this list corresponds to a different contig. Each entry is a matrix where each row is a different site and each column is a different population.
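
The relationship between the returned elements can be illustrated with a small base R sketch. The read counts below are made up for a single hypothetical contig with three sites and two populations; they are not output from poolABC. The example assumes that coverage is the sum of major-allele and minor-allele reads and applies the stated rule that frequencies are minor-allele reads divided by total coverage.

```r
# Hypothetical read counts for one contig: 3 sites (rows) x 2 populations (columns).
# matrix() fills column by column, so the first three values are population 1.
rMinor <- matrix(c(2, 0, 5,
                   1, 3, 4), nrow = 3, ncol = 2)
coverage <- matrix(c(20, 10, 50,
                     10, 30, 40), nrow = 3, ncol = 2)

# Assumption: coverage = major-allele reads + minor-allele reads
rMajor <- coverage - rMinor

# Allele frequencies: minor-allele reads divided by total coverage
freqs <- rMinor / coverage
```

Each of these matrices would be one entry of the corresponding list, with one entry per contig.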

Arguments

freqs

is a list containing the allelic frequencies. Each entry of that list should represent a different contig and be a matrix where each row corresponds to a different site and each column to a different population.

positions

is a list containing the position of the SNPs. Each entry should represent a different contig and be a vector containing the position of each SNP present in the contig.

range

is a list containing the range of the contig. Each entry should represent a different contig and be a vector with two entries: the first detailing the minimum position of the contig and the second the maximum position of the contig.

rMajor

is a list containing the number of major-allele reads. Each entry of that list should represent a different contig and be a matrix where each row corresponds to a different site and each column to a different population.

rMinor

is a list containing the number of minor-allele reads. Each entry of that list should represent a different contig and be a matrix where each row corresponds to a different site and each column to a different population.

coverage

is a list containing the depth of coverage. Each entry should represent a different contig and be a matrix with the sites as rows and the different populations as columns.

window

is a non-negative integer indicating the size, in base pairs, of the block of the contig to keep.

nLoci

is a non-negative integer indicating how many different contigs should be kept in the output. If each randomly selected window is treated as a different locus, this is the number of windows to select.
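
As a rough illustration of the input layout described above, the following base R sketch builds toy lists for two hypothetical contigs and two populations. All values are simulated placeholders (the contig lengths, site counts, and the Poisson and binomial draws are assumptions made only for this example), and the final call is left commented out because it requires poolABC to be installed.

```r
set.seed(1)
nPops <- 2
nSites <- c(5, 8)  # number of SNPs on each of two hypothetical contigs

# range: one c(minimum, maximum) vector per contig
range <- list(c(1, 1000), c(1, 2000))

# positions: one sorted vector of SNP positions per contig
positions <- lapply(seq_along(nSites), function(i) {
  sort(sample(range[[i]][1]:range[[i]][2], nSites[i]))
})

# coverage: one sites-by-populations matrix per contig (simulated depths)
coverage <- lapply(nSites, function(n) {
  matrix(rpois(n * nPops, 50) + 1, nrow = n, ncol = nPops)
})

# rMinor: simulated minor-allele reads with the same shape as coverage
rMinor <- lapply(coverage, function(cv) {
  matrix(rbinom(length(cv), cv, 0.2), nrow = nrow(cv))
})

# rMajor and freqs derived element-wise for each contig
rMajor <- Map(`-`, coverage, rMinor)
freqs <- Map(`/`, rMinor, coverage)

# With poolABC installed, these lists could then be passed on, for example:
# poolABC::pickWindows(freqs, positions, range, rMajor, rMinor, coverage,
#                      window = 100, nLoci = 2)
```

The point of the sketch is the shape of the inputs: every list has one entry per contig, and within a contig the number of matrix rows matches the number of SNP positions.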

Details

This function starts by removing the edges of the contigs. The size of the removed portion is equal to the size of the block to keep. Then, a SNP is randomly picked from the vector of all possible SNP positions. An initial block is constructed by selecting all SNPs within a distance of window base pairs, both upstream and downstream, of that SNP. Finally, SNPs are removed from both ends of that initial block until all remaining SNPs are contained within a block of size window. These steps are performed for each contig in the dataset, yielding one window per contig. Note that, in the end, only nLoci windows are kept.
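
The steps described above can be sketched for a single contig in base R. This is not the poolABC source: `sketchWindow` and `contigRange` are hypothetical names, and several details are assumptions made for illustration, namely that window base pairs are removed from each edge, that the positions are sorted, and that the end to trim is chosen at random at each step.

```r
# Hypothetical helper: returns the indices of the SNPs kept on one contig.
# positions   : sorted vector of SNP positions
# contigRange : c(minimum, maximum) position of the contig
# window      : block size in base pairs
sketchWindow <- function(positions, contigRange, window) {
  # 1. Drop the edges: a focal SNP must lie at least `window` bp from each end
  candidates <- which(positions >= contigRange[1] + window &
                        positions <= contigRange[2] - window)
  if (length(candidates) == 0) return(integer(0))

  # 2. Pick one focal SNP at random (sample.int avoids sample()'s
  #    special handling of length-one vectors)
  focal <- candidates[sample.int(length(candidates), 1)]

  # 3. Initial block: all SNPs within `window` bp upstream or downstream
  block <- which(abs(positions - positions[focal]) <= window)

  # 4. Trim SNPs from the ends until the block spans at most `window` bp
  #    (assumption: the end to trim is chosen at random)
  while (max(positions[block]) - min(positions[block]) > window) {
    if (runif(1) < 0.5) {
      block <- block[-1]
    } else {
      block <- block[-length(block)]
    }
  }
  block
}

set.seed(42)
positions <- c(50, 120, 180, 260, 300, 410, 480, 560, 700, 950)
idx <- sketchWindow(positions, contigRange = c(1, 1000), window = 200)
positions[idx]  # positions of the SNPs retained in the window
```

Applying such a helper to every contig and then keeping nLoci of the resulting windows would mirror the overall procedure. Note that under these assumptions the focal SNP itself is not guaranteed to survive the trimming step.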