diemr (version 1.4.3)

rank2map: Convert SNP Ranks To Windows Corresponding to Mapping Distance

Description

This function estimates, for each single nucleotide polymorphism (SNP) in an ordered set, which SNP ranks fall within a window spanning a user-defined distance in positions mapped to a reference. Each window is centered on the mapped position of the SNP. Converting a SNP rank metric to a mapped position metric is useful for kernel smoothing of the diem output state along a genomic sequence.

Usage

rank2map(includedSites, ChosenSites = "all", windowSize = 1e+07, nCores = 1)

Value

A two-column matrix with the number of rows corresponding to the number of ChosenSites, indicating start and end indices of adjacent markers that are within an interval of length windowSize centered on the specific marker.

Arguments

includedSites

A character path to a file with columns CHROM and POS.

ChosenSites

A logical vector indicating which sites are to be included in the analysis. The default "all" includes every site.

windowSize

A numeric window size for metric conversion in base-pairs.

nCores

The number of cores (numeric) to use for parallelisation. Must be 1 on Windows.

Author

Natalia Martinkova

Filip Jagos 521160@mail.muni.cz

Details

Single nucleotide polymorphisms (SNPs) tend to be spread across a genome randomly. To facilitate interpretation of the diem output, the marker states should be assessed on the metric of their position along chromosomes (contigs). The windows for kernel smoothing might contain a variable number of markers. This function estimates which markers should be assessed together given their proximity on a chromosome.

Values in includedSites are in essence SNP positions in BED format with a header. The includedSites file should ideally be generated by vcf2diem to ensure congruence across all analyses.
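As a rough illustration of the layout described above, the sketch below writes and reads back a small mock file with a header and the two named columns. The tab separator, the chromosome names and the positions are assumptions made for the example. A real file produced by vcf2diem may differ in separator or contain further columns.

```r
# Mock includedSites-like file. Tab separation is an assumption borrowed from
# the BED convention; the chromosome names and positions are made up.
tmp <- tempfile(fileext = ".txt")
writeLines(c("CHROM\tPOS", "chr1\t100", "chr1\t150", "chr2\t50"), tmp)

# Read the file back and confirm that the two documented columns are present
sites <- read.table(tmp, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
stopifnot(all(c("CHROM", "POS") %in% names(sites)))
str(sites)
```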

The function reads SNP positions from the specified BED-like file and divides the genome into segments based on chromosomes. Each segment is then processed to identify genomic windows encompassing each SNP, considering the specified window size. This process is parallelized to enhance performance, and each SNP is considered within its chromosomal context to ensure accurate window placement.
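The per-chromosome window search described above can be sketched in base R. This is a minimal illustration, not the diemr implementation: the helper name window_ranks is hypothetical, positions are assumed to be sorted within each chromosome, window boundaries are assumed to be inclusive, and the returned indices are assumed to be row numbers in the full table.

```r
# Hypothetical helper, not part of diemr: for each SNP, find the first and last
# row index of SNPs on the same chromosome lying within windowSize / 2 of it.
window_ranks <- function(sites, windowSize) {
  half <- windowSize / 2
  res <- matrix(NA_integer_, nrow = nrow(sites), ncol = 2,
                dimnames = list(NULL, c("start", "end")))
  # Process each chromosome separately so windows never span two chromosomes
  for (rows in split(seq_len(nrow(sites)), sites$CHROM)) {
    pos <- sites$POS[rows]  # assumed sorted in increasing order
    # Index of the first SNP with position >= pos - half
    first <- findInterval(pos - half, pos, left.open = TRUE) + 1L
    # Index of the last SNP with position <= pos + half
    last <- findInterval(pos + half, pos)
    res[rows, "start"] <- rows[first]
    res[rows, "end"] <- rows[last]
  }
  res
}

# Made-up positions on two chromosomes
sites <- data.frame(
  CHROM = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
  POS   = c(100, 150, 400, 420, 50, 60)
)
win <- window_ranks(sites, windowSize = 100)
print(win)
```

With a window of 100 bp, the first two SNPs on chr1 share a window, the last two on chr1 share another, and the two SNPs on chr2 are grouped only with each other.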

The minimum value of windowSize is 3, but for genomic data the window size should be at least two orders of magnitude larger. A good approximation of a useful minimum window size is $(\text{genome size}) / ((\text{number of SNPs}) / 2)$. Throughout the diemr package, windowSize refers to the genomic context of the respective SNP that the user wishes to consider when smoothing over the polarized genomic states.
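The approximation above is simple arithmetic. The sketch below works it through for a hypothetical genome of 2 Gbp with one million SNPs; both numbers are made up for illustration.

```r
# Hypothetical inputs, chosen only to illustrate the formula
genome_size <- 2e9  # genome length in base pairs
n_snps <- 1e6       # number of SNPs in the dataset

# Approximate useful minimum window size: genome size / (number of SNPs / 2)
min_window <- genome_size / (n_snps / 2)
min_window
```

In this case the result is 4000 bp, which corresponds to twice the average spacing between SNPs.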

See Also

smoothPolarizedGenotypes

Examples

if (FALSE) {
  # Run this example in a working directory with write permissions
  myo <- system.file("extdata", "myotis.vcf", package = "diemr")
  vcf2diem(myo, "myo")
  rank2map("myo-includedSites.txt", windowSize = 50)
}