rsnps (version 0.2.0)

LDSearch: Search for SNPs in Linkage Disequilibrium with a set of SNPs

Description

This function queries the SNP Annotation and Proxy tool (SNAP) for SNPs in high linkage disequilibrium with a set of SNPs, and also merges in up-to-date SNP annotation information available from NCBI.

Usage

LDSearch(SNPs, dataset = "onekgpilot", panel = "CEU", RSquaredLimit = 0.8,
  distanceLimit = 500, GeneCruiser = TRUE, quiet = FALSE, ...)

ld_search(SNPs, dataset = "onekgpilot", panel = "CEU", RSquaredLimit = 0.8,
  distanceLimit = 500, GeneCruiser = TRUE, quiet = FALSE, ...)

Arguments

SNPs

A vector of SNPs (rs numbers).

dataset

The dataset to query. Must be one of:

  • rel21: HapMap Release 21

  • rel22: HapMap Release 22

  • hapmap3r2: HapMap 3 (release 2)

  • onekgpilot: 1000 Genomes Pilot 1

panel

The panel to use from the queried data set. Must be one of:

  • CEU

  • YRI

  • JPT+CHB

If you are working with hapmap3r2, you can choose the additional panels:

  • ASW

  • CHD

  • GIH

  • LWK

  • MEK

  • MKK

  • TSI

  • CEU+TSI

  • JPT+CHB+CHD
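The dataset and panel rules above can be checked locally before a query is sent. The following is a minimal sketch, not part of rsnps: `valid_panels()` and `check_panel()` are hypothetical helper names, and the panel codes are copied verbatim from the lists in this help page.

```r
# Hypothetical helpers (not part of rsnps). Panel codes are copied
# verbatim from the lists in this help page.
valid_panels <- function(dataset) {
  base <- c("CEU", "YRI", "JPT+CHB")
  hapmap3_extra <- c("ASW", "CHD", "GIH", "LWK", "MEK", "MKK", "TSI",
                     "CEU+TSI", "JPT+CHB+CHD")
  # Only hapmap3r2 is documented as offering the additional panels.
  if (identical(dataset, "hapmap3r2")) c(base, hapmap3_extra) else base
}

check_panel <- function(dataset, panel) {
  datasets <- c("rel21", "rel22", "hapmap3r2", "onekgpilot")
  if (!dataset %in% datasets) {
    stop("unknown dataset: ", dataset)
  }
  if (!panel %in% valid_panels(dataset)) {
    stop("panel '", panel, "' is not listed for dataset '", dataset, "'")
  }
  invisible(TRUE)
}

# TSI is listed for hapmap3r2, so this passes silently.
check_panel("hapmap3r2", "TSI")

# TSI is not listed for onekgpilot, so the check raises an error,
# which is caught here and turned into FALSE.
ok <- tryCatch(check_panel("onekgpilot", "TSI"), error = function(e) FALSE)
```

Failing early in this way avoids a round trip to the remote service for a combination the documentation already rules out.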

RSquaredLimit

The R Squared limit to specify as a filter for returned SNPs; that is, only SNP pairs with R-squared greater than RSquaredLimit will be returned.

distanceLimit

The distance (in kilobases) upstream and downstream of each queried SNP within which to search for SNPs in LD.

GeneCruiser

boolean; if TRUE, the function attempts to retrieve gene information through GeneCruiser for each SNP. This can slow the query down substantially.

quiet

boolean; if TRUE, progress updates are suppressed; if FALSE (the default), they are written to the console.

...

Curl options passed on to GET

Value

A list of data frames, one for each SNP queried, containing information about the SNPs found to be in LD with that SNP. A description of the columns follows:

  • Proxy: The proxy SNP matched to the queried SNP.

  • SNP: The SNP queried.

  • Distance: The distance, in base pairs, between the queried SNP and the proxy SNP. This distance is calculated according to up-to-date position information returned from NCBI.

  • RSquared: The measure of LD between the SNP and the proxy.

  • DPrime: Another measure of LD between the SNP and the proxy.

  • GeneVariant: Present if GeneCruiser is TRUE. This will identify where the SNP lies relative to its 'parent' SNP.

  • GeneName: Present if GeneCruiser is TRUE. If the proxy SNP found lies within a gene, the name of that gene will be returned here. Otherwise, the field is N/A.

  • GeneDescription: Present if GeneCruiser is TRUE. If the proxy SNP lies within a gene, information about that gene (as obtained from GeneCruiser) will be available here.

  • Major: The major allele, as reported by SNAP.

  • Minor: The minor allele, as reported by SNAP.

  • MAF: The minor allele frequency corresponding to the reference panel queried, as obtained through SNAP.

  • NObserved: The number of individuals from which the MAF information is generated, for column MAF.

  • Chromosome_NCBI: The chromosome that the marker lies on.

  • Marker_NCBI: The name of the marker. If the rs ID queried has been merged, the up-to-date name of the marker is returned here, and a warning is issued.

  • Class_NCBI: The marker's 'class'. See https://www.ncbi.nlm.nih.gov/projects/SNP/snp_legend.cgi?legend=snpClass for more details.

  • Gene_NCBI: If the marker lies within a gene (either within the exon or introns of a gene), the name of that gene is returned here; otherwise, NA. Note that the gene may not be returned if the marker lies too far upstream or downstream of the particular gene of interest.

  • Alleles_NCBI: The alleles associated with the SNP if it is a SNV; otherwise, if it is an INDEL, microsatellite, or other kind of polymorphism the relevant information will be available here.

  • Major_NCBI: The major allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.

  • Minor_NCBI: The minor allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.

  • MAF_NCBI: The minor allele frequency of the SNP, given it is an SNV. This is drawn from the current global reference population used by NCBI.

  • BP_NCBI: The chromosomal position, in base pairs, of the marker, as aligned with the current genome used by dbSNP.
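Because the return value is a list with one data frame per queried SNP, a common follow-up step is to stack the pieces and filter on the LD columns. The sketch below works on a mock object so that it runs without network access. The proxy IDs and all numbers are made up for illustration, only a few of the documented columns are included, and naming the list elements by queried SNP is an assumption.

```r
# Mock result shaped like the documented return value: a list with one
# data frame per queried SNP. Proxy IDs and numbers are invented, and
# the list names are an assumption made for this sketch.
res <- list(
  rs420358 = data.frame(
    Proxy = c("rsA", "rsB"),
    SNP = "rs420358",
    Distance = c(1200, 48000),
    RSquared = c(0.95, 0.82),
    DPrime = c(1.00, 0.93),
    stringsAsFactors = FALSE
  ),
  rs2836443 = data.frame(
    Proxy = "rsC",
    SNP = "rs2836443",
    Distance = 350,
    RSquared = 0.88,
    DPrime = 0.97,
    stringsAsFactors = FALSE
  )
)

# Stack the per-SNP data frames into a single table.
all_ld <- do.call(rbind, res)
rownames(all_ld) <- NULL

# Keep proxies with R-squared above 0.9 that lie within 10 kb.
# Distance is documented in base pairs, so the kilobase window is
# converted first. abs() guards against signed distances, which is an
# assumption about the data rather than documented behaviour.
max_kb <- 10
strong <- all_ld[all_ld$RSquared > 0.9 &
                   abs(all_ld$Distance) <= max_kb * 1000, ]
```

Note the unit mismatch this makes explicit: `distanceLimit` is given in kilobases, while the `Distance` column is reported in base pairs.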

Details

For more details, please see http://archive.broadinstitute.org/mpg/snap/ldsearch.php.

Information on the HapMap populations: https://catalog.coriell.org/0/Sections/Collections/NHGRI/hapmap.aspx?PgId=266&coll=HG

Information on the 1000 Genomes populations: http://www.internationalgenome.org/category/population

Examples

# NOT RUN {
ld_search("rs420358")
ld_search('rs2836443')
ld_search('rs113196607')
# }
