rsnps (version 0.1.6)

LDSearch: Search for SNPs in Linkage Disequilibrium with a set of SNPs

Description

This function queries the SNP Annotation and Proxy tool (SNAP) for SNPs in high linkage disequilibrium with a set of SNPs, and also merges in up-to-date SNP annotation information available from NCBI.

Usage

LDSearch(SNPs, dataset = "onekgpilot", panel = "CEU", RSquaredLimit = 0.8, distanceLimit = 500, GeneCruiser = TRUE, quiet = FALSE)

Arguments

SNPs
A vector of SNPs (rs numbers).
dataset
The dataset to query. Must be one of:
  • rel21: HapMap Release 21
  • rel22: HapMap Release 22
  • hapmap3r2: HapMap 3 (release 2)
  • onekgpilot: 1000 Genomes Pilot 1
panel
The panel to use from the queried data set. Must be one of:
  • CEU
  • YRI
  • JPT+CHB

If you are working with hapmap3r2, you can choose the additional panels:

  • ASW
  • CHD
  • GIH
  • LWK
  • MEX
  • MKK
  • TSI
  • CEU+TSI
  • JPT+CHB+CHD

RSquaredLimit
The R-squared threshold used to filter the returned SNPs; only SNP pairs with R-squared greater than RSquaredLimit are returned.
distanceLimit
The distance (in kilobases) upstream and downstream of each queried SNP within which to search for SNPs in LD with it.
GeneCruiser
boolean; if TRUE, gene information is requested from GeneCruiser for each SNP. This can slow the query down substantially.
quiet
boolean; if TRUE, progress updates are suppressed; if FALSE (the default), they are written to the console.
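
The dataset and panel constraints described above can be checked before a query is sent. The following is a minimal sketch and not part of rsnps: check_panel() is a hypothetical helper, and the panel codes are taken from the lists above. The HapMap 3 Mexican-ancestry panel is written as MEX, its standard HapMap population code; whether SNAP accepts exactly that spelling is an assumption.

```r
# Hypothetical helper (not part of rsnps): validate a dataset/panel pair
# against the lists in the Arguments section before calling LDSearch().
datasets <- c("rel21", "rel22", "hapmap3r2", "onekgpilot")
base_panels <- c("CEU", "YRI", "JPT+CHB")
# Extra panels documented for hapmap3r2 only ("MEX" spelling is an assumption).
hapmap3_extra <- c("ASW", "CHD", "GIH", "LWK", "MEX", "MKK", "TSI",
                   "CEU+TSI", "JPT+CHB+CHD")

check_panel <- function(dataset = "onekgpilot", panel = "CEU") {
  if (!dataset %in% datasets) {
    stop("unknown dataset: ", dataset)
  }
  allowed <- if (dataset == "hapmap3r2") {
    c(base_panels, hapmap3_extra)
  } else {
    base_panels
  }
  if (!panel %in% allowed) {
    stop("panel '", panel, "' is not available for dataset '", dataset, "'")
  }
  invisible(TRUE)
}

# A HapMap 3 panel is accepted together with hapmap3r2 ...
check_panel("hapmap3r2", "TSI")

# ... but rejected for the default 1000 Genomes pilot dataset.
ok <- tryCatch(check_panel("onekgpilot", "TSI"), error = function(e) FALSE)
```

A check like this fails early with a readable message; the validated values would then be passed on as LDSearch(SNPs, dataset = ..., panel = ...).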

Value

A list of data frames, one for each SNP queried, containing information about the SNPs found to be in LD with that SNP. A description of the columns follows:
  • Proxy: The proxy SNP matched to the queried SNP.
  • SNP: The SNP queried.
  • Distance: The distance, in base pairs, between the queried SNP and the proxy SNP. This distance is calculated according to up-to-date position information returned from NCBI.
  • RSquared: The measure of LD between the SNP and the proxy.
  • DPrime: Another measure of LD between the SNP and the proxy.
  • GeneVariant: Present if GeneCruiser is TRUE. This will identify where the SNP lies relative to its 'parent' SNP.
  • GeneName: Present if GeneCruiser is TRUE. If the proxy SNP found lies within a gene, the name of that gene will be returned here. Otherwise, the field is N/A.
  • GeneDescription: Present if GeneCruiser is TRUE. If the proxy SNP lies within a gene, information about that gene (as obtained from GeneCruiser) will be available here.
  • Major: The major allele, as reported by SNAP.
  • Minor: The minor allele, as reported by SNAP.
  • MAF: The minor allele frequency corresponding to the reference panel queried, as obtained through SNAP.
  • NObserved: The number of individuals from which the value in column MAF was computed.
  • Chromosome_NCBI: The chromosome that the marker lies on.
  • Marker_NCBI: The name of the marker. If the rs ID queried has been merged, the up-to-date name of the marker is returned here, and a warning is issued.
  • Class_NCBI: The marker's 'class'. See http://www.ncbi.nlm.nih.gov/projects/SNP/snp_legend.cgi?legend=snpClass for more details.
  • Gene_NCBI: If the marker lies within a gene (in either an exon or an intron), the name of that gene is returned here; otherwise, NA. Note that the gene may not be returned if the marker lies too far upstream or downstream of the particular gene of interest.
  • Alleles_NCBI: The alleles associated with the SNP if it is an SNV; if it is an INDEL, microsatellite, or other kind of polymorphism, the relevant information is given here instead.
  • Major_NCBI: The major allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.
  • Minor_NCBI: The minor allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.
  • MAF_NCBI: The minor allele frequency of the SNP, given it is an SNV. This is drawn from the current global reference population used by NCBI.
  • BP_NCBI: The chromosomal position, in base pairs, of the marker, as aligned with the current genome used by dbSNP.
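
Because the return value is a list of data frames, results for several SNPs are often easier to handle once they are combined. The sketch below does not call LDSearch; it uses small mock data frames that mimic only the Proxy, SNP, Distance, RSquared and DPrime columns. The proxy names and all numeric values are invented for illustration and are not real SNAP output. It stacks the list into one data frame, converts Distance from base pairs to kilobases so it is comparable with distanceLimit, and keeps the proxies above a chosen R-squared cutoff.

```r
# Mock result shaped like the documented return value: a list of data frames,
# one per queried SNP. Proxy names and all values are invented placeholders.
res <- list(
  rs420358 = data.frame(
    Proxy = c("rsAAA", "rsBBB"),
    SNP = "rs420358",
    Distance = c(1200, 48000),   # base pairs
    RSquared = c(0.95, 0.82),
    DPrime = c(1.00, 0.93),
    stringsAsFactors = FALSE
  ),
  rs2836443 = data.frame(
    Proxy = "rsCCC",
    SNP = "rs2836443",
    Distance = 250000,           # base pairs
    RSquared = 0.88,
    DPrime = 0.97,
    stringsAsFactors = FALSE
  )
)

# Stack the per-SNP data frames into a single table.
all_ld <- do.call(rbind, res)
rownames(all_ld) <- NULL

# Distance is reported in base pairs, distanceLimit is given in kilobases.
all_ld$DistanceKb <- all_ld$Distance / 1000

# Keep only proxies in stronger LD than a chosen cutoff (0.85 is arbitrary).
strong <- all_ld[all_ld$RSquared > 0.85, ]

# Number of retained proxies per queried SNP.
proxy_counts <- table(strong$SNP)
```

The same steps should apply to a real result as long as every data frame in the list has the same columns, which is what do.call(rbind, ...) requires.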

Details

For more details, please see http://www.broadinstitute.org/mpg/snap/ldsearch.php.

Information on the HapMap populations: http://ccr.coriell.org/Sections/Collections/NHGRI/hapmap.aspx?PgId=266&coll=HG

Information on the 1000 Genomes populations: http://www.1000genomes.org/category/frequently-asked-questions/population

Examples

## Not run:
library(rsnps)
LDSearch("rs420358")
LDSearch("rs2836443")
LDSearch("rs113196607")
## End(Not run)
