This function queries the SNP Annotation and Proxy tool (SNAP) for SNPs in high linkage disequilibrium with a set of SNPs, and also merges in up-to-date SNP annotation information available from NCBI.
LDSearch(SNPs, dataset = "onekgpilot", panel = "CEU", RSquaredLimit = 0.8,
  distanceLimit = 500, GeneCruiser = TRUE, quiet = FALSE, ...)

ld_search(SNPs, dataset = "onekgpilot", panel = "CEU",
  RSquaredLimit = 0.8, distanceLimit = 500, GeneCruiser = TRUE,
  quiet = FALSE, ...)
SNPs:
A vector of SNPs (rs numbers).
dataset:
The dataset to query. Must be one of:
rel21:
HapMap Release 21
rel22:
HapMap Release 22
hapmap3r2:
HapMap 3 (release 2)
onekgpilot:
1000 Genomes Pilot 1
panel:
The panel to use from the queried dataset. Must be one of:
CEU
YRI
JPT+CHB
If you are working with hapmap3r2, you can choose the additional panels:
ASW
CHD
GIH
LWK
MEX
MKK
TSI
CEU+TSI
JPT+CHB+CHD
RSquaredLimit:
The R-squared limit used to filter the returned SNPs; that is, only SNP pairs with R-squared greater than RSquaredLimit will be returned.
distanceLimit:
The distance (in kilobases) upstream and downstream of each queried SNP within which to search for SNPs in LD.
GeneCruiser:
boolean; if TRUE, we attempt to get gene info through GeneCruiser for each SNP. This can slow the query down substantially.
quiet:
boolean; if FALSE (the default), progress updates are written to the console.
...:
Curl options passed on to GET.
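The two numeric filters use different units from the result table: distanceLimit is given in kilobases, while the Distance column of the result is in base pairs. The following is a minimal sketch of the filter semantics only, applied locally to a hand-made table. It does not reproduce how SNAP filters on the server. The data frame `pairs` and all values in it are invented for illustration, and treating the window boundary as inclusive and the distance as unsigned are assumptions.

```r
# Hypothetical candidate SNP pairs; identifiers and numbers are invented.
pairs <- data.frame(
  SNP      = c("rsQ", "rsQ", "rsQ", "rsQ"),
  Proxy    = c("rsA", "rsB", "rsC", "rsD"),
  Distance = c(1200, 45000, 498000, 620000),  # base pairs
  RSquared = c(0.95, 0.80, 0.91, 0.99),
  stringsAsFactors = FALSE
)

RSquaredLimit <- 0.8   # same default as ld_search()
distanceLimit <- 500   # kilobases, same default as ld_search()

# Keep pairs strictly above the R-squared limit and inside the window.
# Distance is in base pairs, so the kilobase limit is multiplied by 1000.
# Assumption: the window boundary itself counts as inside.
keep <- pairs$RSquared > RSquaredLimit &
  abs(pairs$Distance) <= distanceLimit * 1000
filtered <- pairs[keep, ]
print(filtered)
```

In this sketch the pair with R-squared exactly equal to the limit is dropped, because the documented condition is "greater than", and the pair at 620 kb falls outside the 500 kb window.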
A list of data frames, one for each SNP queried, containing information about the SNPs found to be in LD with that SNP. A description of the columns follows:
Proxy:
The proxy SNP matched to the queried SNP.
SNP:
The SNP queried.
Distance:
The distance, in base pairs, between the queried SNP
and the proxy SNP. This distance is calculated according to up-to-date position
information returned from NCBI.
RSquared:
The measure of LD between the SNP and the proxy.
DPrime:
Another measure of LD between the SNP and the proxy.
GeneVariant:
Present if GeneCruiser is TRUE. This will identify where the SNP lies relative to its 'parent' SNP.
GeneName:
Present if GeneCruiser is TRUE. If the proxy SNP found lies within a gene, the name of that gene will be returned here. Otherwise, the field is N/A.
GeneDescription:
Present if GeneCruiser is TRUE. If the proxy SNP lies within a gene, information about that gene (as obtained from GeneCruiser) will be available here.
Major:
The major allele, as reported by SNAP.
Minor:
The minor allele, as reported by SNAP.
MAF:
The minor allele frequency corresponding to the reference
panel queried, as obtained through SNAP.
NObserved:
The number of individuals from which the value in column MAF was computed.
Chromosome_NCBI:
The chromosome that the marker lies on.
Marker_NCBI:
The name of the marker. If the rs ID queried
has been merged, the up-to-date name of the marker is returned here, and
a warning is issued.
Class_NCBI:
The marker's 'class'. See
https://www.ncbi.nlm.nih.gov/projects/SNP/snp_legend.cgi?legend=snpClass
for more details.
Gene_NCBI:
If the marker lies within a gene (within its exons or introns), the name of that gene is returned here; otherwise, NA. Note that the gene may not be returned if the marker lies too far upstream or downstream of the particular gene of interest.
Alleles_NCBI:
The alleles associated with the SNP if it is an SNV; otherwise, if it is an INDEL, microsatellite, or other kind of polymorphism, the relevant information will be available here.
Major_NCBI:
The major allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.
Minor_NCBI:
The minor allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.
MAF_NCBI:
The minor allele frequency of the SNP, given it is an SNV.
This is drawn from the current global reference population used by NCBI.
BP_NCBI:
The chromosomal position, in base pairs, of the marker,
as aligned with the current genome used by dbSNP.
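Because the return value is a list with one data frame per queried SNP, it is often convenient to stack the pieces into a single table or to pick one proxy per query. The sketch below shows one way to do that in base R. The object `res` is a hand-made stand-in for the return value, with invented identifiers and numbers and only a few of the columns described above; a real result would carry the full set of columns.

```r
# Hypothetical stand-in for the return value: one data frame per queried SNP.
# All identifiers and numbers are invented for illustration.
res <- list(
  data.frame(
    Proxy = c("rsA", "rsB"), SNP = "rsQ1",
    Distance = c(1500, 32000), RSquared = c(0.85, 0.97),
    stringsAsFactors = FALSE
  ),
  data.frame(
    Proxy = c("rsC", "rsD", "rsE"), SNP = "rsQ2",
    Distance = c(400, 9000, 120000), RSquared = c(1.00, 0.92, 0.81),
    stringsAsFactors = FALSE
  )
)

# Stack the per-SNP data frames into one long table.
all_pairs <- do.call(rbind, res)
rownames(all_pairs) <- NULL

# For each queried SNP, keep the proxy with the highest R-squared.
best <- do.call(rbind, lapply(res, function(df) df[which.max(df$RSquared), ]))
rownames(best) <- NULL
print(best[, c("SNP", "Proxy", "RSquared")])
```

Stacking with `rbind` works here because every data frame has the same columns; `which.max` returns the first row when several proxies tie on R-squared.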
For more details, please see http://archive.broadinstitute.org/mpg/snap/ldsearch.php.
Information on the HapMap populations: https://catalog.coriell.org/0/Sections/Collections/NHGRI/hapmap.aspx?PgId=266&coll=HG
Information on the 1000 Genomes populations: http://www.internationalgenome.org/category/population
# NOT RUN {
ld_search("rs420358")
ld_search('rs2836443')
ld_search('rs113196607')
# }