rsnps (version 0.1.6)

NCBI_snp_query: Query NCBI's dbSNP for information on a set of SNPs

Description

This function queries NCBI's dbSNP for information on the submitted vector of SNPs, relative to the latest dbSNP build and the latest reference genome.

Usage

NCBI_snp_query(SNPs, ...)

Arguments

SNPs
A vector of SNPs (rs numbers).
...
Further named parameters passed on to config to debug curl. See examples.

Value

A dataframe with columns:
  • Query: The rs ID that was queried.
  • Chromosome: The chromosome that the marker lies on.
  • Marker: The name of the marker. If the rs ID queried has been merged, the up-to-date name of the marker is returned here, and a warning is issued.
  • Class: The marker's 'class'. See http://www.ncbi.nlm.nih.gov/projects/SNP/snp_legend.cgi?legend=snpClass for more details.
  • Gene: If the marker lies within a gene (either within the exon or introns of a gene), the name of that gene is returned here; otherwise, NA. Note that the gene may not be returned if the marker lies too far upstream or downstream of the particular gene of interest.
  • Alleles: The alleles associated with the SNP if it is an SNV; otherwise, if it is an INDEL, microsatellite, or other kind of polymorphism, the relevant information will be available here.
  • Major: The major allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.
  • Minor: The minor allele of the SNP, on the forward strand, given it is an SNV; otherwise, NA.
  • MAF: The minor allele frequency of the SNP, given it is an SNV. This is drawn from the current global reference population used by NCBI.
  • BP: The chromosomal position, in base pairs, of the marker, as aligned with the current genome used by dbSNP.
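
As described for the Marker column, a merged rs ID comes back under its up-to-date name, so rows where Query and Marker differ point to merged SNPs. A minimal sketch of picking those rows out of the returned data frame follows. The helper name merged_snps is hypothetical and not part of rsnps, and the toy data frame uses made-up placeholder IDs rather than real dbSNP output.

```r
# Hypothetical helper (not part of rsnps): keep the rows whose queried rs ID
# differs from the marker name that was returned.
merged_snps <- function(res) {
  keep <- !is.na(res$Marker) & res$Query != res$Marker
  res[keep, c("Query", "Marker"), drop = FALSE]
}

# Made-up placeholder rows standing in for a result; not real dbSNP output.
toy <- data.frame(
  Query  = c("rs0001", "rs0002"),
  Marker = c("rs0001", "rs0009"),
  stringsAsFactors = FALSE
)

merged_snps(toy)
```

With a real result, the same call would be merged_snps(NCBI_snp_query(SNPs)), assuming the columns are named as listed above.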

Details

Note that the number of SNPs you can pass in a single request is limited because URLs can only be so long. Around 600 is likely the maximum, though it may be somewhat more. Break your vector of SNP IDs into pieces of 600 or fewer and make repeated requests to get all the data.
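
The splitting described above might be sketched as follows. The helpers chunk_snps and query_in_chunks are hypothetical names, not functions exported by rsnps. The default size of 600 is taken from the note above, and the sketch assumes every request returns a data frame with the same columns.

```r
# Hypothetical helper (not part of rsnps): split a vector of rs IDs into
# pieces of at most `size` elements.
chunk_snps <- function(snps, size = 600) {
  if (length(snps) == 0) return(list())
  unname(split(snps, ceiling(seq_along(snps) / size)))
}

# Hypothetical wrapper: run one request per piece and stack the results.
# `query_fun` is assumed to behave like NCBI_snp_query: it takes a vector
# of rs IDs and returns a data frame with a fixed set of columns.
query_in_chunks <- function(snps, query_fun, size = 600, ...) {
  pieces <- chunk_snps(snps, size)
  results <- lapply(pieces, function(piece) query_fun(piece, ...))
  do.call(rbind, results)
}

# Usage (requires network access and the rsnps package, so left commented):
# library("rsnps")
# res <- query_in_chunks(my_snps, NCBI_snp_query)  # my_snps: your own vector
```

Passing the query function as an argument keeps the wrapper independent of rsnps, so it can be tried offline with a stand-in function before making real requests.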

References

http://www.ncbi.nlm.nih.gov/projects/SNP/

Examples

## Not run: 
# ## an example with merged SNPs, non-SNV SNPs, regular SNPs,
# ## SNPs not found, and a microsatellite
# SNPs <- c("rs332", "rs420358", "rs1837253", "rs1209415715", "rs111068718")
# NCBI_snp_query(SNPs)
# NCBI_snp_query("123456") ## invalid: must prefix with 'rs'
# NCBI_snp_query("rs420358")
# NCBI_snp_query("rs332") # warning that it's merged into another, try that
# NCBI_snp_query("rs121909001")
# NCBI_snp_query("rs1837253")
# NCBI_snp_query("rs1209415715") # warning that no data available, returns 0 length data.frame
# NCBI_snp_query("rs111068718") # warning that chromosomal information may be unmapped
# 
# NCBI_snp_query(SNPs='rs9970807')$BP
# 
# # Curl debugging
# NCBI_snp_query("rs121909001")
# library("httr")
# NCBI_snp_query("rs121909001", config=verbose())
# snps <- c("rs332", "rs420358", "rs1837253", "rs1209415715", "rs111068718")
# NCBI_snp_query(snps, config=progress())
# ## End(Not run)
