Learn R Programming

HGNChelper (version 0.3.4)

checkGeneSymbols: function to identify outdated or Excel-mogrified gene symbols

Description

This function identifies gene symbols which are outdated or may have been mogrified by Excel or other spreadsheet programs. If output is assigned to a variable, it returns a data.frame of the same number of rows as the input, with a second column indicating whether the symbols are valid and a third column with a corrected gene list.

Usage

checkGeneSymbols(x, unmapped.as.na=TRUE, hgnc.table=NULL)

Arguments

x
Vector of gene symbols to check for mogrified or outdated values
unmapped.as.na
If TRUE, unmapped symbols will appear as NA in the Suggested.Symbol column. If FALSE, the original unmapped symbol will be kept when no correction can be found.
hgnc.table
If hgnc.table is a data.frame with colnames(hgnc.table) identical to c("Symbol", "Approved.Symbol"), it is used to correct gene symbols in x. Otherwise, the default table data("hgnc.table", package="HGNChelper") is used. The function used for creating this the hgnc.table dataframe can be found in the inst/hgncLookup.R file, in the source of this package. By default the hgnc.table dataframe shipped with this package is used (see ?hgnc.table)

Value

with corrections possible from hgnc.table.

See Also

hgnc.table

Examples

Run this code

x = c("FN1", "TP53", "UNKNOWNGENE","7-Sep", "9/7", "1-Mar", "Oct4", "4-Oct",
      "OCT4-PG4", "C19ORF71", "C19orf71")

res <- checkGeneSymbols(x)
res

if (interactive()){
    ##Run checkGeneSymbols with a brand-new map downloaded from HGNC:
    source(system.file("hgncLookup.R", package = "HGNChelper"))
    ##You should save this if you are going to use it multiple times,
    ##then load it from file rather than burdening HGNC's servers.
    ## save(hgnc.table, file="hgnc.table.rda", compress="bzip2")
    ## load("hgnc.table.rda")
    res <- checkGeneSymbols(x, hgnc.table=hgnc.table)
}

Run the code above in your browser using DataLab