
dbparser

The main purpose of the dbparser package is to parse the DrugBank database, which can be downloaded in XML format from the DrugBank website. The parsed data can then be explored and analyzed as needed. The package can also save the parsed data into a database of your choice.

Installation

You can install the released version of dbparser from CRAN with:

install.packages("dbparser")

Alternatively, you can install the latest development version directly from the GitHub repository:

library(devtools)
devtools::install_github("Dainanahan/dbparser")

Example

This basic example reads the sample DrugBank record shipped with the package into memory and then parses a few of its elements:

## parse data from XML and save it to memory
dbparser::get_xml_db_rows(
  system.file("extdata", "drugbank_record.xml", package = "dbparser")
)
#> [1] TRUE

## load drugs data
drugs <- dbparser::parse_drug()

## load drug groups data
drug_groups <- dbparser::parse_drug_groups()

## load drug targets actions data
drug_targets_actions <- dbparser::parse_drug_targets_actions()
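
Before running the example, it can help to confirm that the bundled sample file and the functions used here are actually available, since function names may differ between package versions. The following is a minimal sketch under that assumption; the helper variable `has_v1_api` is introduced only for illustration.

```r
# Locate the sample record; system.file() returns "" if the package
# or the file is not available.
xml_path <- system.file("extdata", "drugbank_record.xml", package = "dbparser")

# Check that the file was found and that the installed dbparser exports
# the function names used in this README (an assumption about your install).
has_v1_api <- nzchar(xml_path) &&
  "get_xml_db_rows" %in% getNamespaceExports("dbparser")

if (has_v1_api) {
  dbparser::get_xml_db_rows(xml_path)
  drugs <- dbparser::parse_drug()
  print(dim(drugs))    # number of rows and columns in the parsed table
  print(names(drugs))  # column names available for exploration
} else {
  message("Sample file or expected functions not found; skipping the parse.")
}
```

If the check fails, the installed version may use different function names than the ones documented on this page.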

Saving into a database

The parsed data may be saved into a given database. Databases supported by dbparser include MS SQL Server, MySQL and any database supported by the DBI package. The following example saves the parsed data into an MS SQL Server database.

library(dbparser)

## open a connection to the desired database engine with an already
## created database
open_db(xml_db_name = "drugbank.xml", driver = "SQL Server",
        server = "ServerName\\SQL2016", output_database = "drugbank")

## save 'drugs' dataframe to DB
parse_drug(TRUE)

## save 'drug_groups' dataframe to DB
parse_drug_groups(TRUE)

## save 'drug_targets_actions' dataframe to DB
parse_drug_targets_actions(TRUE)

## finally close db connection
close_db()
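
For readers unfamiliar with DBI, the general idea of writing a data frame to a DBI-supported database can be sketched as a small round trip. This is not dbparser's own code: it assumes the DBI and RSQLite packages are installed, uses an in-memory SQLite database, and writes a hypothetical `drugs_demo` data frame that merely stands in for parsed output.

```r
# Hypothetical stand-in for a parsed table; IDs and values are made up.
drugs_demo <- data.frame(
  primary_key = c("D1", "D2"),
  type = c("biotech", "small molecule"),
  stringsAsFactors = FALSE
)

round_trip <- NULL
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # In-memory SQLite is an assumption made for the demo; other DBI
  # backends follow the same connect / write / read / disconnect pattern.
  con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
  DBI::dbWriteTable(con, "drugs", drugs_demo, overwrite = TRUE)
  round_trip <- DBI::dbReadTable(con, "drugs")
  DBI::dbDisconnect(con)
  print(round_trip)
} else {
  message("DBI and/or RSQLite not installed; skipping the round trip.")
}
```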

Exploring the data

The following example takes a quick look at a few aspects of the parsed data. First, we look at the proportions of biotech and small-molecule drugs in the data.

## load the packages used for exploration
library(dplyr)
library(ggplot2)

## view proportions of the different drug types (biotech vs. small molecule)
drugs %>% 
    select(type) %>% 
    ggplot(aes(x = type, fill = type)) + 
    geom_bar() + 
    guides(fill = "none")     ## removes legend for the bar colors
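
The same proportions can also be read off as numbers rather than bars. The sketch below uses base R only and a small hypothetical `drugs_demo` data frame in place of the parsed `drugs` table; the IDs and values are made up for illustration.

```r
# Hypothetical stand-in for the parsed `drugs` table; only `type` is used.
drugs_demo <- data.frame(
  primary_key = c("D1", "D2", "D3", "D4"),
  type = c("biotech", "small molecule", "small molecule", "small molecule"),
  stringsAsFactors = FALSE
)

type_counts <- table(drugs_demo$type)   # absolute count per drug type
type_props  <- prop.table(type_counts)  # share of each type; shares sum to 1

print(type_counts)
print(round(type_props, 2))
```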

Below, we view the different drug_groups in the data and how prevalent they are.

## view proportions of the different drug types for each drug group
drugs %>% 
    full_join(drug_groups, by = c('primary_key' = 'drugbank_id')) %>% 
    select(type, group) %>% 
    ggplot(aes(x = group, fill = type)) + 
    geom_bar() + 
    theme(legend.position = 'bottom') + 
    labs(x = 'Drug Group', 
         y = 'Quantity', 
         title = 'Drug Type Distribution per Drug Group', 
         caption = 'created by ggplot') + 
    coord_flip()
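
To see what the join feeds into the plot, the same step can be sketched in base R as a full join followed by a cross-tabulation. The two `*_demo` data frames are hypothetical; only the column names `primary_key`, `drugbank_id`, `type` and `group` are taken from the example above.

```r
# Hypothetical stand-ins for the parsed `drugs` and `drug_groups` tables.
drugs_demo <- data.frame(
  primary_key = c("D1", "D2", "D3"),
  type = c("biotech", "small molecule", "small molecule"),
  stringsAsFactors = FALSE
)
drug_groups_demo <- data.frame(
  drugbank_id = c("D1", "D2", "D2", "D3"),
  group = c("approved", "approved", "investigational", "experimental"),
  stringsAsFactors = FALSE
)

# all = TRUE keeps unmatched rows from both sides, i.e. a full join.
joined <- merge(drugs_demo, drug_groups_demo,
                by.x = "primary_key", by.y = "drugbank_id", all = TRUE)

# Rows are drug groups, columns are drug types.
group_by_type <- table(joined$group, joined$type)
print(group_by_type)
```

A drug that belongs to several groups appears once per group after the join, so the counts are per drug-group pair rather than per drug.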

Finally, we look at the drug_targets_actions to observe their proportions as well.

## get counts of the different target actions in the data
targetActionCounts <- 
    drug_targets_actions %>% 
    group_by(action) %>% 
    summarise(count = n()) %>% 
    arrange(desc(count))

## get bar chart of the 10 most occurring target actions in the data
## (head() avoids NA rows when fewer than 10 actions are present)
p <- 
    ggplot(head(targetActionCounts, 10), 
           aes(x = reorder(action, count), y = count, fill = action)) + 
    geom_bar(stat = 'identity') +
    labs(fill = 'action', 
         x = 'Target Action', 
         y = 'Quantity', 
         title = 'Target Actions Distribution', 
         subtitle = 'Distribution of Target Actions in the Data',
         caption = 'created by ggplot') + 
    guides(fill = "none") +   ## removes legend for the bar colors
    coord_flip()              ## switches the X and Y axes

## display plot
p
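
The count-then-rank step behind this chart can be sketched in base R as well. The `actions_demo` vector below is hypothetical and stands in for the `action` column of `drug_targets_actions`.

```r
# Hypothetical action labels; real data would come from the parsed table.
actions_demo <- c("inhibitor", "agonist", "inhibitor",
                  "antagonist", "inhibitor", "agonist")

# Count each action and order from most to least frequent.
action_counts <- sort(table(actions_demo), decreasing = TRUE)

# Keep the n most frequent actions (n = 2 for this small demo).
top_actions <- head(action_counts, 2)
print(top_actions)
```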

Monthly Downloads: 735
Version: 1.0.4
License: MIT + file LICENSE
Maintainer: Mohammed Ali
Last Published: August 28th, 2019

Functions in dbparser (1.0.4)

dbparser: A package for reading and parsing the DrugBank XML database, with the option to save it in a given database.
get_xml_db_rows: Reads the DrugBank XML database and stores it in memory.
open_mdb: Establishes a connection to a given MariaDB database.
get_drugbank_metadata: Returns the metadata of the loaded DrugBank database.
close_db: Closes the open DrugBank SQL database connection.
parse_drug: Extracts the main drug elements and returns the data as a tibble.
open_db: Establishes a connection to a given database.
get_drugbank_exported_date: Returns the export date of the loaded DrugBank database.
get_drugbank_version: Returns the version of the loaded DrugBank database.
parse_drug_affected_organisms: Extracts the drug affected organisms element and returns the data as a tibble.
parse_drug_books: Extracts the drug books element and returns the data as a tibble.
parse_drug_categories: Extracts the drug categories element and returns the data as a tibble.
parse_drug_classification: Extracts the drug classifications element and returns the data as a data frame.
parse_drug_carriers_textbooks: Extracts the drug carriers textbooks element and returns the data as a data frame.
parse_drug_atc_codes: Extracts the drug atc codes element and returns the data as a data frame.
parse_drug_enzymes_polypeptides: Extracts the drug enzymes polypeptides element and returns the data as a data frame.
parse_drug_enzymes_links: Extracts the drug enzymes links element and returns the data as a data frame.
parse_drug_carriers_polypeptides_synonyms: Extracts the drug carriers polypeptides synonyms element and returns the data as a data frame.
parse_drug_articles: Extracts the drug articles element and returns the data as a tibble.
parse_drug_enzymes_actions: Extracts the drug enzymes actions element and returns the data as a data frame.
parse_drug_enzymes_articles: Extracts the drug enzymes articles element and returns the data as a data frame.
parse_drug_carriers: Extracts the drug carriers element and returns the data as a data frame.
parse_drug_ahfs_codes: Extracts the drug ahfs codes element and returns the data as a tibble.
parse_drug_links: Extracts the drug links element and returns the data as a tibble.
parse_drug_carriers_polypeptides_go_classifiers: Extracts the drug carriers polypeptides go classifiers element and returns the data as a data frame.
parse_drug_all: Extracts all drug elements and returns the data as a list of data frames.
parse_drug_carriers_actions: Extracts the drug carriers actions element and returns the data as a data frame.
parse_drug_carriers_articles: Extracts the drug carriers articles element and returns the data as a data frame.
parse_drug_international_brands: Extracts the drug international brands element and returns the data as a tibble.
parse_drug_reactions: Extracts the drug reactions element and returns the data as a data frame.
parse_drug_interactions: Extracts the drug interactions element and returns the data as a tibble.
parse_drug_products: Extracts the drug products element and returns the data as a tibble.
parse_drug_experimental_properties: Extracts the drug experimental properties element and returns the data as a tibble.
parse_drug_enzymes_textbooks: Extracts the drug enzymes textbooks element and returns the data as a data frame.
parse_drug_transporters: Extracts the drug transporters element and returns the data as a data frame.
parse_drug_snp_adverse_drug_reactions: Extracts the drug snp adverse drug reactions element and returns the data as a tibble.
parse_drug_targets_polypeptides_go_classifiers: Extracts the drug targets polypeptides go classifiers element and returns the data as a data frame.
parse_drug_sequences: Extracts the drug sequences element and returns the data as a data frame.
parse_drug_dosages: Extracts the drug dosages element and returns the data as a tibble.
parse_drug_element: Extracts the given drug elements and returns the data as a list of data frames.
parse_drug_calculated_properties: Extracts the drug calculated properties element and returns the data as a tibble.
parse_drug_element_options: Returns the valid options for parse_drug_element.
parse_drug_carriers_polypeptides_pfams: Extracts the drug carriers polypeptides pfams element and returns the data as a data frame.
parse_drug_enzymes_polypeptides_pfams: Extracts the drug enzymes polypeptides pfams element and returns the data as a data frame.
parse_drug_transporters_actions: Extracts the drug transporters actions element and returns the data as a data frame.
parse_drug_transporters_polypeptides_go_classifiers: Extracts the drug transporters polypeptides go classifiers element and returns the data as a data frame.
parse_drug_transporters_polypeptides_synonyms: Extracts the drug transporters polypeptides synonyms element and returns the data as a data frame.
parse_drug_targets_polypeptides_pfams: Extracts the drug targets polypeptides pfams element and returns the data as a data frame.
parse_drug_manufacturers: Extracts the drug manufacturers element and returns the data as a data frame.
parse_drug_carriers_polypeptides: Extracts the drug carriers polypeptides element and returns the data as a data frame.
parse_drug_carriers_links: Extracts the drug carriers links element and returns the data as a data frame.
parse_drug_transporters_textbooks: Extracts the drug transporters textbooks element and returns the data as a data frame.
parse_drug_transporters_polypeptides_pfams: Extracts the drug transporters polypeptides pfams element and returns the data as a data frame.
parse_drug_pathway_drugs: Extracts the drug pathway drugs element and returns the data as a data frame.
parse_drug_enzymes: Extracts the drug enzymes element and returns the data as a data frame.
parse_drug_pathway_enzyme: Extracts the drug pathway enzyme element and returns the data as a data frame.
parse_drug_food_interactions: Extracts the drug food interactions element and returns the data as a tibble.
parse_drug_groups: Extracts the drug groups element and returns the data as a tibble.
parse_drug_enzymes_polypeptides_synonyms: Extracts the drug enzymes polypeptides synonyms element and returns the data as a data frame.
parse_drug_mixtures: Extracts the drug mixtures element and returns the data as a tibble.
parse_drug_enzymes_polypeptides_go_classifiers: Extracts the drug enzymes polypeptides go classifiers element and returns the data as a data frame.
parse_drug_external_identifiers: Extracts the drug external identifiers element and returns the data as a tibble.
parse_drug_external_links: Extracts the drug external links element and returns the data as a tibble.
parse_drug_enzymes_polypeptides_external_identifiers: Extracts the drug enzymes polypeptides external identifiers element and returns the data as a data frame.
parse_drug_carriers_polypeptides_external_identifiers: Extracts the drug carriers polypeptides external identifiers element and returns the data as a data frame.
parse_drug_patents: Extracts the drug patents element and returns the data as a tibble.
parse_drug_pdb_entries: Extracts the drug pdb entries element and returns the data as a tibble.
parse_drug_prices: Extracts the drug prices element and returns the data as a data frame.
parse_drug_pathway: Extracts the drug pathway element and returns the data as a data frame.
parse_drug_snp_effects: Extracts the drug snp effects element and returns the data as a tibble.
parse_drug_packagers: Extracts the drug packagers element and returns the data as a tibble.
parse_drug_targets_articles: Extracts the drug targets articles element and returns the data as a data frame.
parse_drug_targets_polypeptides_external_identifiers: Extracts the drug targets polypeptides external identifiers element and returns the data as a data frame.
parse_drug_targets_polypeptides: Extracts the drug targets polypeptides element and returns the data as a data frame.
parse_drug_synonyms: Extracts the drug synonyms element and returns the data as a tibble.
parse_drug_reactions_enzymes: Extracts the drug reactions enzymes element and returns the data as a data frame.
parse_drug_targets: Extracts the drug targets element and returns the data as a data frame.
parse_drug_targets_polypeptides_synonyms: Extracts the drug targets polypeptides synonyms element and returns the data as a data frame.
parse_drug_targets_links: Extracts the drug targets links element and returns the data as a data frame.
parse_drug_targets_textbooks: Extracts the drug targets textbooks element and returns the data as a data frame.
parse_drug_transporters_articles: Extracts the drug transporters articles element and returns the data as a data frame.
parse_drug_salts: Extracts the drug salts element and returns the data as a tibble.
parse_drug_transporters_links: Extracts the drug transporters links element and returns the data as a data frame.
parse_drug_targets_actions: Extracts the drug targets actions element and returns the data as a data frame.
parse_drug_transporters_polypeptides: Extracts the drug transporters polypeptides element and returns the data as a data frame.
parse_drug_transporters_polypeptides_external_identifiers: Extracts the drug transporters polypeptides external identifiers element and returns the data as a data frame.