dbparser v1.0.3

'DrugBank' Database XML Parser

This tool is for parsing the 'DrugBank' XML database <http://drugbank.ca/>. The parsed data are then returned in a proper 'R' dataframe with the ability to save them in a given database.

Readme

dbparser

The main purpose of the dbparser package is to parse the DrugBank database, which is downloadable in XML format from the DrugBank website <http://drugbank.ca/>. The parsed data can then be explored and analyzed as the user wishes. The package can also save the parsed data into a given database.

Installation

You can install the released version of dbparser from CRAN with:

install.packages("dbparser")

Alternatively, you can install the latest updates directly from the GitHub repository:

library(devtools)
devtools::install_github("Dainanahan/dbparser")

Example

This is a basic example which shows you how to solve a common problem:

library(dbparser)

## parse data from XML and save it to memory
get_xml_db_rows(
    system.file("extdata", "drugbank_record.xml", package = "dbparser")
)

## load drugs data
drugs <- parse_drug()

## load drug groups data
drug_groups <- parse_drug_groups()

## load drug targets actions data
drug_targets_actions <- parse_drug_targets_actions()

Saving into a database

The parsed data may be saved into a given database. Databases supported by dbparser include MS SQL Server, MySQL and any database supported by the DBI package. The following example saves the parsed data into an MS SQL Server database.

library(dbparser)

## open a connection to the desired database engine with an already
## created database
open_db(xml_db_name = "drugbank.xml", driver = "SQL Server",
        server = "ServerName\\\\SQL2016", output_database = "drugbank")

## save 'drugs' dataframe to DB
parse_drug(TRUE)

## save 'drug_groups' dataframe to DB
parse_drug_groups(TRUE)

## save 'drug_targets_actions' dataframe to DB
parse_drug_targets_actions(TRUE)

## finally close db connection
close_db()
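
dbparser's own write path is not shown on this page. As a rough illustration of what "any database supported by the DBI package" can look like in practice, the sketch below writes a toy data frame to an in-memory SQLite database through DBI and reads it back. It assumes the DBI and RSQLite packages are installed. The table name `drug` and all values are hypothetical, and the sketch does not call dbparser at all.

```r
library(DBI)

## toy stand-in for a parsed data frame (values are hypothetical)
drugs_toy <- data.frame(
    primary_key = c("drug_a", "drug_b"),
    type = c("biotech", "small molecule"),
    stringsAsFactors = FALSE
)

## open an in-memory SQLite database through the generic DBI interface
con <- dbConnect(RSQLite::SQLite(), ":memory:")

## write the data frame as a table, then read it back to confirm
dbWriteTable(con, "drug", drugs_toy, overwrite = TRUE)
read_back <- dbReadTable(con, "drug")
print(read_back)

## release the connection when done
dbDisconnect(con)
```

Because the connection object is created by a DBI backend, swapping SQLite for another DBI-compliant driver should only change the `dbConnect()` call.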

Exploring the data

Following is an example involving a quick look at a few aspects of the parsed data. First we look at the proportions of biotech and small-molecule drugs in the data.

library(dplyr)
library(ggplot2)

## view proportions of the different drug types (biotech vs. small molecule)
drugs %>% 
    select(type) %>% 
    ggplot(aes(x = type)) + 
    geom_bar() + 
    guides(fill = "none")     ## removes legend for the bar colors
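
The bar chart shows counts per type. The proportions themselves can also be computed directly. Here is a minimal sketch in base R, assuming a data frame with a `type` column as in the example above. The name `drugs_toy` and its rows are made up for illustration and are not DrugBank data.

```r
## toy stand-in for the parsed 'drugs' data frame (values are hypothetical)
drugs_toy <- data.frame(
    primary_key = c("drug_a", "drug_b", "drug_c", "drug_d"),
    type = c("biotech", "small molecule", "small molecule", "small molecule"),
    stringsAsFactors = FALSE
)

## count each drug type, then convert the counts to proportions
type_counts <- table(drugs_toy$type)
type_props <- prop.table(type_counts)

print(round(type_props, 2))
```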

Below, we view the different drug_groups in the data and how prevalent they are.

## view proportions of the different drug types for each drug group
drugs %>% 
    rename(parent_key = primary_key) %>% 
    full_join(drug_groups, by = 'parent_key') %>% 
    select(type, text) %>% 
    ggplot(aes(x = text, fill = type)) + 
    geom_bar() + 
    theme(legend.position = 'bottom') + 
    labs(x = 'Drug Group', 
         y = 'Quantity', 
         title = "Drug Type Distribution per Drug Group", 
         caption = "created by ggplot") + 
    coord_flip()
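
The join in this example relies on `primary_key` in the drugs data matching `parent_key` in the drug groups data. The base R sketch below shows the same idea on toy data, using `merge()` in place of dplyr's `full_join()`. The data frame names and all values are hypothetical; only the column names come from the example above.

```r
## toy stand-ins for the parsed data frames (all values are hypothetical)
drugs_toy <- data.frame(
    primary_key = c("drug_a", "drug_b", "drug_c"),
    type = c("biotech", "small molecule", "small molecule"),
    stringsAsFactors = FALSE
)
drug_groups_toy <- data.frame(
    parent_key = c("drug_a", "drug_b", "drug_b", "drug_c"),
    text = c("approved", "approved", "investigational", "experimental"),
    stringsAsFactors = FALSE
)

## full outer join: primary_key in drugs matches parent_key in drug groups
joined <- merge(drugs_toy, drug_groups_toy,
                by.x = "primary_key", by.y = "parent_key", all = TRUE)

## cross-tabulate drug group against drug type
group_by_type <- table(group = joined$text, type = joined$type)
print(group_by_type)
```

A drug that belongs to several groups appears once per group after the join, so the counts are per drug-group pair rather than per drug.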

Finally, we look at the drug_targets_actions to observe their proportions as well.

## get counts of the different target actions in the data
targetActionCounts <- 
    drug_targets_actions %>% 
    group_by(text) %>% 
    summarise(count = n()) %>% 
    arrange(desc(count))

## get bar chart of the 10 most occurring target actions in the data
p <- 
    ggplot(targetActionCounts[1:10,], 
           aes(x = reorder(text,count), y = count, fill = letters[1:10])) + 
    geom_bar(stat = 'identity') +
    labs(fill = 'action', 
         x = 'Target Action', 
         y = 'Quantity', 
         title = 'Target Actions Distribution', 
         subtitle = 'Distribution of Target Actions in the Data',
         caption = 'created by ggplot') + 
    guides(fill = "none") +   ## removes legend for the bar colors
    coord_flip()              ## switches the X and Y axes

## display plot
p
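
Indexing rows `1:10` assumes the data contain at least ten distinct target actions. A hedged base R sketch of the same count-and-rank step, using `head()` so that fewer than ten actions does not produce empty rows, is given below. The data frame `actions_toy` and its values are hypothetical; only the `text` column name comes from the example above.

```r
## toy stand-in for 'drug_targets_actions' (values are hypothetical)
actions_toy <- data.frame(
    parent_key = c("t1", "t2", "t3", "t4", "t5", "t6"),
    text = c("inhibitor", "agonist", "inhibitor",
             "antagonist", "inhibitor", "agonist"),
    stringsAsFactors = FALSE
)

## count each action; the result has columns 'text' and 'Freq'
action_counts <- as.data.frame(table(text = actions_toy$text),
                               stringsAsFactors = FALSE)

## sort from most to least frequent
action_counts <- action_counts[order(-action_counts$Freq), ]

## head() returns at most n rows, so it is safe when fewer than n exist
top_actions <- head(action_counts, n = 10)
print(top_actions)
```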

Functions in dbparser

Name Description
dbparser dbparser: A package for reading and parsing the DrugBank XML database, with the option to save it in a given database.
get_drugbank_metadata Return uploaded drugbank database metadata
get_drugbank_exported_date Return uploaded drugbank database exported date
close_db Close open drug bank sql database
parse_drug_ahfs_codes Extracts the drug ahfs codes element and return data as tibble.
parse_drug_affected_organisms Extracts the drug affected organisms element and return data as tibble.
get_drugbank_version Return uploaded drugbank database version
parse_drug_calculated_properties Extracts the drug calculated properties element and return data as tibble.
parse_drug_carriers Extracts the drug carriers element and return data as data frame.
parse_drug_atc_codes Extracts the drug atc codes element and return data as data frame.
parse_drug_books Extracts the drug books element and return data as tibble.
get_xml_db_rows Reads the DrugBank XML database and loads it into memory.
parse_drug_carriers_links Extracts the drug carriers links element and return data as data frame.
parse_drug_carriers_polypeptides Extracts the drug carriers polypeptides element and return data as data frame.
parse_drug_carriers_actions Extracts the drug carriers actions element and return data as data frame.
parse_drug_carriers_articles Extracts the drug carriers articles element and return data as data frame.
parse_drug_dosages Extracts the drug dosages element and return data as tibble.
parse_drug_classification Extracts the drug classifications element and return data as data frame.
parse_drug Extracts the main drug elements and return data as tibble.
parse_drug_carriers_textbooks Extracts the drug carriers textbooks element and return data as data frame.
open_db Establish connection to given data base
parse_drug_carriers_polypeptides_synonyms Extracts the drug carriers polypeptides synonyms element and return data as data frame.
parse_drug_carriers_polypeptides_pfams Extracts the drug carriers polypeptides pfams element and return data as data frame.
parse_drug_all Extracts all drug elements and return data as list of dataframes.
parse_drug_categories Extracts the drug categories element and return data as tibble.
parse_drug_carriers_polypeptides_external_identifiers Extracts the drug carriers polypeptides external identifiers element and return data as data frame.
parse_drug_carriers_polypeptides_go_classifiers Extracts the drug carriers polypeptides go classifiers element and return data as data frame.
parse_drug_articles Extracts the drug articles element and return data as tibble.
parse_drug_element_options Returns parse_drug_element valid options.
parse_drug_element Extracts the given drug elements and return data as list of dataframes.
parse_drug_enzymes_polypeptides Extracts the drug enzymes polypeptides element and return data as data frame.
parse_drug_enzymes_polypeptides_external_identifiers Extracts the drug enzymes polypeptides external identifiers element and return data as data frame.
parse_drug_enzymes_actions Extracts the drug enzymes actions element and return data as data frame.
parse_drug_enzymes Extracts the drug enzymes element and return data as data frame.
parse_drug_enzymes_polypeptides_pfams Extracts the drug enzymes polypeptides pfams element and return data as data frame.
parse_drug_enzymes_polypeptides_go_classifiers Extracts the drug enzymes polypeptides go classifiers element and return data as data frame.
parse_drug_enzymes_textbooks Extracts the drug enzymes textbooks element and return data as data frame.
parse_drug_enzymes_polypeptides_synonyms Extracts the drug enzymes polypeptides synonyms element and return data as data frame.
parse_drug_enzymes_articles Extracts the drug enzymes articles element and return data as data frame.
parse_drug_enzymes_links Extracts the drug enzymes links element and return data as data frame.
parse_drug_food_interactions Extracts the drug food interactions element and return data as tibble.
parse_drug_external_links Extracts the drug external links element and return data as tibble.
parse_drug_links Extracts the drug links element and return data as tibble.
parse_drug_international_brands Extracts the drug international brands element and return data as tibble.
parse_drug_pathway_drugs Extracts the drug pathway drugs element and return data as data frame.
parse_drug_pathway Extracts the drug pathway element and return data as data frame.
parse_drug_reactions Extracts the drug reactions element and return data as data frame.
parse_drug_manufacturers Extracts the drug manufacturers element and return data as data frame.
parse_drug_external_identifiers Extracts the drug external identifiers element and return data as tibble.
parse_drug_mixtures Extracts the drug mixtures element and return data as tibble.
parse_drug_reactions_enzymes Extracts the drug reactions enzymes element and return data as data frame.
parse_drug_experimental_properties Extracts the drug experimental properties element and return data as tibble.
parse_drug_prices Extracts the drug prices element and return data as data frame.
parse_drug_products Extracts the drug products element and return data as tibble.
parse_drug_packagers Extracts the drug packagers element and return data as tibble.
parse_drug_patents Extracts the drug patents element and return data as tibble.
parse_drug_targets_articles Extracts the drug targets articles element and return data as data frame.
parse_drug_targets_actions Extracts the drug targets actions element and return data as data frame.
parse_drug_transporters_actions Extracts the drug transporters actions element and return data as data frame.
parse_drug_transporters_articles Extracts the drug transporters articles element and return data as data frame.
parse_drug_targets_polypeptides_go_classifiers Extracts the drug targets polypeptides go classifiers element and return data as data frame.
parse_drug_sequences Extracts the drug sequences element and return data as data frame.
parse_drug_targets_polypeptides_external_identifiers Extracts the drug targets polypeptides external identifiers element and return data as data frame.
parse_drug_salts Extracts the drug salts element and return data as tibble.
parse_drug_transporters_links Extracts the drug transporters links element and return data as data frame.
parse_drug_transporters_polypeptides_pfams Extracts the drug transporters polypeptides pfams element and return data as data frame.
parse_drug_transporters_polypeptides Extracts the drug transporters polypeptides element and return data as data frame.
parse_drug_transporters_polypeptides_synonyms Extracts the drug transporters polypeptides synonyms element and return data as data frame.
parse_drug_targets_links Extracts the drug targets links element and return data as data frame.
parse_drug_groups Extracts the drug groups element and return data as tibble.
parse_drug_synonyms Extracts the drug synonyms element and return data as tibble.
parse_drug_transporters_polypeptides_external_identifiers Extracts the drug transporters polypeptides external identifiers element and return data as data frame.
parse_drug_pathway_enzyme Extracts the drug pathway enzyme element and return data as data frame.
parse_drug_interactions Extracts the drug interactions element and return data as tibble.
parse_drug_pdb_entries Extracts the drug pdb entries element and return data as tibble.
parse_drug_transporters_polypeptides_go_classifiers Extracts the drug transporters polypeptides go classifiers element and return data as data frame.
parse_drug_transporters_textbooks Extracts the drug transporters textbooks element and return data as data frame.
parse_drug_targets Extracts the drug targets element and return data as data frame.
parse_drug_snp_adverse_drug_reactions Extracts the drug snp adverse drug reactions element and return data as tibble.
parse_drug_targets_textbooks Extracts the drug targets textbooks element and return data as data frame.
parse_drug_snp_effects Extracts the drug snp effects element and return data as tibble.
parse_drug_transporters Extracts the drug transporters element and return data as data frame.
parse_drug_targets_polypeptides_pfams Extracts the drug targets polypeptides pfams element and return data as data frame.
parse_drug_targets_polypeptides_synonyms Extracts the drug targets polypeptides synonyms element and return data as data frame.
parse_drug_targets_polypeptides Extracts the drug targets polypeptides element and return data as data frame.

Vignettes of dbparser

Name
dbparser.Rmd
fig1.png
fig2.png
fig3.png

Details

License MIT + file LICENSE
Encoding UTF-8
LazyData true
RoxygenNote 6.1.1
VignetteBuilder knitr
URL https://dainanahan.github.io/dbparser/index.html
BugReports https://github.com/Dainanahan/dbparser/issues
NeedsCompilation no
Packaged 2019-07-11 15:41:06 UTC; Mohammed
Repository CRAN
Date/Publication 2019-07-11 22:21:28 UTC