epos

Analysis and Visualization of statistical information derived from biomedical named entities that were automatically extracted with a UIMA-based text mining workflow on the corpus of BioASQ. The major scope of this R package is the comparison of drug names that co-occur with entities from epilepsy ontologies in the same documents.

The UIMA-based workflow takes dictionaries of biomedical entities and their synonyms as input and uses them to identify those entities in documents of the BioASQ corpus. The epilepsy ontologies EpSO, ESSO, EPILONT, EPISEM and FENICS are used to create the epilepsy-related dictionaries. The dictionary of drug names is created from the current version of the DrugBank open data vocabulary (https://go.drugbank.com/releases/latest#open-data).

The UIMA-based text mining workflow is described in the following publications:

Müller B, Hagelstein A (2016) Beyond Metadata – Enriching Life Science Publications in LIVIVO with Semantic Entities from the Linked Data Cloud. In: Joint Proceedings of the Posters and Demos Track of the 12th International Conference on Semantic Systems – SEMANTiCS2016 and the 1st International Workshop on Semantic Change & Evolving Semantics SuCCESS’16, Leipzig, Germany doi:10.4126/FRL01-006408558

Müller B, Hagelstein A, Gübitz T (2016) Life Science Ontologies in Literature Retrieval: A Comparison of Linked Data Sets for Use on Semantic Search on a Heterogeneous Corpus. In: Proceedings of the 20th International Conference on Knowledge Engineering and Knowledge Management. Bologna, Italy doi:10.1007/978-3-319-58694-6_22

Müller B, Rebholz-Schuhmann D (2020) Selected Approaches Ranking Contextual Term for the BioASQ Multi-label Classification (Task6a and 7a). In: Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases ECML PKDD 2019, Würzburg, Germany doi:10.1007/978-3-030-43887-6_52

Müller, B., Castro, L.J. & Rebholz-Schuhmann, D. Ontology-based identification and prioritization of candidate drugs for epilepsy from literature. J Biomed Semant 13, 3 (2022). doi:10.1186/s13326-021-00258-w

Please cite this work as:

Bernd Müller. R-package for the Analysis and Visualization of Epilepsy Ontologies' Similarities According to Co-Occurring Drug Names in the 2021 BioASQ corpus. ZENODO, 10.5281/zenodo.4682869

Install

install.packages('epos')

Monthly Downloads

187

Version

1.1

License

LGPL (>= 3)

Maintainer

Bernd Mueller

Last Published

March 15th, 2024

Functions in epos (1.1)

drawVenn5

Create quintuple Venn diagram for overlapping concepts between EpSO, ESSO, EPILONT, EPISEM and FENICS
drawVenn4DrugDoc

Create quad Venn diagram for shared documents with co-occurrences of drug names between EpSO, ESSO, EPILONT and EPISEM
drawVenn4Syn

Create quad Venn diagram for shared synonyms between EpSO, ESSO, EPILONT and EPISEM
jaccard

Calculate the Jaccard similarity metric for two sets a and b
plotDSEA

Plotting functions for DSEA lists
printTop10Drugs

Print Top 10 Drugs
plotEnrichment

Plotting functions for enrichment lists
sortTableByRefMatches

Sort table by scoring for each row
filterNeuroDrugs

Filter a given list of drug names, keeping those whose ATC code starts with N, which indicates a drug for the nervous system
doFullPlot

Does the full plot on one page
dice

Calculate the Dice similarity metric
genDictListFromRawFreq

Clears an object that was loaded from the hard drive into a list of terms sorted by frequency
drawVenn4Doc

Create quad Venn diagram for shared documents with co-occurrences of drug names between EpSO, ESSO, EPILONT and EPISEM
drawVenn4

Create quad Venn diagram for overlapping concepts between EpSO, ESSO, EPILONT and EPISEM
getRefAll

Retrieve the list of drugs from the union of all reference lists
createTanimotoBaseline

Creates the plot for all jaccard coefficients amongst the three epilepsy ontologies
drawVennGrid

Create plot_grid from multiple plots
rawDrugNamesCoOcESSO

List drug terms with their frequency co-occurring with terms from the ESSO ontology in publications since 2015 from the BioASQ 2020 corpus.
getTermMatrix

Receives a sorted hashmap with found entities from a dictionary
rawDrugNamesCoOcEpSO

List drug terms with their frequency co-occurring with terms from the EpSO ontology in publications since 2015 from the BioASQ 2020 corpus.
rawDrugNamesCoOcEPILONT

List drug terms with their frequency co-occurring with terms from the EPILONT ontology in publications since 2015 from the BioASQ 2020 corpus.
rawDrugNamesCoOcEPISEM

List drug terms with their frequency co-occurring with terms from the EPISEM ontology in publications since 2015 from the BioASQ 2020 corpus.
filterApprovedDrugs

Filter a given list of drug names, keeping those that have an ATC code and dropping the rest
drawVenn5Doc

Create quintuple Venn diagram for shared documents between EpSO, ESSO, EPILONT, EPISEM and FENICS
readAtcMapIntoHashMapDrugNamesAtcCodes

Processes the input file db-atc.map to form a HashMap containing the drug names with ATC codes
rawDrugNamesCoOcFENICS

List drug terms with their frequency co-occurring with terms from the FENICS ontology in publications from the BioASQ 2020 corpus.
drawVenn5DrugDoc

Create quintuple Venn diagram for shared documents with co-occurrences of drug names between EpSO, ESSO, EPILONT, EPISEM and FENICS
readSecondLevelATC

Read the second level ATC classes from the file atc-secondlevel.map
drawVenn5Syn

Create quintuple Venn diagram for shared synonyms between EpSO, ESSO, EPILONT, EPISEM and FENICS
readAtcMapIntoHashMapAtcCodesAtcNames

Processes the input file db-atc.map to form a HashMap containing the drug names with ATC codes
calcDice

Calculate the Dice similarity metric for two lists a and b
calcCosine

Calculate the cosine similarity metric for two lists a and b
createBaseTable

Main function to call everything and produce the results
calcEnrichment

Calculate enrichment of one list in comparison to reference list
calcDSEA

Calculate DSEA scores of one list in comparison to a reference list
createJaccardPlotDBMeSH

Creates the plot for all jaccard coefficients amongst the three epilepsy ontologies
cosine

Calculate cosine similarity metric
createDashVectorForATC

Creates a vector with an X at each position where a drug from the druglist matches the ATC class list slatc
createJaccardPlotMeSHFive

Creates the plot for all jaccard coefficients amongst the three epilepsy ontologies
calcJaccard

Calculate the Jaccard coefficient for two lists a and b
createNeuroTable

Create the final resulting data frame
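
Several of the functions listed above compute Jaccard, Dice and cosine similarity between lists of drug names. As a rough illustration of what these metrics measure, the following base R sketch treats each list as a set. The function names ending in `_sketch` and the two example drug lists are hypothetical and are not part of epos; the cosine variant shown is the binary-set form (intersection size divided by the geometric mean of the set sizes), which is an assumption and may differ from how the package computes it.

```r
# Illustrative set-based similarity helpers.
# Hypothetical names, NOT the epos API; inputs are treated as sets.

jaccard_sketch <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  u <- length(union(a, b))
  if (u == 0) return(NA_real_)          # both sets empty: undefined
  length(intersect(a, b)) / u           # |A n B| / |A u B|
}

dice_sketch <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  denom <- length(a) + length(b)
  if (denom == 0) return(NA_real_)
  2 * length(intersect(a, b)) / denom   # 2 |A n B| / (|A| + |B|)
}

cosine_sketch <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  denom <- sqrt(length(a) * length(b))
  if (denom == 0) return(NA_real_)
  length(intersect(a, b)) / denom       # |A n B| / sqrt(|A| |B|), assumed binary form
}

# Made-up example lists, for illustration only
drugs_a <- c("valproic acid", "lamotrigine", "carbamazepine")
drugs_b <- c("lamotrigine", "carbamazepine", "levetiracetam", "phenytoin")

jaccard_sketch(drugs_a, drugs_b)  # 2 / 5
dice_sketch(drugs_a, drugs_b)     # 4 / 7
cosine_sketch(drugs_a, drugs_b)   # 2 / sqrt(12)
```

All three values lie between 0 (no shared drug names) and 1 (identical sets). For the same pair of sets, Jaccard is never larger than Dice, so the choice of metric changes the scale of the scores but not which pairs count as more similar.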