Paper published: semantic deep learning

Published on 22 November 2019

Image linking animals, humans and environment to show One Health

Our latest paper led by Dr Mercedes Arguello-Casteleiro at University of Manchester has been published. This work explores how to turn information about diagnoses, prognoses, therapies and other clinical concepts into computable knowledge using free-text data about human and animal health

Deep Learning opens up opportunities for routinely scanning large bodies of biomedical literature and clinical narratives to represent the meaning of biomedical and clinical terms. However, the validation and integration of this knowledge on a scale requires cross checking with ground truths (i.e. evidence-based resources) that are unavailable in an actionable or computable form.
We used a Semantic Deep Learning approach that combines the Semantic Web technologies and Deep Learning to acquire and validate knowledge about 11 well-known medical conditions mined from two sets of unstructured free-text data: 300 K PubMed Systematic Review articles (the PMSB dataset) and 2.5 M veterinary clinical notes (the VetCN dataset).
For each target condition we obtained 20 related clinical concepts using two deep learning methods applied separately on the two datasets, resulting in 880 term pairs (target term, candidate term). Each concept, represented by an n-gram, is mapped to UMLS using MetaMap; we also developed a bespoke method for mapping short forms (e.g. abbreviations and acronyms).
Existing ontologies were used to formally represent associations. We also create ontological modules and illustrate how the extracted knowledge can be queried. The evaluation was performed using the content within BMJ Best Practice.
MetaMap achieves an F measure of 88% (precision 85%, recall 91%) when applied directly to the total of 613 unique candidate terms for the 880 term pairs. When the processing of short forms is included, MetaMap achieves an F measure of 94% (precision 92%, recall 96%). Validation of the term pairs with BMJ Best Practice yields precision between 98 and 99%.
The Semantic Deep Learning approach can transform neural embeddings built from unstructured free-text data into reliable and reusable One Health knowledge using ontologies and content from BMJ Best Practice.

Access the paper here

Read the infographic summarising this work here