The problem
Even microbes that are usually harmless can evolve to become more dangerous, potentially triggering new outbreaks or even global pandemics. On top of that, antimicrobial resistance (AMR) is already a microbial public health crisis: bacteria are becoming less responsive to the drugs we rely on to treat infections, and this resistance now contributes to nearly five million deaths every year worldwide. My research aims to support public health efforts by using the genetic information of microbes to improve how we prescribe and use antibiotics. By combining microbial genomic data with electronic health records, we can help clinicians choose more effective treatments and take better precautions in hospitals, particularly in cases where resistant or highly transmissible bacteria are involved.
Today, we have access to the complete genetic sequences of over two million bacteria, thanks to samples collected from scientific research, hospital patients, agricultural systems, and the environment. Analysing such vast datasets is a major challenge, demanding scalable and efficient computational tools. At the same time, each step in the analysis introduces some level of error or uncertainty, which must be carefully accounted for if we want to draw reliable conclusions to inform healthcare decisions. Only now are we realising that knowing which specific bacterial strain infects a patient is not enough to predict whether it will respond to treatment. This is because some bacterial genes can evolve or move between microbes, and because other animals or sources can act as reservoirs where such genetic exchange can take place.
The solution
To address these challenges, I am using probabilistic models, specifically Bayesian approaches, to describe how certain genes evolve over time and interact with each other. These models help us describe the origins and histories of genes likely to be linked to resistance, transmission, or virulence. By associating these outcomes with probabilities, we can work with several possible scenarios at once. Such models are called phylogenomic since they use all the genetic information available (the genomes) to describe the evolution of the microbes, from the past to the present (the phylogenies). With such phylogenomic scenarios we can ask different questions; that is, we can look at different summary views of their outputs.
Solution to other problems
Besides helping to decide on antibiotic treatments, these models can also be used by the agrifood industry, wastewater treatment facilities, and veterinarians, amongst others, by focussing not on the pathogens but on the genetic information carried by all microbes. Under the One Health approach, microbes should not be studied in isolation but in connection with each other, the environment, and their hosts if we want to understand how they can affect us. These are therefore general models, easily translatable to other problems, even though summarising their output is task-specific (a pet veterinarian might not be interested in the same questions as a farmer).
I have a background in computational evolutionary biology, particularly in designing Bayesian models to understand how organisms change over time, as well as in microbial bioinformatics, which focuses on the computational analysis of the genetic material of microorganisms. During the COVID-19 pandemic, I was part of the efforts to uncover the evolutionary paths taken by the SARS-CoV-2 virus. One challenge was the deluge of genomic data, which we are now seeing for other microbes. Another was to translate our findings into actionable summaries, which led to the use of views such as “viral lineage” as a shortcut for describing the evolutionary history of the most prevalent strains.
For this case study, we may want to know which antibiotics should be used less often, in which case we restrict our view to which AMR genes are more common and how easily they are exchanged between bacteria. Another view looks at AMR genes that are unlikely to become prevalent in a particular setting, such as a hospital ward; asking this question of the data reveals the antibiotics that are more promising in such settings.
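As a rough illustration of what a "view" over probabilistic scenarios could look like, the sketch below treats the output of a Bayesian model as a list of equally weighted scenarios and summarises them in two ways. Everything in it is an assumption made for illustration: the gene names, the numbers, the threshold, and the function names are hypothetical and do not come from the project's actual models or data.

```python
from statistics import mean

# Hypothetical posterior samples: each dict is one plausible scenario,
# mapping a gene to its (prevalence, transfer_rate) in a hospital ward.
# Gene names and all numbers are invented for illustration only.
scenarios = [
    {"gene_A": (0.40, 0.30), "gene_B": (0.10, 0.05)},
    {"gene_A": (0.50, 0.20), "gene_B": (0.05, 0.10)},
    {"gene_A": (0.30, 0.40), "gene_B": (0.20, 0.05)},
    {"gene_A": (0.60, 0.30), "gene_B": (0.05, 0.00)},
]


def view_common_genes(samples):
    """View 1: average prevalence and transfer rate of each gene across scenarios."""
    genes = samples[0].keys()
    return {
        g: (mean(s[g][0] for s in samples), mean(s[g][1] for s in samples))
        for g in genes
    }


def view_unlikely_to_spread(samples, threshold=0.25):
    """View 2: fraction of scenarios in which each gene stays below `threshold` prevalence."""
    genes = samples[0].keys()
    return {
        g: sum(s[g][0] < threshold for s in samples) / len(samples)
        for g in genes
    }


if __name__ == "__main__":
    print(view_common_genes(scenarios))
    print(view_unlikely_to_spread(scenarios))
```

The same set of scenarios feeds both views: the first points to genes that are common and readily exchanged, while the second reports, for each gene, the share of scenarios in which it remains rare in the chosen setting. A real analysis would draw the scenarios from a fitted phylogenomic model rather than a hand-written list.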
Keywords: Public Health, Bayesian Phylogenetics, Electronic Health Record, Microbiome, Genomics