Advancing Untargeted Metabolomics through a Probabilistic and Context-Aware Annotation Pipeline

Overview

This project aims to develop an AI-powered, "context-aware" pipeline to automate metabolite annotation in untargeted metabolomics data. By combining Large Language Models (LLMs) with Bayesian statistics, it will help turn complex metabolomics data into biological insight.

About this opportunity

Metabolite annotation is one of the most pressing challenges in untargeted metabolomics data analysis. Current annotation tools often rely on simple mass-matching against static databases, leading to high false-positive rates. This project builds upon the Integrated Probabilistic Annotation (IPA) framework (Del Carratore et al., 2019; Del Carratore et al., 2023) to move beyond simple matching toward a “context-aware” system.
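As a rough illustration of the difference between simple mass-matching and a probabilistic, context-aware approach, the minimal sketch below combines a Gaussian likelihood on the mass error with a prior weight for each candidate. It is not the IPA implementation: the function name, the `ppm_sigma` parameter, the candidate names, the m/z values and the prior weights are all assumptions made up for this example.

```python
import math


def posterior_annotations(observed_mz, candidates, ppm_sigma=5.0):
    """Return {name: posterior probability} for one observed m/z value.

    candidates: list of (name, theoretical_mz, prior) tuples. The prior is a
    hypothetical context weight (for example, biological plausibility).
    ppm_sigma: assumed standard deviation of the mass error, in ppm.
    """
    weights = {}
    for name, theoretical_mz, prior in candidates:
        # Mass error in parts per million relative to the theoretical value
        ppm_error = (observed_mz - theoretical_mz) / theoretical_mz * 1e6
        # Gaussian likelihood of the observed error (unnormalised)
        likelihood = math.exp(-0.5 * (ppm_error / ppm_sigma) ** 2)
        weights[name] = prior * likelihood

    total = sum(weights.values())
    if total == 0.0:
        # No candidate is plausible at this mass accuracy
        return {name: 0.0 for name in weights}
    # Normalise so the posterior probabilities sum to one
    return {name: w / total for name, w in weights.items()}


# Illustrative candidates sharing the same theoretical m/z:
# mass alone cannot separate them, so the prior decides.
candidates = [
    ("candidate_A", 180.0634, 0.9),
    ("candidate_B", 180.0634, 0.1),
]
posterior = posterior_annotations(180.0636, candidates)
```

Because both candidates have the same mass error, a pure mass match would rank them equally; here the posterior simply follows the assumed priors, which is the kind of contextual information the project's database is intended to supply.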

The PhD candidate will lead three key objectives:

AI-Driven Database Curation: You will utilize Large Language Models (LLMs) to mine scientific literature and existing repositories to create a “context-aware” database. This database will encode vital metadata like retention times and biological likelihood to filter out false positives.
Platform Expansion: You will extend the computational framework to integrate data from emerging analytical platforms beyond LC-MS, including GC-MS, Ion Mobility-MS, and MALDI.
Software Engineering & GUI: To ensure global community adoption, you will develop a user-friendly Graphical User Interface (GUI), empowering non-bioinformaticians to utilize these advanced probabilistic methods.

Training and Collaboration

You will be embedded in Dr Del Carratore's lab, which focuses on Bioinformatics and Computational Biology. You will also collaborate closely with two world-class research facilities at the University of Liverpool, benefiting from a unique dual academic setting:

Computational Biology Facility (CBF): You will work within the CBF to develop high-quality code and AI models, gaining expertise in software engineering and LLM implementation.
Centre for Metabolomics Research (CMR): You will have direct access to state-of-the-art analytical platforms and the data they produce, allowing you to generate and validate experimental data. Prof. Warwick Dunn will provide mentorship on analytical chemistry aspects and user requirements.

Project Structure

The 3.5-year PhD is designed to transition you from a trainee to an independent leader in computational biology:

Year 1: Foundation and Advanced Training. Your first year focuses on mastering the computational skillsets required for the project, including bioinformatics, Bayesian statistics, and AI/LLM implementation. You will begin the initial development of the “context-aware” database by mining existing repositories.
Years 2-3: Implementation and Engagement. During this period, you will move into independent research, expanding the IPA framework to new analytical technologies like Ion Mobility-MS. You will also lead the development of the software GUI and present your findings at major international conferences, such as the annual meeting of the Metabolomics Society.
Final Phase: Thesis and Independent Research. The final six months are dedicated to completing your independent research, finalizing the open-source software for community release on GitHub, and writing your doctoral thesis.

Who is this for?

This degree is designed for ambitious graduates holding a first-class or high 2:1 honours degree (or an equivalent international qualification) in a quantitative or life science discipline, such as Bioinformatics, Computer Science, Engineering, Biochemistry, or Systems Biology. The ideal candidate will have a strong foundation in programming, particularly in Python, and a keen interest in applying AI, Large Language Models (LLMs), and Bayesian statistics to complex biological problems.

How to apply

1. Contact supervisors

Supervisors	Email address	Staff profile URL
Dr Francesco Del Carratore	Francesco.del-carratore@liverpool.ac.uk	https://www.liverpool.ac.uk/people/francesco-del-carratore
Professor Warwick Dunn	Warwick.Dunn@liverpool.ac.uk	https://www.liverpool.ac.uk/people/warwick-dunn
Professor Andy Jones	Andrew.Jones@liverpool.ac.uk	https://www.liverpool.ac.uk/people/andrew-jones

To express your interest in this project, please email your CV and a cover letter outlining your suitability to the primary supervisor, Dr Francesco Del Carratore, at Francesco.del-carratore@liverpool.ac.uk.

Following an initial review of these informal applications, shortlisted candidates will be invited for an interview and guided through the formal submission process via the University of Liverpool Application Portal.

2. Prepare your application documents

You may need the following documents to complete your online application:
- A research proposal (this should cover the research you’d like to undertake)
- University transcripts and degree certificates to date
- Passport details (international applicants only)
- English language certificates (international applicants only)
- A personal statement
- A curriculum vitae (CV)
- Contact details for two proposed supervisors
- Names and contact details of two referees

3. Apply

Finally, register and apply online. You'll receive an email acknowledgment once you've submitted your application. We'll be in touch with further details about what happens next.

Funding your PhD

This project is fully funded, covering tuition fees, bench fees and living expenses.

Contact us

Have a question about this research opportunity or studying a PhD with us? Please get in touch with us, using the contact details below, and we’ll be happy to assist you.

Dr Francesco Del Carratore