Overview
An exciting opportunity at the intersection of plant science, bioinformatics and artificial intelligence
About this opportunity
The Challenge
How will we feed 10 billion people by 2050 under a changing climate? This fundamental challenge requires innovative approaches to crop improvement. This interdisciplinary PhD project offers a unique opportunity to develop cutting-edge AI tools that could revolutionise how we identify beneficial genetic variants in crops, accelerating the discovery of traits for resilience and productivity.
Genomic technologies have transformed plant breeding. Massive initiatives have sequenced over 1,000 Arabidopsis and 3,000 rice genomes, identifying thousands of genetic variants associated with traits like drought tolerance or disease resistance. However, a critical bottleneck remains: we lack efficient tools to prioritise which variants genuinely affect protein function without expensive, time-consuming experiments.
History demonstrates the transformative potential of understanding protein variants. The Green Revolution that dramatically increased wheat yields in the 1960s-70s stemmed from mutations in DELLA genes, where single amino acid changes at sites of post-translational modifications (PTMs) created dwarf, high-yielding varieties. Despite such examples, plant research predominantly focuses on genomics rather than protein function—a gap this PhD will address.
Your Project
You will develop the first comprehensive AI-based tool for predicting functional effects of protein variants in plants. Drawing on extensive datasets from our BBSRC/NSF-funded PTMeXchange and PanOryza consortia, you will integrate PTM data, structural predictions, and trait-associated variants to create training datasets for machine learning models.
Your research will progress through four stages: assembling cross-plant PTM atlases and functional protein data (Year 1); developing and validating AI classifiers using traditional machine learning and cutting-edge transformer-based protein language models (Year 2); making testable predictions in key crop species and validating findings through collaborations (Year 3); and prototyping “ProtVar for plants” with our EMBL-EBI partners (Year 4). You will also undertake an industry placement to gain diverse experience.
Training and Environment
As part of the BBSRC-funded Liverpool-Manchester “NWD” doctoral partnership, you will receive comprehensive training in protein bioinformatics, data science, and AI methods, skills that are highly sought after in both academia and industry. You will join thriving bioinformatics teams supported by Wellcome Trust and BBSRC funding. A placement at EMBL-EBI will provide invaluable experience in developing production-grade software for the global research community, working alongside the team behind UniProtKB, the world’s leading protein database.
Impact
Your work will create open-source tools accessible to plant scientists worldwide, accelerating trait discovery across major crops. By bridging the gap between genomics and protein function, you will contribute directly to global food security whilst developing expertise at the forefront of AI and computational biology.