Using Deep Learning-based tools to better understand the interactions of viral proteins with host cells


Diseases caused by viruses are of major medical and veterinary importance, and recent experience with Covid-19 demonstrates the extreme practical importance of being able to predict host-virus specificity and of fully understanding the roles of viral protein products. This project employs state-of-the-art methods, including the Deep Learning-based structure prediction software AlphaFold 2 (AF2), to address these key questions.

Current methods for predicting host-virus specificity are largely based on high-level genomic features and do not consider the underlying atomic-level molecular interactions [1]. Our hypothesis is that explicit atomic-level modelling of the relevant molecular interactions will further improve predictive performance. This approach has not been explored at scale.

Many viruses contain open reading frames (ORFs) whose transcription and function are unknown. SARS‑CoV‑2 exemplifies how such gene products can play important roles in manipulating host cell behaviour. The project therefore also encompasses the structure-based function annotation of putative viral accessory gene products, especially with DL-based modelling methods such as AF2. AF2 emerged in late 2020 [2] and has revolutionised protein structure prediction [3]. In favourable cases, which are the majority, structure-based function annotation can be applied to AF2 models as it would be to experimental structures. We will study the putative products of viral ORFs, starting with a test case, e.g. the poxviruses. Where ORFs are conserved across a range of viruses, an initial sign of their potential relevance, we will make models with AF2. We will first focus on confidently predicted folded domains, garnering initial function hypotheses from any structural resemblance to entries in the PDB or the AlphaFold Database. These hypotheses will be elaborated and tested using bottom-up structure-based methods, including the DL-based methods PeSTo and DeepFRI, as well as electrostatic, sequence conservation and geometric analyses [4].
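As a rough illustration of what focusing on confidently predicted folded domains could involve, the sketch below reads per-residue pLDDT from the B-factor column of an AF2 model in PDB format (where AlphaFold writes it) and reports contiguous stretches above a cut-off. The function names, the cut-off of 70 and the minimum length of 30 residues are illustrative assumptions, not values specified by the project.

```python
from typing import List, Optional, Tuple


def ca_plddt(pdb_text: str) -> List[Tuple[int, float]]:
    """Return (residue number, pLDDT) for every CA atom in PDB-format text.

    AlphaFold stores pLDDT in the B-factor field (columns 61-66).
    """
    residues = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            residues.append((int(line[22:26]), float(line[60:66])))
    return residues


def confident_segments(
    residues: List[Tuple[int, float]],
    threshold: float = 70.0,  # assumed cut-off for "confident"
    min_len: int = 30,        # assumed minimum length of a folded domain
) -> List[Tuple[int, int]]:
    """Return (start, end) residue ranges where pLDDT stays >= threshold."""
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None
    prev: Optional[int] = None
    for resnum, plddt in residues:
        ok = plddt >= threshold
        if ok and start is not None and resnum == prev + 1:
            prev = resnum  # extend the current confident run
            continue
        # The run (if any) ends here: keep it if it is long enough.
        if start is not None and prev - start + 1 >= min_len:
            segments.append((start, prev))
        start = prev = resnum if ok else None
    if start is not None and prev - start + 1 >= min_len:
        segments.append((start, prev))
    return segments
```

A call such as `confident_segments(ca_plddt(open("model.pdb").read()))` (file name hypothetical) would list candidate regions to carry forward into structure comparison; proper domain parsing would need more than a pLDDT filter.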

The AF2 derivative AlphaFold-Multimer has proven extraordinarily powerful in modelling complexes, and an array of interface scores can distinguish authentic from non-existent interactions. We will deploy it here for host-virus specificity prediction, using poxviruses as a first test case. The student will model a representative selection of virus-receptor interactions, sampling positive, negative and untested pairs, and will assess the ability of interface analyses to reproduce known patterns of interaction, using Machine Learning to combine different signals.
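One simple way to combine several interface signals into a single interaction probability is logistic regression; the dependency-free sketch below shows the idea. The choice of model, the two example features (an ipTM-like score and a mean interface pLDDT scaled to 0-1) and every number in the toy data are invented placeholders for illustration only. They are not project results, and the source does not state which signals or learning method will actually be used.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability that a modelled pair is a true interaction."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(
    X: List[Sequence[float]],
    y: Sequence[int],
    lr: float = 0.5,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by full-batch gradient descent on log-loss."""
    n, n_feat = len(X), len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy placeholder data: [ipTM-like score, mean interface pLDDT / 100].
X_toy = [[0.85, 0.90], [0.78, 0.82], [0.80, 0.88],   # labelled interacting
         [0.20, 0.45], [0.30, 0.50], [0.25, 0.40]]   # labelled non-interacting
y_toy = [1, 1, 1, 0, 0, 0]
w_toy, b_toy = fit_logistic(X_toy, y_toy)
```

With real data, the fitted model would be evaluated on held-out virus-receptor pairs rather than on the pairs used for training, and untested pairs would then be scored with `predict_proba`.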

The student will receive extensive training in Deep Learning-based tools for structure and function prediction, as well as a thorough grounding in more traditional tools for visualisation and analysis of protein structures and complexes. The primary supervisor has internationally recognised expertise in these areas (𝕏 @DanielRigden). The second and third supervisors (𝕏 @mishcka) have extensive track records in computational aspects of virology. Overall, the project will train the student to combine complementary skills (protein structural bioinformatics, computer science, integrative virology) in a single integrated project. All supervisors foster a supportive, open and collaborative environment in their labs and fully subscribe to the principles of EDI.

Benefits of being in the DiMeN DTP:

This project is part of the Discovery Medicine North Doctoral Training Partnership (DiMeN DTP), a diverse community of PhD students across the North of England researching the major health problems facing the world today. Our partner institutions (Universities of Leeds, Liverpool, Newcastle, York and Sheffield) are internationally recognised as centres of research excellence and can offer you access to state-of-the-art facilities to deliver high-impact research.

We are very proud of our student-centred ethos and committed to supporting you throughout your PhD. As part of the DTP, we offer bespoke training in key skills sought after in early career researchers, as well as opportunities to broaden your career horizons in a range of non-academic sectors.

Being funded by the MRC means you can access additional funding for research placements, international training opportunities or internships in science policy, science communication and beyond. See how our current DiMeN students have benefited from this funding here: 

Further information on the programme and how to apply can be found on our website: 


Open to students worldwide

Funding information

Funded studentship

Studentships are fully funded by the Medical Research Council (MRC) for 4 years. Funding will cover tuition fees, stipend (£18,622 p.a. for 2023/24) and project costs. We also aim to support the most outstanding applicants from outside the UK and are able to offer a limited number of full studentships to international applicants. Please read additional guidance here: View Website
Studentships commence: 1st October 2024
Good luck!



  1. Wardeh M, Baylis M, Blagrove MSC. Predicting mammalian hosts in which novel coronaviruses can be generated. Nat Commun. 2021; 12: 780.
  2. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021; 596: 583–589.
  3. Simpkin AJ, Mesdaghi S, Sánchez Rodríguez F, Elliott L, Murphy DL, Kryshtafovych A, Keegan RM, Rigden DJ. Tertiary structure assessment at CASP15. Proteins. 2023. doi: 10.1002/prot.26593.
  4. Rigden DJ (ed). From Protein Structure to Function with Bioinformatics. Springer; 2017.