Can we use machine learning to predict viral spill-overs and the next pandemic virus?


Two-thirds of emerging human diseases are zoonotic, caused by pathogens that spill over from animals into human populations. These include every entry on the WHO list of priority diseases, among them Lassa, Zika, and SARS-CoV-2. This pattern is mirrored in animal viruses such as Schmallenberg and Usutu, both of which recently emerged from, and currently circulate in, wild animals. More concerning still, viruses of public and veterinary health concern are emerging with increasing frequency, now every 2-3 years, and if left unchecked this trend will likely continue to escalate.

Early detection of viruses with the potential to cause health concerns is therefore of utmost importance. However, whilst most viral spillover events are dead ends in which the virus is unable to persist in the new host species, some, like those above, can result in global emergencies. The most pressing questions are: which viruses, and why? You will develop tools to enable scientists and policy makers to answer these questions.

Part 1 - What determines the route of virus transmission?

Viruses can have a range of transmission routes between their hosts. These routes often differ when a virus is transmitted within its ‘natural’ reservoir populations, compared to when it is transmitted to new species. The routes available to a virus play an important role in whether an emerging virus can achieve sustained transmission within the population of the novel host species. You will therefore first develop machine-learning tools to uncover the virus and host characteristics that determine the routes by which viruses are transmitted to their hosts.

Part 2 - Which host and virus characteristics are associated with a) sustained transmission within a species, and b) the potential to spill-over into new species?

In collaboration with our partner group at Newcastle University, you will investigate which host characteristics are most important for establishing transmission, and how to transform virus sequences into “features” (variables from which machine-learning algorithms can learn). You will also investigate how to integrate networks linking viruses and their hosts into machine-learning frameworks.
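As an illustration of what turning a virus sequence into “features” can mean, the sketch below computes normalised k-mer frequencies (the proportions of each short nucleotide word) from a sequence. This is only one common featurisation and is not necessarily the one the project will use; the function name `kmer_features`, the choice of k = 2, and the example sequence are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_features(sequence, k=2):
    """Return normalised k-mer frequencies for a nucleotide sequence.

    Hypothetical helper for illustration only.
    """
    alphabet = "ACGT"
    # Treat RNA and DNA alike by mapping U to T.
    seq = sequence.upper().replace("U", "T")
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed vocabulary so that every sequence yields the same columns.
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows containing ambiguous bases (e.g. N) are ignored in the total.
    total = sum(counts[kmer] for kmer in vocab)
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in vocab}


# Made-up example sequence, not a real viral genome.
features = kmer_features("ATGCGATACG", k=2)
```

Each sequence, whatever its length, is mapped to the same fixed set of 4^k numeric columns, which is the form most machine-learning algorithms expect as input.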

You will then extend those tools to establish which virus characteristics are associated with sustained within-species transmission, and which lead to dead-end transmission.
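The virus-host networks mentioned in this part can also be summarised as numeric features. The minimal sketch below assumes the network is given as a list of (virus, host) pairs and derives two simple quantities per virus: the number of known hosts, and the number of other viruses sharing at least one host. The association list, the function name `network_features`, and the feature names are hypothetical placeholders, not project data or an established method.

```python
from collections import defaultdict

# Hypothetical virus-host associations (placeholders, not real records).
associations = [
    ("virus_A", "host_1"),
    ("virus_A", "host_2"),
    ("virus_B", "host_2"),
    ("virus_B", "host_3"),
    ("virus_C", "host_3"),
]


def network_features(pairs):
    """Derive simple per-virus features from a bipartite virus-host network."""
    # Map each virus to the set of hosts it is linked to.
    hosts_of = defaultdict(set)
    for virus, host in pairs:
        hosts_of[virus].add(host)

    features = {}
    for virus, hosts in hosts_of.items():
        # Other viruses that share at least one host with this virus.
        neighbours = [
            other for other, other_hosts in hosts_of.items()
            if other != virus and hosts & other_hosts
        ]
        features[virus] = {
            "host_range": len(hosts),
            "shared_host_neighbours": len(neighbours),
        }
    return features


net_features = network_features(associations)
```

Features of this kind could, under these assumptions, be placed alongside sequence-derived and host-trait columns in a single table for a machine-learning model.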

Part 3 - Application to new and unstudied viruses

Finally, you will apply the tools you develop to predict transmission potential and mechanism of novel viruses, viruses without known host species, and viruses without known transmission routes.

The ultimate goal is to be able to predict how a virus can be transmitted from just a sequenced environmental sample – in effect, the earliest possible warning system for identifying potential new viruses of concern.


As part of the project you will gain training in data science, network science, bioinformatics, and machine learning, all of which are skills of increasing importance in combating emerging diseases. You will also gain experience in data-intensive research and will have opportunities to communicate your work to policy makers and the general public.


Applications should be made by emailing  with:

·        a CV (including contact details of at least two academic (or other relevant) referees);

·        a covering letter clearly stating your first-choice project and, optionally, your second-ranked project, as well as any additional information you feel is pertinent to your application; you may wish to indicate, for example, why you are particularly interested in the selected project(s) and in the selected University;

·        copies of your relevant undergraduate degree transcripts and certificates;

·        a copy of your IELTS or TOEFL English language certificate (where required);

·        a copy of your passport (photo page).

A GUIDE TO THE FORMAT REQUIRED FOR THE APPLICATION DOCUMENTS IS AVAILABLE AT Applications not meeting these criteria may be rejected.

In addition to the above items, please email a completed copy of the Additional Details Form (as a Word document) to . A blank copy of this form can be found at:

Informal enquiries may be made to 

The deadline for all applications is 12 noon on Monday 9th January 2023.





Open to students worldwide

Funding information

Funded studentship

Studentships are funded by the Biotechnology and Biological Sciences Research Council (BBSRC) for 4 years. Funding will cover tuition fees at the UK rate only, a Research Training and Support Grant (RTSG), and a stipend. We aim to support the most outstanding applicants from outside the UK and are able to offer a limited number of bursaries that will enable full studentships to be awarded to international applicants. Owing to the competitive nature of this scheme, these full studentships will be awarded only to candidates of exceptional quality.


