Overview
Lung cancer is the leading cause of cancer-related death worldwide. We aim to develop and validate blood-borne diagnostic biomarkers that enable early detection and save lives. In Liverpool, we have collected a unique cohort in which matched proteomic, miRNA, and metabolomic profiles were generated from pre-diagnostic plasma samples collected 1–10 years before diagnosis. This offers an exciting PhD opportunity.
About this opportunity
Background
In unpublished work, we modelled the imminent cancer cases in this cohort (diagnosis <3 years after sampling) using network-based supervised stratification (NBS2). This analysis identified connected sets of proteins in the epithelial-mesenchymal transition and inflammatory response pathways that help recognise cancer cases.
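To illustrate the general idea behind network-based methods, the sketch below spreads a signal over a small protein network using a random walk with restart. It is a generic propagation example, not the NBS2 algorithm itself. The function name `propagate`, the protein labels P1 to P5, the network edges and the restart probability are all hypothetical and chosen only for illustration.

```python
def propagate(adjacency, seed_scores, restart=0.5, n_iter=100, tol=1e-9):
    """Random walk with restart on an undirected network.

    adjacency: dict mapping node -> set of neighbours (assumed symmetric).
    seed_scores: dict mapping node -> initial signal, e.g. a case/control
    abundance difference (hypothetical values here).
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total == 0:
        raise ValueError("at least one node needs a non-zero seed score")
    # Normalise the seeds so that the scores sum to 1.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(n_iter):
        new_p = {}
        for n in nodes:
            # Each neighbour shares its score equally among its own neighbours.
            inflow = sum(p[m] / len(adjacency[m]) for m in adjacency[n])
            new_p[n] = (1 - restart) * inflow + restart * p0[n]
        delta = max(abs(new_p[n] - p[n]) for n in nodes)
        p = new_p
        if delta < tol:
            break
    return p


# Hypothetical toy network: a chain P1-P2-P3 and a separate pair P4-P5.
network = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2"},
    "P4": {"P5"},
    "P5": {"P4"},
}
scores = propagate(network, {"P1": 1.0})
for protein in sorted(scores, key=scores.get, reverse=True):
    print(protein, round(scores[protein], 3))
```

In this toy case, the signal seeded at P1 decays along the chain to P2 and P3 and never reaches the disconnected pair P4-P5, which is why propagated scores tend to highlight connected sets of proteins rather than isolated ones.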
Objectives
The objectives of this PhD study are threefold.
- Integrate multi-omics data into consistent frameworks.
- Identify biomarkers: apply NBS2 and time-to-event (TTE) analysis to infer networks underpinning both early oncogenesis and late-stage disease. Benchmark our method against other machine learning methods.
- Validate findings in independent UK Biobank Olink protein and metabolic data.
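As one illustration of how TTE models might be benchmarked against each other, the sketch below computes Harrell's concordance index (C-index) in plain Python. The choice of metric is an assumption, since the project description does not specify one. The function name `concordance_index`, the follow-up times, the event flags and the two sets of risk scores are hypothetical toy values.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored time-to-event data.

    times: follow-up time per sample (e.g. years from sampling).
    events: 1 if diagnosis was observed, 0 if the sample was censored.
    risk_scores: model output, where a higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when sample i has an observed event
            # strictly before sample j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: two diagnosed cases and two censored samples.
times = [1.5, 3.0, 6.0, 10.0]
events = [1, 1, 0, 0]
model_a = [0.9, 0.7, 0.4, 0.1]  # ranks earlier diagnoses as higher risk
model_b = [0.2, 0.8, 0.5, 0.9]  # ranks them mostly the wrong way round

print("model A:", concordance_index(times, events, model_a))
print("model B:", concordance_index(times, events, model_b))
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to random ranking, so the same function can be applied to the risk scores of each competing method on a held-out dataset.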
Novelty
The data are unique: molecular signals sampled 1–10 years before clinical diagnosis. This is also the first large-scale integration of proteomic, miRNA, and metabolomic data. The method is also novel: we may attempt pathway-based TTE modelling.
Timeliness
Blood-based miRNA has recently shown positive results in lung cancer screening. This studentship will also synergise with a CRUK-funded Primer Award.
Training
You will be trained in 1) multi-omics analysis approaches, 2) NBS2, and 3) TTE analysis.
Through the breadth and depth of these approaches, you will gain an impressive set of skills in mathematical modelling for early disease detection, skills that are in short supply in the pharmaceutical industry.
Enquiry
For any questions, please contact the primary supervisor, Dr Tao You.
Further reading
Johnson et al. (2026) Machine learning-based proteogenomic data modeling identifies circulating plasma biomarkers for early detection of lung cancer. Commun Med (Lond). doi: 10.1038/s43856-026-01500-1.
Davies et al. (2023) Plasma Protein Biomarkers for Early Prediction of Lung Cancer. EBioMedicine. 93:104686.
Zhang et al. (2018) Classifying Tumors by Supervised Network Propagation. Bioinformatics. 34(13):i484-i493. doi: 10.1093/bioinformatics/bty247.