PhD Studentships

We offer several PhD projects in Biostatistics. Below is a list of the projects that we currently offer.

Some projects are fully funded (for UK/EU students; non-UK/EU students will need to pay the excess in tuition fees), some are self-funded (the applicant is asked to pay all the fees).

Interested individuals are encouraged to contact the supervisors of individual projects to gain further information.

Furthermore, we invite potential applications for other projects that fit within our expertise. The Department of Biostatistics offers supervision to PhD students in a wide variety of research areas. Particular areas of expertise include clinical trials methodology research, evidence synthesis, health informatics, multivariate data modelling, joint longitudinal and event history modelling, statistical genetics and pharmacogenomics, prognostic modelling and causal analysis (see our website for our expertise).

  • Several self-funded studentships in Biostatistics


    Self-funded project: Complex treatment by covariate interactions in network meta-analysis.

    Funding: Self-funded

    Deadline: We are accepting applications and reviewing them as they arrive

    Primary supervisor: Dr Sarah Donegan (Department of Biostatistics, University of Liverpool).

    Secondary supervisor: Prof Catrin Tudur Smith (Department of Biostatistics, University of Liverpool).

    Project description: For most diseases, many treatments exist. Network meta-analysis (NMA) can estimate the relative effects of all treatment pairings even when treatments are not compared in the same trial. Therefore, NMA has huge potential because it is useful for all clinical fields.

    It is common to explore treatment by covariate interactions in meta-analyses. Interactions can be included in an NMA model to evaluate whether each treatment effect varies with a covariate (e.g. a patient or methodological characteristic, such as disease severity or allocation concealment).

    The benefits of including interactions in NMA can be substantial. The model can produce the relative effects of all treatment pairings for each covariate value. For example, including an interaction for disease severity (i.e. severe or non-severe) could give one set of relative effects for patients with severe disease, and another set for patients with non-severe disease. This allows different recommendations to be made for different patient groups; personalising treatment in this way can benefit patients. For example, for the treatment of epilepsy, sodium valproate is recommended for patients with generalized seizures whereas carbamazepine is advised for patients with partial seizures. Furthermore, when heterogeneity (i.e. variability across trials) or inconsistency (i.e. variability between direct and indirect evidence) is detected in the NMA without interactions, including interactions provides the opportunity to assess whether the covariate reduces this variability; this can help analysts understand how best to analyse data and draw valid clinical inferences.
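
    As an illustration only, one common way to write a fixed-effect NMA with a single treatment by covariate interaction is sketched below. The notation is an assumption made for this sketch, not the model the project will necessarily use: \theta_{jk} is the linear predictor (e.g. the log-odds) in arm k of trial j, \mu_j is the baseline-arm effect in trial j, b_j is the baseline treatment of trial j, d_k is the effect of treatment k relative to reference treatment 1 when the covariate is zero, \beta_k is the interaction coefficient for treatment k, and x_j is a trial-level covariate.

```latex
\theta_{jk} = \mu_j + \left(d_k - d_{b_j}\right) + \left(\beta_k - \beta_{b_j}\right) x_j,
\qquad d_1 = \beta_1 = 0

% Relative effect of treatment k versus treatment c at covariate value x:
d_{kc}(x) = \left(d_k - d_c\right) + \left(\beta_k - \beta_c\right) x
```

    With a binary covariate such as disease severity (x = 0 for non-severe, x = 1 for severe), the second expression gives one set of relative effects for each patient group, as described above.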

    Currently, research has focused on including interactions for a single covariate (e.g. disease severity), rather than numerous covariates simultaneously (e.g. disease severity and dose). However, it is unlikely that one covariate would cause all heterogeneity or inconsistency if it exists, but instead several covariates would contribute. Therefore, when the purpose of including interactions is to explain variability, it seems sensible to include multiple covariates simultaneously.

    Additionally, current NMA publications explore linear interactions, rather than non-linear interactions. A non-linear interaction is observed when the graph of treatment effect versus a covariate is not a straight line. For example, a review showed that the graph of relative risk of mortality versus BMI was a j-shaped curve. In such cases, if a linear interaction is fitted, the analyst may fail to detect that an interaction exists and this could lead to incorrect clinical guidance.
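
    The point about missed interactions can be illustrated with a small simulation. The sketch below uses made-up data, a deliberately symmetric U-shaped curve (an extreme case of a j-shape) and unweighted least squares rather than an NMA model; all variable names and numbers are hypothetical.

```python
import numpy as np

# Hypothetical covariate values (e.g. mean BMI per trial) and a made-up,
# U-shaped relationship between covariate and treatment effect.
rng = np.random.default_rng(0)
covariate = np.linspace(18.0, 40.0, 45)
true_effect = 0.01 * (covariate - 29.0) ** 2  # symmetric about 29
observed = true_effect + rng.normal(0.0, 0.05, size=covariate.size)

# Fit a straight line (linear interaction) and a quadratic
# (one simple form of non-linear interaction).
linear_coefs = np.polyfit(covariate, observed, deg=1)
quadratic_coefs = np.polyfit(covariate, observed, deg=2)


def residual_sum_of_squares(coefs):
    """Sum of squared differences between observed and fitted effects."""
    fitted = np.polyval(coefs, covariate)
    return float(np.sum((observed - fitted) ** 2))


linear_slope = linear_coefs[0]
rss_linear = residual_sum_of_squares(linear_coefs)
rss_quadratic = residual_sum_of_squares(quadratic_coefs)

print(f"linear slope:          {linear_slope:.4f}")
print(f"quadratic coefficient: {quadratic_coefs[0]:.4f}")
print(f"RSS linear / quadratic: {rss_linear:.2f} / {rss_quadratic:.2f}")
```

    In this constructed example the fitted linear slope is close to zero, so a linear-only analysis would suggest no interaction, while the quadratic fit recovers the curvature.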

    The overarching aim of the project is to develop methodology for including multiple, complex treatment by covariate interactions in NMA. The student could develop NMA models including interactions for several covariates and non-linear interactions; highlight the underlying modelling assumptions; develop methods to assess the assumptions; and demonstrate methods using real and/or simulated individual patient data and aggregate data.

    Any enquiries relating to the project and/or suitability should be directed to Dr Donegan (

    Person specification: The successful candidate is likely to hold a 1st or 2:1 degree in a relevant discipline (statistics or mathematics). A Master's degree in Statistics would be desirable. An understanding of statistical models is essential. Experience of coding in a statistical package (e.g. R, SAS, Stata, WinBUGS) is desirable.

    Training and support: The student will be supported to:

    • write and publish their research in journals;
    • present their research findings in person;
    • understand the research topic;
    • write statistical code to apply the models in WinBUGS and/or Stata;
    • meet and network with other researchers.


    Self-funded project: Identifying Copy Number Variants using Whole Exome Sequencing Data.

    Funding: Self-funded

    Deadline: We are accepting applications and reviewing them as they arrive

    Primary supervisor: Dr. Anna Auer-Fowler (Department of Biostatistics)

    Secondary supervisor: Prof Andrew Morris (Department of Biostatistics)

    Project description:  

    Copy Number Variants (CNVs) are a common form of genetic variation and are known to contribute to genetic diseases. Whole Exome Sequencing (WES) is a relatively cheap form of genetic sequencing which targets just the coding regions of the genome. WES is becoming increasingly popular, particularly in clinical applications where the causal genes are known.

    Identifying CNVs from WES is currently unreliable, so CNVs are often ignored in WES studies or detected by an alternative and potentially costly technology. Improving CNV calling from WES data therefore has the potential to reveal important CNVs and to reduce cost by removing the need for alternative technologies.

    The break points of CNVs generally lie outside the coding regions targeted in WES, so it is currently assumed that the signals associated with the break points will not be observed. However, WES generates a large number of off-target reads (40-60% of all reads), some of which will contain additional information for the identification of CNVs. These off-target reads have proved informative in other applications but are largely ignored in the field of CNV detection.

    The aim of this project is to improve CNV calling from WES data by incorporating multiple signals and using all reads generated. Additionally, these methodological improvements will be applied to large WES studies and therefore contribute to our understanding of the role of CNVs in complex human traits.

    Any enquiries relating to this project and/or suitability should be directed to Dr Fowler (

    Scientific objectives:

    1. Develop a statistical model for CNVs in WES data which integrates multiple signals from on- and off-target reads. Bayesian approaches are effective in incorporating prior information, such as sequence content, and in hierarchically linking multiple samples, so adopting a Bayesian framework will increase the robustness of the model. The 1000 Genomes Project data will act as a ‘gold standard’ data set for benchmarking and optimization.

    2. Implement efficient software for this model, allowing it to be applied to large numbers of samples.

    3. Apply it to: (i) 2,500 WES from the Estonian Biobank, for which detailed disease phenotypes and lifestyle data are available; and (ii) 52,000 WES from the T2DGENES Consortium to study the contribution of CNVs to T2D risk and related metabolic traits.

    Person specification:  The successful candidate is likely to hold a 1st or 2:1 degree in a relevant discipline (statistics or mathematics or computing or bioinformatics) preferably with a Masters degree. Experience of programming is essential (e.g. R, C++, Python).

    Training and support: The student will receive support from supervisors to enable them to understand their research, publish their work and attend scientific conferences. Further training in statistics and genetics will be provided through targeted courses run by the Department of Biostatistics and the Institute of Translational Medicine. Additionally, the University of Liverpool runs courses on broader subjects, such as scientific writing and computer programming skills, if required. Being embedded in the statistical genetics group will allow the student to benefit from the expertise of the group as a whole. The student will receive broader exposure to statistical genomics as part of the North of England Genetic Epidemiology Group (NEGEG), which offers younger researchers the opportunity to present their research regularly and to network with other students and postdoctoral researchers based at universities in the North of England.


    Self-funded project: Development and application of methodology for “polygenic risk” prediction in pharmacogenetic genome-wide association studies.

    Funding: Self-funded

    Deadline: We are accepting applications and reviewing them as they arrive

    Primary supervisor: Dr Andrea Jorgensen (Department of Biostatistics)

    Secondary supervisors: Prof Andrew Morris (Department of Biostatistics), Prof Sir Munir Pirmohamed (Department of Molecular and Clinical Pharmacology)

    Project description:  The proposal is focused around development of “polygenic risk scores” for clinical outcomes in pharmacogenetic association studies.  These approaches have been successfully applied in genome-wide association studies (GWAS) of complex human traits, but have predominantly focused on binary outcomes (presence/absence of disease) and quantitative measures (such as anthropometrics and lipid profiles).  However, in pharmacogenetic studies, the outcome of interest is often more complex, such as categorical “sub-phenotypes” (e.g. severity of adverse drug reaction) and “time to event” data (e.g. survival time after clinical intervention). 

    For rare clinical outcomes, such as severe drug-induced hypersensitivity, we also expect a major contribution of rare genetic variants of large effect, which are not widely incorporated in polygenic risk scores for complex traits.

    The primary aim of this project is to develop and apply methodology for polygenic risk scores for complex pharmacogenetic outcomes (including categorical and time to event data).
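
    For readers unfamiliar with the term, a polygenic risk score is commonly computed as a weighted sum of an individual's risk-allele counts. The sketch below shows that basic form only, under stated assumptions: the function name, genotype matrix and weights are hypothetical, and in practice the weights would come from an association analysis. It does not address the categorical or time to event outcomes that this project targets.

```python
import numpy as np


def polygenic_risk_score(genotypes, weights):
    """Weighted sum of allele counts for each individual.

    genotypes: (n_individuals, n_variants) array of allele counts (0, 1 or 2).
    weights:   (n_variants,) array of per-variant effect sizes.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if genotypes.ndim != 2 or weights.ndim != 1:
        raise ValueError("expected a 2-D genotype matrix and 1-D weights")
    if genotypes.shape[1] != weights.shape[0]:
        raise ValueError("one weight is required per variant")
    return genotypes @ weights


# Hypothetical data: three individuals, three variants.
example_genotypes = np.array([
    [0, 1, 2],
    [2, 0, 1],
    [1, 1, 0],
])
example_weights = np.array([0.10, -0.05, 0.20])  # made-up effect sizes

scores = polygenic_risk_score(example_genotypes, example_weights)
print(scores)
```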

    Scientific objectives: To achieve this aim, the primary objectives of this project are: (i) to adapt methodology previously proposed for polygenic risk scores in the context of binary and quantitative outcomes to complex clinical outcomes in pharmacogenetic studies (including categorical and time to event data); (ii) to develop novel methodology to build polygenic risk scores on the basis of gene-based analyses of rare variants; (iii) to evaluate the utility of incorporating prior biological information on genes/variants associated with related outcomes into polygenic risk scores; and (iv) to apply these approaches to a pharmacogenetic GWAS undertaken at the University of Liverpool.

    Person specification: The successful candidate is likely to hold a 1st or 2:1 degree in a relevant discipline (statistics or mathematics). A Master's degree in Statistics would be desirable. Some experience of working with genetic data would be desirable but not essential. Experience of coding in a statistical package (e.g. R, SAS, Stata, WinBUGS) is desirable.

    Training and support: The supervisors will provide continuous support to the student, lending their expertise and extensive experience of developing and applying statistical methodology to the analysis of genetic data. The student will also receive training on analysing genetic datasets by attending training workshops (one at the University of Liverpool and one at the Sanger Institute, Cambridge), and on working with rare variants by attending a training course on this topic (University of Liverpool). The student will also have the opportunity to attend training workshops on working with time-to-event data, if required. More general training, such as training on scientific writing and on presentation skills, will also be provided through the University of Liverpool’s Doctoral College.

    Links: and Statistical and Pharmacogenetics Research Group (

    Self-funded project: Personalised medicine and stochastic control methods to improve dose estimation.

    Primary supervisor: Dr Steven Lane

    Secondary supervisor: Prof W Hope (University of Liverpool)

    Funding: Self-funded

    Project description: Dosing for most medications is usually based on a standard dose, implying that a single dose range is suitable for all patients. However, it is becoming more widely accepted that individualised doses are both more beneficial to the patient and more cost effective for the health service. An example is warfarin, a commonly prescribed anticoagulant, which has been demonstrated to have high inter-individual variation, with doses ranging from 1 mg up to 20 mg. Consequently, to understand the workings of a medication, it is important to understand the absorption, distribution and elimination of the drug to, in and from the body.

    Pharmacokinetics is the study of how the body acts on a substance, and pharmacodynamics is the study of the effect the substance has on the body. Including this information in dose estimation algorithms will allow more robust dose estimates.

    Control theory is used in engineering to control dynamic processes. A sub-branch of this is stochastic control, which also incorporates the uncertainty in the process measurements into the control strategy. The aim of these methods is to control the system to meet some pre-specified target value. For example, in drug dosing the target could be to keep the drug concentration within the bloodstream above a therapeutic concentration. Stochastic control also allows the control estimates to be updated when new information becomes available. This could be when a new measure of the plasma concentration becomes available or when information not available at the initial concentration, such as genotype, becomes available.

    Scientific objectives: The aim of the proposed PhD is to incorporate information from pharmacokinetic/pharmacodynamic models, along with data on demographic and clinical factors, into stochastic control models, using the Kalman filter approach. The aim will be to produce robust adaptive algorithms that can be used to personalise the medical treatment of individuals, in particular paediatric patients. As part of the PhD, it is hoped the student will be able to take advantage of previous collaborations with the Laboratory of Applied Pharmacokinetics at USC.
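
    To give a flavour of the approach, the sketch below applies a scalar Kalman filter to a toy one-compartment model in which the plasma concentration decays by a fixed fraction per time step and rises in proportion to the dose. Everything here is an assumption made for illustration: the class name, the parameter values and the simple "aim the predicted mean at the target" dosing rule, which ignores uncertainty when choosing the dose and is therefore much cruder than a full stochastic control scheme.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalmanDoser:
    retain: float        # fraction of drug remaining after one time step (hypothetical)
    dose_gain: float     # concentration rise per unit dose (hypothetical)
    process_var: float   # variance of unmodelled changes in concentration
    measure_var: float   # variance of the assay measurement error
    mean: float          # current estimate of plasma concentration
    var: float           # uncertainty (variance) of that estimate

    def predict(self, dose):
        """Propagate the estimate one time step forward given a dose."""
        self.mean = self.retain * self.mean + self.dose_gain * dose
        self.var = self.retain ** 2 * self.var + self.process_var

    def update(self, measurement):
        """Combine the prediction with a new measured concentration."""
        gain = self.var / (self.var + self.measure_var)
        self.mean += gain * (measurement - self.mean)
        self.var *= 1.0 - gain

    def dose_for_target(self, target):
        """Dose that puts the next predicted mean on target (never negative)."""
        dose = (target - self.retain * self.mean) / self.dose_gain
        return max(dose, 0.0)


# Made-up numbers: start with a vague estimate, give a dose, then measure.
kf = ScalarKalmanDoser(retain=0.8, dose_gain=0.5, process_var=0.01,
                       measure_var=0.04, mean=0.0, var=1.0)
kf.predict(dose=4.0)
predicted_mean, predicted_var = kf.mean, kf.var
kf.update(measurement=2.6)
posterior_mean, posterior_var = kf.mean, kf.var

# Choose the next dose using the updated estimate and a target of 3.0.
next_dose = kf.dose_for_target(target=3.0)
kf.predict(dose=next_dose)
print(f"updated estimate {posterior_mean:.3f}, next dose {next_dose:.3f}")
```

    The update step is where new information, such as a fresh plasma concentration, revises the estimate; the revised estimate then changes the recommended dose.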

    Person specification: During the PhD the student will learn pharmacokinetic and pharmacodynamic modelling using either NONMEM or Pmetrics software, and the development and modelling of stochastic control algorithms.

    Following the PhD the student will be familiar with pharmacokinetic/pharmacodynamic modelling, control theory (specifically stochastic control and Kalman filters) and the challenges of personalised medicine, particularly in the paediatric population. The student will also be supported to publish research papers and make conference presentations.