Augmenting predictions and real-time classification of atmospheric composition using machine learning


  • Supervisors: Dr David Topping
    Dr James Allan
    Dr Rami Alfarra
    Prof Hugh Coe
  • External Supervisors: Prof Ilona Riipinen (Stockholm University), Dr Jacqui Hamilton (University of York)

  • Contact:

    David Topping [david.topping@manchester.ac.uk]

  • CASE Partner: Waters Mass Spec Ltd

Application deadline: 30 May 2018

Introduction:

Air pollution and climate change are two key socio-environmental drivers that represent some of the biggest multidisciplinary challenges in science, society and the economy today. The need to understand the chemical and physical processes in the atmosphere that dictate the impacts of both has led to the creation of a wide range of experimental platforms over the past two decades. However, whilst these facilities continue to identify and hypothesise new processes and compounds deemed important to improving our understanding of change, the research community is now struggling to use the data and subsequent information in a truly meaningful way.

Variations in the components that make up the atmosphere are key to defining both air quality and contributions to climate change. Whilst around 10,000 different compounds have been measured, theoretical chemical mechanisms predict the occurrence of many millions. Models predict how the concentrations of individual compounds evolve, but it is impossible to measure every compound. Therefore, new analytical instruments have been designed that derive a chemical signature of gaseous mixtures and particulate matter, such as complex mass spectra. Not only is this information not directly comparable with the aforementioned model outputs, but extracting the key species contributing to any chemical signature relies on techniques with no real test of fidelity. Atmospheric science is reaching a crossroads. Addressing climatic and health impacts requires better knowledge of atmospheric composition and properties yet, sooner or later, we must decide what to do with the complexity of both.

This forms the crux of this project, where you will use emerging machine learning (ML) methods, and data from an industrial partner, to create new software platforms for predicting mass spectra and for rapidly converting measured mass spectra into distinct composition categories for real-time classification. Not only will this benefit wider atmospheric research, it will also deliver a product that benefits our industrial partner across a range of disciplines.

Project Summary:

ML methods are broadly split into ‘supervised’ and ‘unsupervised’. Unsupervised methods find hidden patterns or intrinsic structures in data (e.g. cluster analysis). Supervised methods build a model that makes predictions based on a known set of input data and known responses to that data (output). These methods include classification algorithms, which predict a discrete response (such as object detection or, in our case, distinct source contributions), and regression algorithms, which predict continuous responses based on other input data (such as mass spectra predictions based on chemical composition).
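
As a purely illustrative sketch of the three ideas above (clustering, classification and regression), the short Python example below uses only the standard library. Everything in it is an assumption made for illustration: the two source categories, the four-channel ‘spectra’, the noise levels and all function names are hypothetical, and simple nearest-centroid, k-means and least-squares methods stand in for whatever algorithms the project may actually adopt.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical reference "spectra": relative intensity in four m/z channels
# for two made-up source categories. The numbers are illustrative only.
TEMPLATES = {
    "traffic": [0.6, 0.2, 0.1, 0.1],
    "biomass": [0.1, 0.2, 0.3, 0.4],
}

def noisy(template, scale=0.03):
    """Return a copy of a template spectrum with small Gaussian noise added."""
    return [max(0.0, x + rng.gauss(0.0, scale)) for x in template]

def distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(spectra):
    """Channel-wise mean of a list of spectra."""
    return [mean(channel) for channel in zip(*spectra)]

# Synthetic labelled data: 20 noisy spectra per category.
train = [(noisy(t), label) for label, t in TEMPLATES.items() for _ in range(20)]

# --- Supervised classification: nearest-centroid classifier ---
def fit_centroids(samples):
    """Learn one mean spectrum per label from labelled samples."""
    by_label = {}
    for spectrum, label in samples:
        by_label.setdefault(label, []).append(spectrum)
    return {label: centroid(group) for label, group in by_label.items()}

def classify(spectrum, centroids):
    """Assign the label whose mean spectrum is closest."""
    return min(centroids, key=lambda label: distance(spectrum, centroids[label]))

centroids = fit_centroids(train)
predicted_label = classify(noisy(TEMPLATES["biomass"]), centroids)

# --- Unsupervised clustering: a tiny k-means that ignores the labels ---
def kmeans(spectra, centres, iterations=10):
    """Alternate between assigning spectra to centres and updating centres."""
    for _ in range(iterations):
        groups = [[] for _ in centres]
        for s in spectra:
            nearest = min(range(len(centres)), key=lambda i: distance(s, centres[i]))
            groups[nearest].append(s)
        centres = [centroid(g) if g else c for g, c in zip(groups, centres)]
    return centres

unlabelled = [spectrum for spectrum, _ in train]
# Seeded with the first and last samples purely for simplicity.
centres = kmeans(unlabelled, [unlabelled[0], unlabelled[-1]])

# --- Supervised regression: one-feature ordinary least squares ---
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Continuous response: channel-0 intensity of a mixture as a function of the
# (hypothetical) fraction of "traffic" in a traffic/biomass mixture.
fractions = [i / 20 for i in range(21)]
channel0 = [f * 0.6 + (1 - f) * 0.1 + rng.gauss(0.0, 0.01) for f in fractions]
slope, intercept = fit_line(fractions, channel0)

print("classified as:", predicted_label)
print("cluster centres:", [[round(x, 2) for x in c] for c in centres])
print("fitted line: intensity = %.2f * fraction + %.2f" % (slope, intercept))
```

In this toy setting the classifier and the regression both learn from known responses (labels and channel intensities respectively), whereas the clustering step recovers the two groups from the spectra alone. Real mass spectra have far more channels and overlapping sources, so this should be read as a sketch of the concepts rather than of the project's methodology.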

In this project you will draw on the methods now available in open-source packages, using data from our partner Waters Mass Spec, to build new classifiers and regressors that could deliver novel insights into measured and predicted composition. Your work will follow on from our recently published proof-of-concept work (Topping et al., 2016, 2017).

Who are we looking for and why should they apply?

We are looking for someone who is enthusiastic about building software. We do not expect you to start with all the required skills. A PhD is also a chance to build on your undergraduate training, and we will ensure you have the opportunity to do that in order to meet the project goals. Your project partners and co-supervisors are all world leaders in their fields, giving you a rich environment across multiple disciplines, including experience of working with industry through a placement with our project partner Waters Mass Spec. You will have the opportunity to travel and present your work on the international stage at both EU and US conferences.

Software development and broader machine learning skills are becoming highly attractive to employers. In this project you will have the chance to develop your own software, working with key partners in the fields of climate, air quality and data science. Through this process you will develop both ‘hard’ and ‘soft’ skills, with a demonstrable ability to adhere to best practices in software development and community engagement. Through a joint partnership between Manchester University and the Alan Turing Institute of Data Science in London, you will have the opportunity to spend time liaising with world-class data scientists and engineers throughout your PhD. This will supplement the expertise now offered by our local Data Science Institute.

References:

Topping, D., Barley, M., Bane, M. K., Higham, N., Aumont, B., Dingle, N., and McFiggans, G.: UManSysProp v1.0: an online and open-source facility for molecular property prediction and atmospheric aerosol calculations, Geosci. Model Dev., 9, 899-914, https://doi.org/10.5194/gmd-9-899-2016, 2016.

Topping, D. O., Allan, J., Alfarra, M. R., and Aumont, B.: STRAPS v1.0: evaluating a methodology for predicting electron impact ionisation mass spectra for the aerosol mass spectrometer, Geosci. Model Dev., 10, 2365-2377, https://doi.org/10.5194/gmd-10-2365-2017, 2017.
