Becoming an Expert: Oisín Boyle on extracting important information from noisy spectra
Oisín Boyle is a 1st Year PhD student in the University’s Department of Electrical Engineering and Electronics and part of the CDT in Distributed Algorithms. His PhD is on ‘Extracting Important Information from Noisy Spectra’:
What is a spectrum?
Within the context of my PhD, a spectrum is a measurement of how much light there is at each wavelength across a range of wavelengths. A good way to think about it is to imagine a rainbow, where the light is split into colours from red to violet. Red light has a long wavelength and violet light has a short wavelength. If there is a lot of long-wavelength light, it will appear very red.
My PhD aims to develop and implement complex, state-of-the-art parallel algorithms that can analyse spectra and return the composition of the gas producing them.
However, this is a highly complex task, as there are over 700 different atoms, ions or molecular species that could be producing the light. Moreover, several species can relate to a single data point in the measured spectrum, so one measured peak cannot indicate with total confidence that a certain element is present.
Consequently, a method that searches over the combinations of all the peaks is required. This method could be derived from popular machine learning techniques (such as artificial neural networks), Monte Carlo simulations, or Similarity Search systems.
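To see why single peaks are ambiguous and why all peaks must be considered together, here is a minimal sketch of a simple line-matching score. It is not one of the methods named above (neural networks, Monte Carlo or similarity search), only an illustration. The species names `A`, `B`, `C`, their wavelengths and the `tolerance` value are invented placeholders, not real spectral data.

```python
def score_species(measured_peaks, reference_lines, tolerance):
    """Score each species by the fraction of its reference lines
    that have a measured peak within `tolerance` of them."""
    scores = {}
    for species, lines in reference_lines.items():
        matched = sum(
            1
            for line in lines
            if any(abs(line - peak) <= tolerance for peak in measured_peaks)
        )
        scores[species] = matched / len(lines)
    return scores


# Hypothetical reference table: species A and B share a line near 400.
reference_lines = {
    "A": [400.0, 650.0],
    "B": [400.2, 520.0],
    "C": [700.0],
}
measured_peaks = [400.1, 650.1]

scores = score_species(measured_peaks, reference_lines, tolerance=0.3)
print(scores)  # {'A': 1.0, 'B': 0.5, 'C': 0.0}
```

The peak near 400 matches both A and B, so on its own it proves nothing. Only when the second peak is taken into account does A stand out as the better explanation.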
The data sets I am developing analysis algorithms for are very large, with over 5000 measurements produced per second. Moreover, the spectral signal can become corrupted. Weaker parts of the signal get lost in ‘noise’ caused by unwanted electrical signals disrupting the data. If the signal is too strong, it hits a maximum value and loses its correct shape.
The signal has a charge time, which can be altered to change the signal strength. However, a ‘Goldilocks’ signal strength, neither too weak nor too strong, does not exist all the time, so the signal will inevitably become corrupted.
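The trade-off can be illustrated with a deliberately simplified model. It assumes that the recorded counts grow in proportion to the charge time, that anything at or below a fixed `noise_floor` is unusable, and that anything at or above `saturation_level` is clipped. All names and numbers here are hypothetical and chosen only to show the effect.

```python
def classify_readings(true_rates, charge_time, noise_floor, saturation_level):
    """Label each wavelength bin for one charge time under a linear model."""
    labels = []
    for rate in true_rates:
        counts = rate * charge_time  # assumed linear detector response
        if counts >= saturation_level:
            labels.append("saturated")
        elif counts <= noise_floor:
            labels.append("lost in noise")
        else:
            labels.append("usable")
    return labels


# One weak, one medium and one strong peak (illustrative values).
true_rates = [0.5, 50.0, 400.0]

short = classify_readings(true_rates, charge_time=1, noise_floor=5, saturation_level=1000)
long = classify_readings(true_rates, charge_time=20, noise_floor=5, saturation_level=1000)

print(short)  # ['lost in noise', 'usable', 'usable']
print(long)   # ['usable', 'saturated', 'saturated']
```

In this toy example no single charge time captures all three peaks: the short one loses the weak peak and the long one clips the stronger ones.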
The other issue is the rate at which the data is obtained. The complex algorithms used to analyse multiple spectra must be quick enough to deal with the real-time data stream. Therefore, we require the algorithms to be as efficient as possible. Of particular interest to me are parallel computing methods (methods that use multiple processors to increase computing power), which can be used to speed up the algorithms.
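As a rough sketch of the pattern, the example below hands independent spectra to a pool of workers using Python's standard `concurrent.futures` module. The per-spectrum task `strongest_peak` is a trivial stand-in invented for illustration, not a real analysis algorithm. Threads are used only to keep the example short; for CPU-heavy work in CPython, a process pool or native parallel code would be needed to gain real speed.

```python
from concurrent.futures import Executor, ThreadPoolExecutor


def strongest_peak(spectrum):
    """Return (bin_index, value) of the largest reading in one spectrum."""
    index = max(range(len(spectrum)), key=spectrum.__getitem__)
    return index, spectrum[index]


def analyse_batch(spectra, executor: Executor):
    """Apply strongest_peak to every spectrum, preserving input order."""
    return list(executor.map(strongest_peak, spectra))


# Three tiny made-up spectra, each analysed independently of the others.
spectra = [[0, 3, 1], [5, 2, 4], [1, 1, 9]]

with ThreadPoolExecutor(max_workers=3) as pool:
    peaks = analyse_batch(spectra, pool)

print(peaks)  # [(1, 3), (0, 5), (2, 9)]
```

Because each spectrum is processed without reference to the others, the work divides cleanly across workers, which is what makes this kind of task a natural fit for parallel computing.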
Life as a Doctoral student
The Distributed Algorithms CDT works with a combination of high-performance computing and data science. Both of these are becoming more popular and important in the modern world. As computing power increases and the number of processors in a computer grows, algorithms that only work on a single processor will become inefficient and outdated.
Because of this, we are partnered with the Hartree Centre, a member of the Science and Technology Facilities Council, which works with UK industry through high-performance computing, data analytics and artificial intelligence technologies. Therefore, the CDT is producing students ahead of the curve, ready to be future leaders.
Working within the CDT in Distributed Algorithms has been great. It is very helpful and reassuring not to be undertaking a PhD alone, as there are other PhD students in the CDT who work in similar fields. We also have meetings as part of the University of Liverpool Signal Processing Research Group, where we can share ideas or issues we have with postdocs and professors.
Industrial application of the PhD
As part of my PhD, I am working in partnership with Gencoa, a developer and manufacturer of equipment for thin-film material coating, who are based in Liverpool.
Thin-film coatings are found all around us and are essential for the operation of a diverse range of technology - from the processors and screens of mobile phones to transparent anti-viral coatings (which are very important in the COVID-19 era). The company sells gas-sensing instrumentation that uses spectra to analyse the coating process.
During November 2020, I developed an algorithm that produces spectral signals that are more robust to noise. It does this by using signals from multiple charging times to artificially create a spectrum with less noise.
It’s also able to extrapolate the peaks that become corrupted by hitting the maximum value. Consequently, the algorithm artificially creates the Goldilocks signal strength, even when it possibly never existed, whilst maintaining the correct spectral shape.
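The general idea of combining several charging times can be illustrated with a minimal sketch. This is not the algorithm implemented at Gencoa. It assumes a detector whose counts grow linearly with charge time until they clip at a hypothetical `saturation_level`, and it simply discards clipped readings rather than extrapolating them. The function name and all numbers are invented for illustration.

```python
def merge_exposures(exposures, saturation_level):
    """Combine readings of one spectrum taken at different charge times.

    exposures: list of (charge_time, counts) pairs, where counts holds one
    detector reading per wavelength bin.
    Returns the estimated intensity per unit charge time for each bin,
    or None for a bin that is clipped in every exposure.
    """
    n_bins = len(exposures[0][1])
    merged = []
    for i in range(n_bins):
        total_counts = 0.0
        total_time = 0.0
        for charge_time, counts in exposures:
            if counts[i] < saturation_level:  # ignore clipped readings
                total_counts += counts[i]
                total_time += charge_time
        merged.append(total_counts / total_time if total_time > 0 else None)
    return merged


# The same spectrum recorded with charge times of 1 and 10 (made-up units).
# The strongest peak clips at 1000 in the longer exposure.
exposures = [
    (1, [1, 50, 400]),
    (10, [10, 500, 1000]),
]

merged = merge_exposures(exposures, saturation_level=1000)
print(merged)  # [1.0, 50.0, 400.0]
```

Pooling the unclipped counts from every charge time lets the long exposure contribute to the weak bins, where averaging helps against noise, while the short exposure supplies the strong peak that the long exposure clipped.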
This method is being implemented at Gencoa and was marketed at the Society of Vacuum Coaters as a High-Dynamic Range method, due to its similarity to HDR imaging.