Interview: What does a Future Leader of Distributed Algorithms look like? The CDT Director shares his thoughts.

Simon is the Director of the EPSRC Centre for Doctoral Training in Distributed Algorithms (CDT) at the University of Liverpool. The CDT plans to recruit up to 60 PhD students and train them to be future leaders of the next generation of Data Science. Read our interview with Simon to discover the type of ‘unicorn’ he is looking for...

Distributed Algorithms: the future of Data Science

The CDT’s vision is that our graduates will become future leaders in Distributed Algorithms, meeting the needs of government, academia and industry via an innovative Data Science, AI and Machine Learning training programme in a supportive cohort environment. Our PhD students work with academics and our project partners to produce novel solutions to tough data science challenges.

In order to achieve the vision, every PhD studentship is aligned with two academic supervisors who support the student’s objective to develop novel algorithms that can be deployed across future computing architectures and a project partner to ensure that the research developed is useful and used.

Read our in-house interview with the CDT Director to discover more about the CDT’s vision for unicorns…

Three key questions to get under the bonnet of a Distributed Algorithms PhD

What does Distributed Algorithms mean to you and why do we need to change how we train data scientists?

In my experience and understanding from my network of partners, data scientists and High-Performance Computing (HPC) engineers can be recruited into industry: it’s hard but it can be done. These types of people are not common, but they are, to some extent, a commodity. It’s near impossible to find people who understand both data science and high-performance computing.

Fundamentally, data scientists do not focus heavily on fully exploiting future computing capabilities, and HPC engineers don’t often have oversight of, or involvement in, how data scientists exploit computational resources to do their job. In order to turbo-charge the potential of data science, we need scientists who can develop novel algorithms and understand how to take advantage of future computing architectures to ultimately make better decisions, faster.

The CDT and our partner community want to train people who understand both. These future graduates will be highly skilled, very valuable and will fly out of their training and do things that they see as hard and others may believe to be impossible. People assume hard things are impossible, but they’re not: they can be turned into reality. This type of knowledge and expertise will be a real game changer. If we’re going to generate 60 ‘future leaders’, they can’t be like the leaders of today, because they’d just sit alongside people already doing it. My future sees a Leader of ‘Distributed Algorithms’ as a Chief Technology Officer of the next generation tech company…

The CDT arose from an EPSRC CDT call in which two topics resonated with our plan: “Future Computing Systems” to move “Towards a Data-Driven Future”. The CDT needed a word or phrase to describe these two terms. Since the CDT is designed to train people at the intersection of these two topics, the phrase ‘Distributed Algorithms’ was coined.

How will our PhD students evolve into future leaders? What type of individuals are we looking for? Do we have current examples that demonstrate the expected trajectory?

Progression can come down to luck or chance, or it can be engineered. We need to foster the ‘Andrew Ngs’ of 20 years’ time. These individuals need a route from PhD to leader. They’ll need to be able to move quickly through the ranks of future tech companies, and the rate of progression is the crucial enabler for reaching higher positions: to get further, you need to move faster. Credentials and experience are essential, so we’ll provide a runway with a trajectory that will enable publications, patents and other PhD-generated artefacts.

Our CDT provides PhD students with the theory and technical experiences that arm them with the understanding and tools needed to excel. We will train our students in the softer skills required to work with industry and government through direct industrial supervision, placements and an entrepreneurial training programme which provides advice on, and practice in, the skills that industry values.

Another factor that influences our students’ development is the general undertaking of a research project, i.e. the PhD. It’s trial and error, and the work that our students do contributes towards a greater outcome. For example, we work as a team, implement theory and generate new ideas. All of this supports the development of our technical skills. It doesn’t matter if we get things wrong. If we’re working on a mini-project with the University’s Library Systems, for example, and our data and analysis tell us that there is only one type of student accessing the Library services (which we know is clearly wrong), then we’ve tried and can have another go! It really doesn’t matter too much in a training environment. Whereas if you were undertaking the same research as a consultant, a week’s worth of work might cost thousands and you would be in trouble if you didn’t achieve the expected results for your client. The research environment facilitates training opportunities that ultimately give our students the confidence and understanding needed to tackle a variety of tough data challenges.

For example, one of our students is working on the EPSRC Big Hypotheses Research Programme. We’ve figured out a configuration for a pre-existing technique that is potentially game changing in terms of the ability to use big computers to tackle certain problems, and that can routinely give speed-ups of hundreds of thousands. This turns problems that people perceive to be impossible into those that are possible, and things run in the time it takes to make a cup of tea! The issue we’re facing is how to capitalise on this ourselves. Commercialisation of research is challenging and we’re currently considering our best strategy.

The CDT students are engaging with the wider research team, contributing to active research programmes and watching it happen! It doesn’t matter to the student if a particular project falls flat on its face. They will have witnessed it all, and will be right in the melee of things, experiencing it all first-hand. There is no downside to undertaking a PhD with our CDT. The visibility of actually doing what they’re being trained to do is a fabulous opportunity and too good an opportunity to pass up: it’s gold dust!

It is predicted that these ‘unicorns’, these highly skilled data scientists, will be in growing demand* in the next few years. What do our partners say they need now, and what will make the CDT graduates stand out from the rest?

It is not impossible to recruit people with skills, but it is near impossible to find people with the Distributed Algorithms mix of skills. The CDT was fortunate to have numerous industrial companies who wanted one or more of the Distributed Algorithms graduates before our proposal was successful. They were eager to work with us: these unicorns are desperately needed in government and industry and are in growing demand.

Ultimately, the type of unicorn we’re looking for will end up with the ability to develop innovative algorithms and confidently implement them on high-performance computers that will exist 20 years after our students graduate. The number of processors used to train neural networks is doubling every four months. At that rate, in ten years HPC systems will be around a billion times bigger than they are now. If this projection continues, we’ll have the people that can capitalise on these computational resources, and the winners of the race will be our students.
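
As a rough check on a projection of this kind, the compounding can be worked through directly. The sketch below assumes a constant doubling period over the whole ten years, which is a simplification, and `growth_factor` is a hypothetical helper name used only for illustration.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Growth multiple after `years` if capacity doubles every `doubling_months` months.

    Assumes a constant doubling period (an illustrative simplification).
    """
    doublings = years * 12 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Compare a few assumed doubling periods over a ten-year horizon.
    for months in (3, 4, 6):
        print(f"doubling every {months} months -> x{growth_factor(10, months):.3g}")
```

Under that assumption, doubling every four months for ten years is 30 doublings, or roughly a billion-fold increase. A trillion-fold increase would need about 40 doublings, which corresponds to a doubling period of three months.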

I foresee that any graduates from our CDT will be in high demand and I envision that partners will be scrabbling to grab these people and will want to continue to recruit. On one level our students are undertaking a four-year interview, and ultimately their PhD will contribute towards both the partner and student making informed decisions related to employment immediately after the PhD.

All our graduates will have had in-depth training and will have worked on real-life industry problems, with the right software and hardware to complement their training.

We will make a massive difference to the UK economy, well-being and the world, and I’m passionate about the difference the students will make. Hopefully this will appeal to the future leaders of tomorrow…

If we can generate these unicorns, they will fly!

Developing Distributed Machine Learning Algorithms

As you can ascertain from our interview with Simon, our PhDs are focused on developing distributed machine learning algorithms and applications in the fields of GPUs, FPGAs and human-machine teaming concepts, providing opportunities to test solutions with existing systems and platforms across industry.

The trajectory of our current students is that of a future leader, and our vision is to see our current PhD students developing as highly-skilled data scientists through the training they will receive from our CDT community.

Presently, we have two cohorts established with 17 students, with recruitment for Cohort 3 currently in the first round. Please visit our Projects page for an overview of the types of projects our students are working on alongside their supervisors and partners.

Want to become part of our CDT community?

If you’re looking to become part of the CDT community (as a PhD student, partner or academic supervisor), email kelli.cassidy@liverpool.ac.uk

“Those who can imagine anything, can create the impossible.” Alan Turing

*Source: Future of Jobs Report, World Economic Forum, October 2020

Author: Sara Parker, Centre Support Officer, Distributed Algorithms CDT.