Making health research faster and easier with smarter tools – a plain language summary

Across the UK, healthcare providers use codes to record conditions, medications, tests, and treatments in electronic health records (EHRs).

These codes act as standardised labels that replace written words, ensuring that everyone (across different hospitals, GP practices, and IT systems) records health information in the same way. One example is SNOMED CT (Systematized Nomenclature of Medicine Clinical Terms), in which the code “195967001” means asthma.

Codelists are combinations of codes that can be used to get more information about a person's condition. For example, a codelist for asthma may include the code above, a code for an inhaler prescription, a code for a steroid prescription, and a code for shortness of breath. Taken together, this linked information can show that a person has asthma and that it is getting worse, so they have needed stronger medicines to be prescribed.
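For readers who work with data, a codelist can be pictured as a simple lookup of codes that is checked against a person's record. The short Python sketch below is only an illustration: apart from the asthma code quoted above, every code, name, and record in it is a made-up placeholder, not a real SNOMED CT code or a DynAIRx codelist.

```python
# Illustrative sketch only: a codelist as a mapping from code to label.
ASTHMA_CODE = "195967001"  # SNOMED CT code for asthma, quoted in the text

# Placeholder codes (NOT real SNOMED CT codes) stand in for the related entries
asthma_codelist = {
    ASTHMA_CODE: "Asthma",
    "PLACEHOLDER-INHALER": "Inhaler prescription",
    "PLACEHOLDER-STEROID": "Steroid prescription",
    "PLACEHOLDER-BREATHLESS": "Shortness of breath",
}


def matching_entries(record_codes, codelist):
    """Return the codelist labels found in a person's record, in record order."""
    return [codelist[code] for code in record_codes if code in codelist]


# A made-up record: codes entered for one person over time
example_record = ["195967001", "PLACEHOLDER-OTHER", "PLACEHOLDER-STEROID"]

print(matching_entries(example_record, asthma_codelist))
```

Codes in the record that are not in the codelist are simply ignored, which is why a well-built codelist matters: a missing code means missed information.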

In the past, putting codes together into codelists has been very time-consuming: a group of healthcare professionals would each contribute to a first draft of a codelist and then keep discussing it until they all agreed. This process takes months of clinician time.

Codes also change all the time, so codelists need to change to keep up. Holding regular meetings with busy healthcare providers to keep updating codelists would be inefficient.

The team behind DynAIRx, a project focused on improving care for people with multiple long-term conditions, developed a new way to speed things up. Their method uses trusted sources and a computer tool to build codelists quickly, with less effort from clinical experts.

They tested this approach by building 214 codelists for DynAIRx, covering around 14,000 codes. Normally, this would take months of expert time. With the new method, it took just 7–9 hours.

The method works by:

1. Starting with existing lists from reliable sources
2. Cleaning and organising the condition names
3. Asking experts to decide which conditions to keep, split, or group
4. Automatically finding matching codes
5. Using trusted sources to check and shrink the lists
6. Letting experts do a final review
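
For readers who are comfortable with code, the six steps above can be sketched as a toy Python example. This is not the DynAIRx tool: the condition names, codes, trusted source, and function names are all hypothetical, and the matching rule (a plain text search) is an assumption chosen to keep the sketch short.

```python
# Toy sketch of the six steps; all data and function names are hypothetical.

def clean_name(name):
    """Step 2: tidy a condition name (collapse spaces, use lower case)."""
    return " ".join(name.split()).lower()


def find_matching_codes(condition, code_dictionary):
    """Step 4: pick every code whose description mentions the condition."""
    return {code for code, description in code_dictionary.items()
            if condition in description.lower()}


def keep_trusted(codes, trusted_codes):
    """Step 5: shrink the list to codes that a trusted source also contains."""
    return codes & trusted_codes


# Step 1: condition names taken from existing lists (made-up examples)
source_names = ["  Asthma ", "ASTHMA", "Type 2  Diabetes"]

# Step 2: clean the names and drop duplicates, keeping first-seen order
conditions = list(dict.fromkeys(clean_name(n) for n in source_names))

# Step 3: expert decisions, entered by hand (here the experts keep both)
expert_keep = {"asthma": True, "type 2 diabetes": True}
conditions = [c for c in conditions if expert_keep.get(c, False)]

# Made-up code dictionary and trusted source (not real clinical codes)
code_dictionary = {
    "A1": "Asthma",
    "A2": "Asthma review",
    "A3": "Family history of asthma",
    "D1": "Type 2 diabetes mellitus",
}
trusted_codes = {"A1", "A2", "D1"}

# Steps 4 and 5: find candidate codes, then keep only the trusted ones
draft_codelists = {
    c: sorted(keep_trusted(find_matching_codes(c, code_dictionary), trusted_codes))
    for c in conditions
}

# Step 6: the drafts would now go to experts for a final review
print(draft_codelists)
```

In this toy example the text search also picks up "Family history of asthma", which does not mean the person has asthma. Checking against the trusted source removes it, which illustrates why the check and the final expert review are part of the method.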

This process helps researchers work faster, reduces mistakes, and makes it easier to share and reuse codelists in future projects.

The DynAIRx team has made the method and the codelists publicly available, so others can benefit too. This work supports better use of health data and helps build tools that improve care for people with complex health needs.

The full study can be found here.

We thank Saiqa Ahmed for her valuable insights and contributions in shaping this summary to ensure it is accessible and clear to a wide audience.

This study/project is funded by the National Institute for Health and Care Research (NIHR) under its Programme Artificial Intelligence for Multiple and Long-Term Conditions (NIHR 203986). The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.