Innovation in science and technology has always been a great source of motivation for me.
My first lesson in research came at the age of seven, when a sundial, which I proudly presented to my father as my own invention, was dismissed as plagiarism. This was in the pre-internet era. Nowadays, with instant access to information, it is much easier to climb onto the shoulders of giants by keeping abreast of the state of the art.
My approach to innovation has been shaped by a skill set that evolved from mathematical physics applied to nonequilibrium statistical theory, through programming and computation in materials science, to data-driven research with machine learning, where my passions have finally consolidated.
This field offers the most powerful tools to date for learning the intricate relationships in the wealth of available data and unveiling patterns that may lead to discoveries. Indirectly, the statistical methods of machine learning also facilitate research by providing more efficient computational tools.
I believe that the synergy of methods from established fields with machine learning can accelerate the progress of human achievement.
There are several areas where we can benefit from this interdisciplinary approach.
Supervised machine learning.
Supervision entails learning the relationships between known entities: inputs and outputs, structures and properties, assumptions and results, etc.
By training deep neural networks on the results of quantum-mechanical computations, atomic structures and their energies, we can replace time-consuming calculations for atomic systems with machine-learned interatomic forces. This affords unprecedented scaling of computations in both simulation time and system size.
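As a minimal sketch of this idea (a Morse-like potential stands in for the quantum-mechanical reference, hand-picked descriptors stand in for a deep network's learned features, and ridge regression stands in for the network itself; all names here are hypothetical):

```python
import numpy as np

# Toy "quantum-mechanical" energy for a diatomic system: a Morse-like
# potential. In practice these labels would come from e.g. DFT.
def reference_energy(r):
    return (1.0 - np.exp(-(r - 1.5))) ** 2

# Descriptor of the atomic configuration. Chosen so that it spans the
# reference exactly, which makes this toy fit near-perfect.
def features(r):
    return np.stack(
        [np.ones_like(r), r, r**2, np.exp(-r), np.exp(-2 * r)], axis=-1
    )

# Training set: interatomic distances and their reference energies.
r_train = np.linspace(1.0, 3.0, 50)
X, y = features(r_train), reference_energy(r_train)

# Ridge regression as a stand-in for training the deep network.
lam = 1e-8
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def predict_energy(r):
    return features(np.atleast_1d(r)) @ w

# Machine-learned force = negative gradient of the surrogate energy,
# here via a central finite difference.
def predict_force(r, h=1e-4):
    return -(predict_energy(r + h) - predict_energy(r - h)) / (2 * h)
```

Once fitted, the surrogate is orders of magnitude cheaper to evaluate than the reference computation, which is what enables the scaling described above.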
Unsupervised machine learning.
We rely on statistics to reflect patterns within data that we cannot label as, for example, inputs and outputs. This occurs when we do not have sufficient knowledge of the data, e.g. when we lack the negative samples needed for binary classification. Technically, we strive to match the predictive accuracy of unsupervised learning to that of supervised techniques, so that the data will be truly machine-learned. Already, with precision around 80%, we can identify groups, clusters, and anomalies in data without labelling it, i.e. in an unsupervised fashion.
In particular, this has afforded unsupervised learning of the explored chemistry (an inorganic crystal database with more than 200,000 structures) for pattern identification and for the prediction of unexplored phase fields with a high likelihood of discovery of new compounds.
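A minimal sketch of this kind of unsupervised analysis, with k-means clustering and a nearest-centroid anomaly score on fabricated two-dimensional "composition" data (all names and numbers are hypothetical, not the actual pipeline used on the crystal database):

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabelled synthetic data: two well-separated groups of points.
group_a = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(100, 2))
group_b = rng.normal(loc=[3.0, 3.0], scale=0.3, size=(100, 2))
data = np.vstack([group_a, group_b])

def kmeans(X, k, n_iter=50):
    # Lloyd's algorithm: initialise centroids from the data, then
    # alternate nearest-centroid assignment and centroid update.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        centroids = np.array(
            [X[labels == i].mean(axis=0) for i in range(k)]
        )
    return labels, centroids

labels, centroids = kmeans(data, k=2)

# Anomaly score: distance to the nearest cluster centre; points far
# from every cluster are flagged as outliers.
def anomaly_score(x, centroids):
    return np.linalg.norm(centroids - x, axis=-1).min()
```

No labels enter the procedure; the grouping emerges from the statistics of the data alone, which is the essence of the unsupervised setting described above.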
Graph neural networks.
We can capitalise on the connectivity within data to augment information about groups, learning more about their members and the connections between them. This technique can be used, for example, to learn about the stability of crystal structures within a phase field or, more generally, for data augmentation in downstream supervised learning.
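A minimal sketch of one message-passing step on a toy graph, in the style of a graph-convolutional layer with symmetric normalisation (the connectivity and features are fabricated, and the identity weight matrix merely keeps the example deterministic):

```python
import numpy as np

# Toy phase field as a graph: four hypothetical compounds, with edges
# between compounds assumed to share an element (fabricated here).
A = np.array([[0., 1., 0., 0.],
              [1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.]])

def gcn_layer(A, H, W):
    """One graph-convolution step: mix each node's features with its
    neighbours' (plus a self-loop) under symmetric normalisation,
    then apply a linear map and a ReLU nonlinearity."""
    A_hat = A + np.eye(len(A))                # add self-loops
    d = A_hat.sum(axis=1)                     # node degrees
    D_inv_sqrt = np.diag(d ** -0.5)
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # normalised adjacency
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU

# One-hot node features; identity weights for a deterministic example.
H = np.eye(4)
W = np.eye(4)
H1 = gcn_layer(A, H, W)
```

After one step, each node's representation contains contributions from its immediate neighbours; stacking layers propagates information across the whole graph, which is how connectivity augments what we know about each member.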
I am fascinated by the endless opportunities that these techniques offer for research and technology, both in their application to new areas and in interdisciplinary frameworks that exchange ideas and methods between a variety of fields.