10.11.2021 | Inside Data Science
The Application of Data Science Methods in Material Sciences
In our interview, Wilke Dononelli talks about the importance of data for his research and the use of machine learning in quantum chemistry.
What topics are you currently working on in your research?
I use quantum chemical atomistic simulations to understand catalytic reactions at surfaces, to predict promising new functional materials, and to determine the atomic structure of new, unknown compounds.
How important is data to your research?
Data is very important in my field. First of all, a lot of calculations are needed, e.g. when the catalytic mechanism of a chemical reaction needs to be understood. These calculations generate terabytes of output data. The benefit of the renewed interest in machine learning is that the community is now starting to use this data to speed up calculations or to make predictions in complementary research.
I am currently working on a new method in which we incorporate experimental data into our quantum chemical calculations. For this project, we depend on experimental data from others.
What role does data science play in your research? Do you see yourself more as a user, a method developer, a basic researcher, or perhaps something completely different?
I am a mix of a user and a method developer. I want to answer questions that arise from experimental findings. If a helpful code is already available, I use it instead of spending time developing something similar. Nevertheless, in most cases one has to modify an existing code in order to use it properly. In addition, I write my own codes, or parts of a code, if needed.
Which data science methods and technologies are in the focus of your research or could also become interesting in the future?
I am currently working on two different projects where we develop and use on-the-fly trained machine learning (ML) models to drastically speed up our quantum chemical calculations. At the moment we are focusing on Gaussian processes, but we might use neural networks in the near future. In one project, the ML model is used to speed up a global optimization in the framework of an evolutionary algorithm. In the second project, we use the ML model to speed up calculations of numerical second, third and fourth derivatives.
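To illustrate the idea behind the second project, here is a minimal sketch, not the code used in this research. It assumes a one-dimensional toy problem: `expensive_energy` is a hypothetical stand-in for a costly quantum chemical calculation, and the kernel, length scale and step size are illustrative choices. A Gaussian process is fitted to a few energy evaluations, and the numerical second derivative is then taken on the cheap surrogate instead of on the expensive calculation.

```python
import numpy as np

def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def expensive_energy(x):
    """Hypothetical stand-in for a costly calculation:
    a harmonic potential with force constant 3.0 and minimum at x = 1.0."""
    return 0.5 * 3.0 * (x - 1.0) ** 2

def fit_gp_surrogate(x_train, y_train, length_scale=0.5, jitter=1e-12):
    """Fit a zero-mean Gaussian process and return its mean predictor."""
    K = rbf_kernel(x_train, x_train, length_scale)
    K += jitter * np.eye(len(x_train))  # small diagonal term for numerical stability
    weights = np.linalg.solve(K, y_train)

    def predict(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        return rbf_kernel(x, x_train, length_scale) @ weights

    return predict

def second_derivative(f, x0, h=0.05):
    """Central finite-difference estimate of the second derivative of f at x0."""
    f_plus, f_mid, f_minus = f(x0 + h)[0], f(x0)[0], f(x0 - h)[0]
    return float((f_plus - 2.0 * f_mid + f_minus) / h**2)

# Nine "expensive" evaluations train the surrogate ...
x_train = np.linspace(0.0, 2.0, 9)
y_train = expensive_energy(x_train)
surrogate = fit_gp_surrogate(x_train, y_train)

# ... and all finite-difference evaluations are then made on the surrogate.
curvature = second_derivative(surrogate, 1.0)
print(f"surrogate curvature at x = 1.0: {curvature:.3f} (exact value: 3.0)")
```

In this toy case the estimate can be compared with the exact curvature of 3.0. In a real application the surrogate would be retrained on the fly as new calculations arrive, and higher derivatives would need more careful step-size and noise handling.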
What are your main challenges in dealing with data?
Since I mostly generate the data myself, the main challenge is access to computational resources. A few hundred to a few thousand CPU cores are needed on a daily basis.
In addition, the incorporated experimental data needs to be of high quality, and “noise” needs to be avoided.
And finally, what is your personal motivation for joining the Data Science Center?
I am a quantum chemist/materials scientist who had the opportunity to learn a little bit about machine learning. I am able to develop new codes, but of course a person who learned data science as a student might have deeper knowledge and complementary ideas about what an effective code could look like. Therefore, I joined the Data Science Center in order to find potential cooperation partners who would be interested in starting research projects together.
Thank you very much for the insights into your research in this interview, Wilke!
Dr. Wilke Dononelli
Senior Postdoc @ Hybrid Materials and Interfaces AG
FB 04 – Production Engineering