27.05.2021 | Inside Data Science
Vanessa Didelez on a Core Element of Data Science: Statistics
In our interview, Vanessa Didelez talks about statistics, causal inference, and her research projects in the field of epidemiology.
What topics are you currently working on in your research?
I am working on a number of – applied and theoretical – projects on causal inference from observational (epidemiological) data where “time” plays a key role. For instance, we are developing algorithms for causal discovery from cohort data, such as BIPS’ children’s cohort “IDEFICS/I.Family”, and we are advancing statistical methods to evaluate cancer screening programmes, which have time-to-event outcomes. The theoretical questions concern inference based on causally meaningful (as opposed to mathematically convenient) parameterisations of models.
How important is data to your research?
While part of my research focusses on investigating the theoretical properties of statistical methods, which does not in itself involve any data, I aim at considering and developing methods that are useful and targeted towards addressing practically relevant questions with real – and often in many ways imperfect – data.
What role does data science play in your research? Do you see yourself more as a user, a method developer, a basic researcher, or perhaps something completely different?
My research deals with (statistical) methods for analysing data – I would say that this is certainly a core element of data science. I develop methods for real and challenging applications in the context of epidemiological studies. In this work it quickly becomes clear that understanding where the data comes from, what its limitations are and, to some extent, understanding the domain of application is crucial.
Which data science methods and technologies are in the focus of your research or could also become interesting in the future?
It will become more and more important to combine methods for causal inference (which was an early part of artificial intelligence in the 90s) with machine learning and related approaches. This will allow us to deal with large and heterogeneous data, reduce reliance on parametric assumptions, and exploit and combine new sources of data.
What are your main challenges in dealing with data?
A key challenge when addressing a substantive research question using available real-world data is the many limitations and imperfections of that data: observational data contains numerous sources of potential bias. Developing methods that overcome these problems is of paramount importance.
And finally, what is your personal motivation for joining the Data Science Center?
This is a highly interdisciplinary field and we can learn so much from each other!
You can learn more about Vanessa’s activities in her talk “Causal Reasoning for Data Science” in the Data Science Forum on 03.06.2021.
Prof. Dr. Vanessa Didelez
Professor of Statistics and Causal Inference
FB 03 – Mathematics and Computer Science
Leibniz Institute for Prevention Research and Epidemiology – BIPS