Artificial Intelligence

Theory-driven AI emphasizes the necessity of truly understanding current AI systems, and of building future AI systems that are efficient, explainable and trustworthy.
Theory-driven

Browse research content

These are samples of research projects, papers and blog posts written by our researchers.

More about Theory-driven AI

Imagine that you are about to start a well-paid AI developer position at a cool tech company, after years of hard work to get there. Now that you’re busy with the new job, you might ask “who cares about the theory?”, or “why should I care?”. Here is the short answer: a theory-driven approach to AI is not a luxury but a necessity for developing much more efficient, trustworthy and explainable AI systems.

For a more elaborate answer, keep on reading.

There are at least two beneficial aspects to theories. First, theory serves as a great set of tools for storing human knowledge efficiently. This is particularly true when it comes to abstracting the relevant information away from irrelevant details (e.g., a simple mathematical equation or theorem can express potentially infinitely many cases under certain conditions). Understanding the theory will, moreover, not only give you access to an immense amount of knowledge, but also a way to store your own ideas more efficiently. Second, theory helps you develop a deeper understanding of what an effective solution to a given problem means, and helps you position your candidate solution against existing solutions more realistically, without the burden of blindly trying out tons of alternatives, and without eventually repeating an already existing one.


These beneficial aspects gain even more importance when it comes to the common challenges of developing AI systems that are trustworthy, explainable and fair. For widespread adoption of a technology and acceptance by its users, it is essential that the to-be-deployed AI system is trustworthy, secure and reliable, which means that it will work as intended. This is surely a challenging task, and a theory-driven approach contributes to it in several ways. First, by relying on established theories and principles, developers can design AI systems with a higher degree of predictability, which is necessary to deploy them for large-scale use. Theory-driven approaches aim at understanding AI systems deeply; this purpose naturally aligns with the aim of explainable AI [4]. If developers understand the behavior of the system at hand better, they can build systems that provide better explanations to their users; this is crucial for users to understand how AI systems make decisions and whether they can be relied upon. Second, theory-driven AI can help identify potential vulnerabilities and risks in AI systems. This is not possible by relying on experimental analyses alone, since with such complex systems (for instance, a large language model (LLM) often encompasses hundreds of billions of parameters [5]), millions of things can go wrong in millions of different ways; we need certain theoretical guarantees (see for instance [1]). By drawing on well-established knowledge, developers can anticipate potential pitfalls and design AI systems to mitigate these risks. This proactive approach is essential for ensuring that AI systems do not inadvertently cause harm or engage in ethically undesirable behavior. Third, theory-driven AI extends our perspective on a problem, including its limitations and boundaries, saving us from relentlessly attempting theoretical impossibilities.
For instance, developing fair machine learning classification algorithms is a common ambition, but there are many fairness metrics out there, and several of them are known to be mutually incompatible, so developing a classifier that satisfies them all is a futile goal (see [2]). For another instance, your boss might inadvertently ask you to develop an efficient solution based on a cutting-edge machine learning algorithm for a problem which happens to be NP-hard [3]. With theoretical knowledge of that problem you could recognise it and, instead of spending endless hours trying to come up with an efficient general solution, you could tell your boss that no efficient algorithm is known, and that you can instead work on developing an AI algorithm that works well enough for certain instances. This is surely a more effective approach both for you and for your boss(!). We’d argue that the figure above is a better justification than the figure below. What do you think?
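To make the hard-problem example above concrete, here is a minimal, hypothetical sketch in Python (the function name and toy graph are our own, not from any of the cited works): minimum vertex cover is a classic NP-hard problem [3], so rather than searching for an optimal solution we settle for the well-known greedy 2-approximation, an algorithm that is provably "good enough" on every instance even though it is not optimal.

```python
def greedy_vertex_cover(edges):
    """Return a vertex cover at most twice the size of an optimal one.

    Classic maximal-matching heuristic: repeatedly pick an edge that is
    not yet covered and add BOTH of its endpoints to the cover.
    """
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# A small toy graph: a path 0-1-2-3 plus a chord 1-3.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
cover = greedy_vertex_cover(edges)

# Every edge has at least one endpoint in the cover.
assert all(u in cover or v in cover for u, v in edges)
print(sorted(cover))  # prints [0, 1, 2, 3]; an optimal cover here is {1, 2}
```

Note the trade-off the text describes: on this instance the heuristic returns four vertices where two suffice, but it runs in linear time and its result is never more than twice optimal, which is often exactly the kind of guarantee you can offer your boss.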

With these perspectives in mind, we argue that theory-driven AI is not a luxury but a necessity for pushing forward scientific and technological innovation in developing fair, explainable and responsible AI systems. Hence, “theory” is one of the main components of CerTain.


  1. Wen-Chi Yang, Giuseppe Marra, Gavin Rens and Luc De Raedt, Safe Reinforcement Learning via Probabilistic Logic Shields, in Proceedings of IJCAI 2023 (Distinguished Paper Award)
  2. Brian Hsu, Rahul Mazumder, Preetam Nandy and Kinjal Basu, Pushing the Limits of Fairness Impossibility: Who’s the Fairest of Them All?, in Proceedings of NeurIPS 2022
  3. Michael R. Garey and David S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1979
  4. Waddah Saeed and Christian Omlin, Explainable AI (XAI): A Systematic Meta-Survey of Current Challenges and Future Opportunities, Knowledge-Based Systems, 2023
  5. Zhao et al., A Survey on Large Language Models, 2023

Projects within this research theme

Automated Reasoning for Economics

In this project, we develop algorithms to support economists who try to design new mechanisms for group decision making.
Responsible, Fairness, Algorithm, Theory-driven

Bias across borders: towards a social-political understanding of bias in machine learning

The goal of this project is to provide a clear conceptual framework for the notion of bias in machine learning, aimed at AI researchers working in the context of algorithmic fairness.
Responsible, Bias, Theory-driven

InDeep: Interpreting Deep Learning Models for Text and Sound

The goal of this project is to find ways to make popular artificial intelligence models for language, speech and music more explainable.
Explainable, Theory-driven

Improving Language Model Bias Measures

Many researchers develop tools for measuring how biased language models are; in this project, we work on improving these tools.
Language models, Bias, Theory-driven

Can NLP bias measures be trusted?

van der Wal, O., Bachmann, D., Leidinger, A., van Maanen, L., Zuidema, W. & Schulz, K.

Automating the Analysis of Matching Algorithms

Endriss, U.

Participatory budgeting

Improving Language Model Bias Measures

Explainability in Collective Decision Making

Papers within this research theme

Quantifying Context Mixing in Transformers
Reclaiming AI as a theoretical tool for cognitive science
Can NLP bias measures be trusted?
Automating the Analysis of Matching Algorithms