Artificial Intelligence

Verified AI technology comes with guarantees that it does what it is supposed to do, without fail. For "verifiable AI" technology, we can assess whether it comes with such guarantees.


More about Verifiable AI

As has become increasingly clear, artificial intelligence (AI) systems are being applied in all sorts of settings. In some of these applications, even the slightest mistake can have life-threatening consequences. Think of self-driving cars, for example, where a mistake can cause a lethal accident. Whenever AI technology is used in such safety-critical systems, it is essential to verify that the AI system does what we want it to do.

This is not exactly the same as explainable AI, but the two are connected. Being able to explain how an AI system reached a particular output can, for example, help us improve its behavior in the future. For safety-critical systems, however, an (after-the-fact) explanation is too weak a requirement. Instead, we want a guarantee that certain (erroneous) behavior does not occur.

Verifiability has long been an important issue for software and hardware systems, even before modern AI systems became popular in the last decade or so. The Intel Pentium FDIV bug provides an infamous example of why verification matters: in a small number of cases, the processor produced incorrect results when dividing floating-point numbers. Intel recalled the defective processors, which cost the company hundreds of millions of dollars. Subsequently, hardware and software companies started using mathematical tools to verify the behavior of their systems and prevent similar disasters.

The possible impact of erroneous behavior in AI systems is far greater than for most other systems. Typical AI systems are extremely complex, which makes it hard to even anticipate all the dangerous behaviors they could display. AI systems are also employed in settings where their output has a direct and immediate influence on people’s health and lives, which multiplies the impact of mistakes.

There are various methods for verifying the safety or correctness of AI systems’ behavior. We will focus on the method that provides the strongest type of guarantee: a mathematical proof that undesired or dangerous behavior does not occur. This is a very high bar, and achieving it is correspondingly challenging. To obtain such mathematically proven guarantees, tools from the area of formal methods are used.

Formally verifying the behavior of a system generally consists of three steps. Firstly, one makes a (faithful) mathematical representation of the system, called a “model”. Secondly, one expresses the required behavior in a mathematically precise (logical) language. Thirdly, using automated reasoning methods, one finds a proof that the model only allows behavior that matches the requirement.
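As a toy illustration of these three steps (the system and all names here are invented for illustration, not taken from any real verification tool), consider a junction with two traffic lights: the model is a small finite transition system, the required behavior is the safety property "the two lights are never green at the same time", and the proof is an exhaustive exploration of all reachable states.

```python
from collections import deque

# Step 1: a mathematical model of the system -- a finite transition system.
# A state is a pair (light_a, light_b); a light may only turn green
# when the other light is red.
INITIAL = ("red", "red")

def transitions(state):
    a, b = state
    succs = []
    if a == "red" and b == "red":
        succs += [("green", "red"), ("red", "green")]
    if a == "green":
        succs.append(("red", b))
    if b == "green":
        succs.append((a, "red"))
    return succs

# Step 2: the required behavior, expressed as a precise predicate
# (a safety property: "the lights are never both green").
def safe(state):
    return state != ("green", "green")

# Step 3: automated reasoning -- breadth-first exploration of every
# reachable state. Since the state space is finite, visiting all
# reachable states amounts to a proof that the property always holds.
def verify():
    seen, queue = {INITIAL}, deque([INITIAL])
    while queue:
        state = queue.popleft()
        if not safe(state):
            return False, state  # counterexample: an unsafe reachable state
        for nxt in transitions(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True, None  # property proven for all reachable states

print(verify())  # (True, None): no reachable state violates the property
```

Real verification tools work on far larger (sometimes infinite) state spaces and use clever symbolic representations, but the division of labor between model, specification, and automated proof search is the same.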

Various forms of automated reasoning can be used to provide mathematically proven guarantees on the behavior of systems. Two typical examples are model checking—computing whether a logical statement is true in a given model of a system—and automated theorem proving—computing whether one statement logically follows from other statements. The field of Computational Logic is a core driver behind such automated reasoning methods.
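The example above is a form of model checking. To give a similarly small flavor of automated theorem proving, here is an invented, minimal propositional entailment checker: it decides whether a conclusion logically follows from a set of premises by enumerating all truth assignments (real theorem provers use much more sophisticated reasoning, but the question they answer is the same).

```python
from itertools import product

# Formulas are represented as Python functions over a truth assignment
# (a dict mapping variable names to booleans).
def entails(variables, premises, conclusion):
    # The conclusion follows from the premises iff it is true in every
    # truth assignment that makes all premises true.
    for values in product([False, True], repeat=len(variables)):
        v = dict(zip(variables, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False  # counterexample assignment found
    return True

# Example: from "rain -> wet" and "rain", conclude "wet" (modus ponens).
premises = [lambda v: (not v["rain"]) or v["wet"],  # rain -> wet
            lambda v: v["rain"]]                    # rain
conclusion = lambda v: v["wet"]                     # wet

print(entails(["rain", "wet"], premises, conclusion))  # True
```

Note that the converse direction does not hold: from "rain -> wet" alone, "rain" does not follow, and the checker would report False by exhibiting the assignment where both are false.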

For AI systems, there is an additional factor that makes verifying behavior via formal methods even more complicated: AI systems operate in very complex environments, involving humans and/or other AI systems. The possible behavior of a system’s environment also needs to be modeled mathematically, and this needs to be taken into account when reasoning about the system’s behavior.
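One standard way to account for the environment is to model it nondeterministically, so that verification covers every behavior the environment could exhibit. The following invented sketch composes a thermostat controller with an environment that may lower, keep, or raise the temperature at each step, and proves that under all environment behaviors the temperature never drops to freezing.

```python
from collections import deque

# System: a thermostat that heats (+2) whenever the temperature is low.
# Environment: the outside world, modeled nondeterministically -- at each
# step it may change the temperature by -1, 0, or +1. Exploring all
# environment moves means the result holds for *every* environment behavior.
INITIAL_TEMP = 5

def step(temp):
    heat = 2 if temp <= 2 else 0            # the controller's decision
    for drift in (-1, 0, +1):               # all possible environment moves
        yield max(0, min(10, temp + drift + heat))

def never_freezes():
    # Exhaustively explore the composed system-environment state space;
    # the property is that the temperature never reaches 0.
    seen, queue = {INITIAL_TEMP}, deque([INITIAL_TEMP])
    while queue:
        temp = queue.popleft()
        if temp == 0:
            return False  # counterexample: a freezing state is reachable
        for nxt in step(temp):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True

print(never_freezes())  # True
```

For real AI systems the environment includes humans and other learning agents, whose behavior is far harder to capture, but the principle is the same: the guarantee is only as strong as the environment model it is proven against.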

There are plenty of other elements that make verified AI extremely challenging to achieve [see, e.g., Seshia et al., 2022]. Important examples of such elements are:

  1. the size of internal models that typical AI systems such as Large Language Models use—put in technical terms: the high-dimensional nature of the parameter/state space of systems;
  2. the high degree of uncertainty that is inherent in the context in which AI systems are employed; and
  3. the complexity of even defining what constitutes safe and/or desired behavior in the intricate settings where AI systems are used—and thus also the complexity of expressing it in a mathematically precise way.

For a brief illustration of these difficulties, consider the example of self-driving cars again. We want the car to avoid hitting pedestrians (whenever possible). The first challenge is defining what a pedestrian is, which involves non-trivial problems of perception and commonsense reasoning. Moreover, the internal workings of a self-driving car typically depend on a large neural network, whose behavior is not easily modeled or analyzed computationally. And we want the car to behave safely in all sorts of environments, where much is uncertain (e.g., how humans behave, how many other self-driving vehicles there are and how they behave, etc.).