Going beyond a mathematical investigation of bias

A version of this blog post first appeared on https://odvanderwal.nl/2023/positioning-bias.

When researchers study how biased language models are, they generally approach this in a mathematical or statistical way. For example, they could say that ChatGPT is gender biased if, when asked to write a story about a CEO or a nurse, it writes about male CEOs and female nurses more than 50% of the time. Bias research can also be very mathematical when researchers look inside the language model, using interpretability methods to see how it represents this biased information internally.
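To make the "more than 50% of the time" criterion concrete, here is a minimal sketch in Python. The annotated samples, the `STEREOTYPE` mapping, and the function names are all hypothetical and only serve as illustration; the one-sided binomial test is one possible way to check whether the observed rate exceeds chance, not a method taken from any particular study.

```python
from math import comb

# Hypothetical mapping from occupation to the stereotypical gender.
STEREOTYPE = {"CEO": "male", "nurse": "female"}


def count_stereotypical(samples, stereotype):
    """Count (occupation, gender) pairs that match the stereotype."""
    return sum(1 for occupation, gender in samples if stereotype[occupation] == gender)


def binomial_p_value(successes, trials, chance=0.5):
    """One-sided exact binomial test: P(X >= successes) if the true rate were `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Made-up annotations of 20 generated stories (not real model output).
samples = (
    [("CEO", "male")] * 8
    + [("CEO", "female")] * 2
    + [("nurse", "female")] * 9
    + [("nurse", "male")] * 1
)

matches = count_stereotypical(samples, STEREOTYPE)
rate = matches / len(samples)
p_value = binomial_p_value(matches, len(samples))

print(f"stereotypical stories: {matches}/{len(samples)} (rate {rate:.2f})")
print(f"one-sided p-value against a 50% baseline: {p_value:.4f}")
```

Note how much this measurement leaves out: it assumes a binary gender annotation and a fixed 50% baseline, and it says nothing about the context in which the stories are used. Those are exactly the kinds of choices the rest of this post is about.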

While this is a useful way to better understand why language models, like ChatGPT, may be gender biased, it is also useful to take a step back and to consider bias in NLP from a broader perspective. The analysis of bias is incomplete if we ignore the ethical questions and the sociotechnical context. Both the technical details of the model and the social aspects (the designers, users, stakeholders, historical and cultural context, company goals, etc.) are important to consider! In this blog post, we’ll discuss three such considerations that go beyond a mathematical approach:

  1. algorithmic bias is a sociotechnical problem,
  2. society is constantly changing and so is our conceptualization of bias,
  3. algorithmic bias is not simply a reflection of the data/society.


1. Bias is a sociotechnical problem

When should we consider the gender bias of an AI system as harmful? The implicit assumption in the AI debate is generally that we should aim for gender-neutral behavior, which is based on the idea that not differentiating between the genders defines or constitutes fair behavior. However, whether this is the case might strongly depend on the particular task we want the system to perform and its sociotechnical context (i.e., both technical and social aspects are important to consider).

For example, in translation, we might want an AI system to consider the (grammatical) gender of the subject when translating a text, but not when assessing the competency of job candidates while automatically filtering resumes! (In fact, whether we should want to use AI for automating these tasks is another question entirely.)

Our perspective may also change if we do not see the bias of an AI system in isolation, but as situated in the broader practices it is part of: we may find that the individual examples of bias do not paint a full picture of the structural bias of the institutions, businesses, or organizations making use of it. Why does an AI system assign higher competency scores to the resumes of people more similar to the ones already working in the company? Is it because they are truly more competent, or is the training dataset skewed for historical reasons, and would more diversity actually benefit the company? In this light, we might even have to consider adding a counteracting bias to generate equal opportunities for different subgroups in a population, to compensate for the disadvantages these groups have.

Not all bias is unwanted, and there might be contexts in which we need it to reach certain goals. To formulate the (moral) standards for an AI system, we need to look at the broader context in which it functions, understand the way the AI system interacts with this environment, and consider how the entire system might contribute to unfairness or cause harm to particular groups or individuals. This also means that the current paradigm for analyzing bias in NLP is perhaps inadequate: Raji et al. (2021) make a compelling argument that benchmarks for evaluating AI systems are fundamentally limited, as these consist of decontextualized examples.

2. Society is constantly changing and so is “bias”

Ideally, the discussion about the norms and standards of a particular AI application is resolved before the build starts. But what counts as unfair or harmful behavior is not a stable societal factor that we can align our AI systems with. It is constantly changing as the debate in society progresses, and therefore a once-and-for-all solution to the bias problem is simply impossible. Even worse, new biases can emerge if our AI systems do not adjust for such changes (Friedman and Nissenbaum, 1996; Bender et al., 2021). This concern is especially apparent for very large language models, which are expensive to train and therefore reused for many downstream tasks (Bender et al., 2021).

Moreover, given the various applications that could make use of language technology, there is no way to have standards that fit them all. However, we can be transparent and detailed about the way a particular model is trained, including the dataset, so that this information is available in case of model transfer (Mitchell et al., 2019; Bender and Koller, 2020; Gebru et al., 2021; Bender et al., 2021). Furthermore, we need to develop technologies that allow us to counteract biases in the system whenever they do matter for the downstream task. But for this, we also need a clear understanding of how this bias comes about in the first place.

3. Algorithmic bias does not only reflect pre-existing bias

A popular argument in the AI community is that the bias of a deep neural model simply reflects pre-existing biases that are present in the training data. However, we should not neglect the responsibility we have in designing and implementing these AI systems: many forms of bias can emerge at the different stages of creating and deploying language technology (see Hovy and Prabhumoye, 2021). Others have even pointed out that biased algorithms can change the world in profound ways. For example, Ensign et al. (2018) show how biased policing algorithms could result in more policing of certain neighborhoods, which in turn feeds back into new data reinforcing the earlier bias, leading to a ‘runaway feedback loop’.
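The feedback mechanism can be illustrated with a deliberately simplified toy simulation. This is a sketch under strong assumptions and not the model analyzed by Ensign et al.: two hypothetical neighborhoods have identical true incident rates, a single patrol is sent each day to the neighborhood with the most recorded incidents, and incidents are only recorded where the patrol goes. All names and numbers below are invented for illustration.

```python
def simulate_patrols(initial_records, true_rates, days):
    """Greedy patrol allocation with feedback.

    Each day the single patrol goes to the neighborhood with the most
    recorded incidents so far, and only incidents in the patrolled
    neighborhood are added to the records.
    """
    records = dict(initial_records)
    for _ in range(days):
        target = max(records, key=records.get)
        records[target] += true_rates[target]
    return records


# Assumption: both neighborhoods have the same true incident rate.
true_rates = {"A": 1.0, "B": 1.0}
# Assumption: historical records are slightly skewed towards neighborhood A.
initial = {"A": 6.0, "B": 4.0}

final = simulate_patrols(initial, true_rates, days=100)

share_before = initial["A"] / sum(initial.values())
share_after = final["A"] / sum(final.values())
print(f"share of records in A before: {share_before:.2f}")
print(f"share of records in A after:  {share_after:.2f}")
```

Although the underlying rates are equal, the small initial skew decides where the patrol goes, and every new record confirms that decision. The data the system collects is therefore shaped by its own earlier output rather than by the world alone.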

Language technology is not merely reflecting society, but its implementations can be a part of it and even change it in unexpected ways. A well-known theme in the philosophy of technology is that technologies ‘mediate’ our experiences and shape our world-view of “how to live” (Verbeek, 2005). Machine translation systems may dictate a world-view primarily of men, with women restricted to stereotypical occupations (Wellner, 2020), and search engines that only show men for the keyword “CEO” similarly shape our image of the archetypal business leader. Or consider the following example from a 2019 UNESCO/COMEST report:

“The ‘gendering’ of digital assistants, for example, may reinforce understandings of women as subservient and compliant. Indeed, female voices are routinely chosen as personal assistance bots, mainly fulfilling customer service duties, whilst the majority of bots in professional services such as the law and finance sectors, for example, are coded as male voices. This has educational implications with regards to how we understand ‘male’ vs ‘female’ competences, and how we define authoritative versus subservient positions.”

How we define and measure bias may also influence how we view bias itself. In the context of fairness metrics, Jacobs and Wallach (2021) refer to ‘consequential validity’, the fact that “the measurements shape the ways that we understand the construct itself”, which is often overlooked when designing a bias metric.

How we define and measure racial and gender categorizations, for example, also shapes how we view and act on these constructs in society; viewing gender as a binary construct may be hurtful to non-binary communities (Costanza-Chock, 2018). (For a discussion of different perspectives on race, see Glasgow, 2019.)

Algorithmic bias is an inherently complex phenomenon due to its sociotechnical and context-sensitive nature, which makes a precise definition difficult. Yet a discussion of how it is defined is crucial when researching it (e.g., Blodgett et al., 2020; van der Wal et al., 2024). Researchers cannot resort to a ‘catch-all’ bias metric for understanding the bias, and mitigating the harms might require more than simply removing the biased information (Talat et al., 2022). It is even unclear whether it is possible to completely debias an AI system (for a discussion of debiasing, see, for example, Talat et al., 2021).


Going even further, maybe the starting point should not be to ask ourselves how we can debias AI models, as phrased in a Dutch report on digitalization, but rather to focus on the larger questions that we have to answer as a society: How do we want to shape the world with language technology as a part of life? How can we design these AI systems such that they help create a more just society, instead of solidifying existing (or even leading to new forms of) systemic bias? Naturally, such a broad discussion about what constitutes fair behavior in AI systems needs to involve not only AI researchers, but also various other experts from outside the technical domain.


Thanks to Wout Moltmaker for his helpful comments on this blog post.
