Project details

Language models
Bias

From Learning to Meaning

In this project, we develop a new theory of generic sentences (e.g., ‘Birds can fly’) and explore whether the generic sentences produced by language models can teach us something about how people express stereotypes.

Generic sentences (‘Birds fly’, ‘Lions have manes’, ‘Pitbulls are dangerous’) are omnipresent in language and express characterizing properties of groups and of individual objects. Because they communicate (stereo)typical (‘Lawyers are greedy’) and normative (‘Winners never quit’) information, such sentences give voice to, and transmit, socially prejudiced generalizations (‘Jews are greedy’), and can thus have a considerable societal impact. One goal of this project is to develop a formal semantic theory that describes the meaning of such sentences and to test that theory empirically.
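As background on what a formal semantics of generics looks like, a common starting point in the literature is a tripartite generic operator GEN (Krifka et al. 1995). The formula below is offered purely as an illustration of that standard analysis, not as the new theory developed in this project:

```latex
% Standard tripartite analysis of 'Birds fly' with the generic
% operator GEN (Krifka et al. 1995), shown here only as background.
\[
  \mathrm{GEN}\,x\;[\mathrm{bird}(x)]\;[\mathrm{fly}(x)]
\]
% Read: generically, if x is a bird, then x flies. Unlike the
% universal quantifier \forall, GEN tolerates exceptions: penguins
% do not falsify the generic, which is one reason generic sentences
% call for a dedicated semantics.
```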

A second goal is to develop a new behaviourist paradigm for testing semantic theories of sentences that express stereotypes by looking at actual language use. In particular, we examine which stereotypes are implicit in (large) language models. Because these language models are trained on text produced by ordinary people, we hypothesize that the stereotypes of language models also teach us something about the stereotypes of those people.
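To make the probing idea concrete, here is a minimal sketch of how one might elicit stereotypical associations from a pretrained masked language model using cloze-style generic sentences. The Hugging Face `transformers` library, the `bert-base-uncased` checkpoint, the group nouns, and the template are illustrative assumptions, not the project's actual experimental setup:

```python
# A minimal, illustrative sketch (not this project's method): probe a
# pretrained masked language model for the predicates it associates
# with a group noun, via a cloze-style generic-sentence template.
# Assumes the Hugging Face `transformers` library and the public
# `bert-base-uncased` checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Generic-sentence template with the masked slot in predicate position.
template = "{group} are [MASK]."

for group in ["Lawyers", "Stepmothers", "Academics"]:
    # Ask the model for its five most likely completions of the mask.
    predictions = fill_mask(template.format(group=group), top_k=5)
    completions = [p["token_str"].strip() for p in predictions]
    print(f"{group}: {', '.join(completions)}")
```

In a setup like this, the top-ranked predicate completions for each group noun serve as a crude proxy for the stereotypical associations the model has absorbed from its training data.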

Papers related to this project

How robust and reliable can we expect language models to be?
Which stereotypes do search engines come with?
Stepmothers are mean and academics are pretentious: What do pretrained language models learn about you?