Cognitive Science of Philosophy Symposium: Metaethics and Experimental Philosophy

After speaking at length about how metaethics could profit from experimental results, it might not come as a surprise that I believe that even an idea this plausible requires empirical support. Ultimately, whether thick concepts have the disposition to guide actions is a matter of their psychological effects on people. Judith Martens and I have started to investigate the action-guidingness of thick concepts. We believe that to develop a proper metaethical theory of thick concepts and their relationship to actions, we need to understand a) whether there are circumstances in which thick concepts provide reasons for action, b) whether there are circumstances in which thick concepts do not provide reasons for action, and c) how these two classes of circumstances differ from one another.
Kevin Reuter and I happily admit that for the time being, we can only speculate as to why the polarity effect occurs. We outline several possibilities in our paper. Together with Lucien Baumgartner, we are in the process of exploring these explanations more systematically and we are also developing new experimental approaches to evaluative language, including other experimental paradigms and corpus linguistic approaches (see Reuter, Baumgartner & Willemsen, ms). The investigation of thick concepts is very much in its infancy. However, by applying a very simple experimental paradigm to sentences that so far have only been used as ‘thought experiments’ and philosophical ‘intuition pumps’, we have already empirically challenged two of the most prominent views on thick concepts, as well as the shared assumption that positive and negative thick concepts communicate evaluation in the same way. This seems to be enough to motivate a much larger empirical investigation of thick concepts and normative language more generally.
Reuter, K., Baumgartner, L. & Willemsen, P. (ms). Tracing Thick Concepts Through Corpora.

Metaethics and Experimental Philosophy:
A Journey Through Thick and Thin

Pascale Willemsen



Commentary: Testing the Loaded Side of Language

Bianca Cepollaro

Experimental philosophy is an interdisciplinary approach to philosophical questions and problems that uses empirical methods from various cognitive-scientific disciplines such as psychology, experimental linguistics, and neurosciences. Even though experimental philosophy is a relatively recent movement and has only been around for 25 years, its practitioners have shown remarkable productivity. In January 2021, roughly 2000 papers were listed as ‘Experimental Philosophy’ on Roughly one-fourth of these papers were categorised as ‘Ethics’. In this article, I would like to focus on one specific sub-discipline of moral philosophy that, I argue, can benefit greatly from engaging with experimental philosophy, namely metaethics.
Welcome to the Brains Blog’s Symposium series on the Cognitive Science of Philosophy! The aim of the series is to examine the use of methods from the cognitive sciences to generate philosophical insight. Each symposium comprises two parts. In the target post, a practitioner describes their use of the method under discussion and explains why they find it philosophically fruitful. A commentator then responds to the target post and discusses the strengths and limitations of the method.
Concepts such as ‘rude’, ‘cruel’, ‘friendly’, and ‘compassionate’ are what philosophers call thick ethical concepts. They are characterised by their provision of both evaluative and descriptive content. They communicate that an action, behaviour, custom, person, or character trait is viewed with approval or disapproval, and they further communicate in virtue of what descriptive features they are evaluated in this way. In contrast, thin ethical concepts, such as ‘good’ and ‘bad’, are said to be merely evaluative. With such a rough-and-ready notion of thick concepts in mind, philosophers have sought to provide a proper definition for these concepts and spell out more clearly how thick concepts differ from thin concepts, as well as from descriptive concepts, such as ‘green’ or ‘round’.
* * * * * * * * * * * *
The way these notions are applied is determined by what the world is like (for instance, by how someone has behaved), and yet, at the same time, their application usually involves a certain valuation of the situation, of a person or actions. Moreover, they usually (though not necessarily directly) provide reasons for actions.
The Location Question asks where exactly we can find the evaluative dimension of a thick term or concept. The two options discussed are (a) that the evaluation is part of the semantic content of a thick term or (b) that it belongs to what is pragmatically conveyed beyond what is literally said. The philosophical literature often relies on intuitions about whether statements such as ‘Tom is cruel, but he is not bad’ sound contradictory. If such a statement does sound non-contradictory, we can conclude that a negative evaluation of Tom is not intrinsic to saying that he is cruel and, thus, not semantically conveyed. Despite the widely accepted relevance of such ‘linguistic data’ (that is, the author’s own intuitions about what linguistic intuitions most people have), no systematic empirical studies have been conducted on thick concepts and their evaluative dimension. This is even more surprising, given that in experimental linguistics, empirical means to test whether a statement sounds contradictory are readily available. Perhaps the most widely used test is the Cancellability Test: take a bit of information for which it is unclear whether it is merely conversationally implicated or semantically connected to a concept or sentence, then explicitly cancel this bit of information and see if the resulting phrase sounds contradictory.
Our study yielded surprising results. First, neither the pragmatist’s nor the semanticist’s prediction was met. Against the pragmatists’ prediction, the evaluation of a thick concept was significantly harder to cancel than the conversationally implicated content and resulted in higher contradiction ratings. This effect persisted for two different embeddings of thick terms (Behaviour and Character). Challenging the semanticist, the evaluation of thick concepts was significantly easier to cancel than the semantically entailed content.

As Sayre-McCord (2014) said, ‘Metaethics is the attempt to understand the metaphysical, epistemological, semantic, and psychological presuppositions and commitments of moral thought, talk, and practice’. Central to metaethics are, among others, questions about the meaning of ethical terms such as ‘good’ and ‘bad’, whether moral statements containing these terms can be true or false, what it is that people express or do by using moral language, and how moral language relates to motivation and behaviour. It is clear that these questions demand at least partly empirical answers. It seems absurd to claim that we can properly understand the meaning of ethical terms and what people express and do with them without looking at the way people actually talk. Additionally, it would be highly questionable to make any claims about moral language and its relationship to motivation and subsequent behaviour without consulting or conducting empirical studies.


Cepollaro, B., F. Domaneschi, and I. Stojanovic. 2020. “When is it ok to call someone a jerk? An experimental investigation of expressives.” Synthese.
Willemsen, P. & Reuter, K. (2021). Separating the Evaluative from the Descriptive: An Empirical Study of Thick Concepts. Thought: A Journal of Philosophy.
Independent of the linguistically driven debate in metaethics, it has been argued that thick concepts possess an important connection to actions. This is what I have called the Action-Guidingness Question. Take as an example a friend telling you that your behaviour at the party last night was rude. Beyond the disapproval she communicates, you might infer an even more far-reaching communicative goal: your friend does not want you to behave in the same way at the next party. What she tries to do is make you change your behaviour. Bernard Williams (1985) offered one of the earliest and most influential attempts to define thick concepts in terms of their potential to guide actions:
This is to say that these (and similar) studies should be seen as a preliminary exploration into a relatively unknown domain and – because of the subtleties of the matter – we should expect to often run into similar hermeneutical aporias. I’ve found myself in a similar situation too. In How Bad Is It to Report a Slur? (2019), together with psychologist Simone Sulpizio and philosopher Claudia Bianchi, I looked at how slurs are perceived in reported speech. Scholars disagree on whether a speaker who reports a slurring utterance is herself engaging in slurring (take an utterance like “My boss said that they aren’t going to hire a S” – where S is a slur, for instance a racial or homophobic epithet). This question is interesting for a bunch of reasons. First, it is a clue for understanding how slurs encode their pejorative content: is it semantically encoded or rather pragmatically conveyed? Second, whether slurs can be reported without being derogatory affects our online and offline language policies. Now, when armchair philosophers and linguists have examined their own intuitions, they have come to diverging conclusions. According to some, when a speaker reports a slurring utterance, they – and not necessarily the reported speaker – are perceived as slurring. For these scholars, slurs should be banned not only from direct but also from reported speech (let’s call them prohibitionists; see Anderson and Lepore, 2013; Anderson, 2016). According to others, only the reported and not the reporting speaker is taken to be slurring; slurs don’t need to be banned from reported speech, but only from direct speech (let’s call them non-prohibitionists; see Schlenker 2007).
Bowers, J.S., and C.W. Pleydell-Pearce. 2011. “Swearing, euphemisms, and linguistic relativity.” PloS one 6, no. 7: e22341.
* * * * * * * * * * * *
The results suggest that in contexts of self-reflection, when an individual is attempting to determine the best course of action, thick terms strongly count in favour of or against an option. Descriptive terms do not share this disposition in this context. It seems that philosophers have been right all along in their assumption that thick concepts have the potential to guide actions.
I share Pascale’s enthusiasm for the new insights that an empirical approach to the philosophical study of language can offer, especially in the domain of what we may call loaded language, i.e. speech that does not only describe the world, but evaluates it. Within this broad field, there are entire areas of investigation that have been explored only relatively recently on theoretical grounds, let alone on experimental ones. As a matter of fact, the domain of loaded language encompasses not only moral discourse – on which Pascale’s post focuses – but also expressive speech, ranging from insults (jerk, bastard), interjections (shit!, fuck!), intensifiers (damn, fucking), to slurring terms, that is, derogatory words that target groups and individuals on the basis of their belonging to a certain category (think of racial and homophobic epithets, for instance). Many crucial questions arise around expressive discourse: what expressive speech is, how it works, how expressive content is encoded in language, what functions it fulfils with respect to the speaker and to their audience, in what relation it stands to morality (if any), when it should be censored (if ever).
Second, going beyond the philosophical dispute and each side’s respective predictions, we assumed that polarity (positive vs. negative) might play a role in how simple it would be to cancel the evaluation of a thick concept. We have not seen any suspicion along these lines in the metaethical literature, but given what we know from the experimental philosophy of morality, this was a possibility worth exploring. Our study revealed a strong polarity effect on contradiction ratings. For positive thick terms, contradiction ratings were significantly lower than those for negative thick terms as well as semantic entailments. This polarity effect was hitherto unknown and has not been predicted by any of the various accounts of thick concepts. In fact, the effect challenges the tacit assumption that thick terms and concepts form a homogeneous group of which we can ask broad questions about separability and how evaluation and description are connected.
Williams, B. (1985). Ethics and the Limits of Philosophy, Cambridge, MA: Harvard University Press.
Schlenker, P. 2007. “Expressive presuppositions.” Theoretical Linguistics 33, no. 2: 237–45.
Anderson, L., and E. Lepore. 2013. “Slurring words.” Noûs 47, no. 1: 25–48.