Conceptual Analysis and Thematic Corpora: Theory, Methodology and Indicative Case Studies

A webinar organized by the Genealogies of Knowledge Research Network, in collaboration with Aston University, UK

Date: Thursday 12th November

Time: 09.45-13.00 (GMT)

Venue: Blackboard Collaborate Ultra

This event has now taken place. Recordings of each of the talks can be found below.

About the event

Despite longstanding interest in the study of concepts across many disciplines and the phenomenal growth in corpus-based studies since the late 1980s, very little has been published on the intersection of these two broad areas of scholarship. Much recent work in conceptual history continues to rely on the close textual analysis of a relatively limited set of mainly print resources, for instance to chart the evolution of genius in eighteenth-century Britain (Townsend 2019), or the process by which Persian jins/genus came to mean ‘sex’ (Najmabadi 2013). Such work could greatly benefit from the application of corpus techniques, if resources for the analysis of concepts were easily accessible. However, the construction of most available corpora in fields as varied as linguistics, translation studies and public health has been based on criteria such as genre, register variation or medium (mainly spoken vs written). Other popular compilation criteria include setting (e.g. ECPC corpus of European Chambers texts; Calzada Pérez 2017), authorship, gender (e.g. the Women Writers Online corpus), or broad areas of practice such as medicine or law. The problem with using such resources for conceptual analysis is that the key concepts that shape and frame human experience travel across registers, media, settings and genres. In addition, most diachronic and historical corpora compiled to date, like the Corpus of Early Modern English Medical Texts and the Old Bailey Corpus, tend not to incorporate the multilingual and translational perspective necessary to capture processes of language contact and change. Thus, while offering valuable resources within specific disciplinary perspectives, most existing corpora do not readily support studies on the evolution or contestation of key concepts in social and political life, which require access to corpora designed primarily with thematic criteria in mind.

What are thematic corpora? How should they be built, and what kind of research do they facilitate? In line with the remit of the Genealogies of Knowledge (GoK) Project and Research Network, this event aimed to stimulate interest in corpus-based conceptual analysis, particularly in relation to translation and other forms of mediation. The GoK corpora are being compiled with the specific aim of capturing the evolution and contestation of keywords pertaining to the body politic and to the domain of scientific expertise. They are designed to be used across the humanities, and to inspire complementary efforts involving other languages and knowledge domains. This webinar featured contributions by Felix Berenskoetter (SOAS University of London) and Alison Sealey (University of Lancaster) to the theoretical or methodological dimensions of this research agenda, complemented with case studies by Henry Jones (Aston University), Jan Buts (Trinity College Dublin) and Luis Pérez-González (University of Manchester) that demonstrate the theory and methodology in action.


Recordings of all of the talks presented at the webinar can be viewed here. For colleagues who cannot access YouTube, alternative versions can be found here.

Felix Berenskoetter (SOAS University of London)

The politics, and the political, of concept analysis

Abstract: While the political nature (or quality) of concepts is readily acknowledged and central to, for instance, Koselleck’s notion of ‘basic concepts’, how can we capture this nature (or quality)? And how do we deal with the implications? Taking these questions seriously and addressing their reflexive and analytical dimensions poses significant challenges for scholars. My talk tries to find a way through this challenge in two steps. First, I attempt to capture the political nature of a concept by thinking systematically about how concepts exercise power, which requires dealing with the fact that power itself is a basic concept. Second, I address the reflexive challenge by asking whether it actually is possible to trace, compare, and translate a concept without participating in the shaping of the ‘terms of political discourse’ (Connolly). If we cannot escape this, it follows that all concept analysis, regardless of its approach and scope, is also an act of politics. Taken together, this leads me to call for greater attention to the sliding tension between treating a concept as a category of analysis, as an object of analysis, and as a category of political practice.

Speaker bio: Felix Berenskötter is Head of Department and Senior Lecturer in International Relations at SOAS University of London. His research interests include theorizing world politics through concepts of identity, friendship, security and power. Felix has published widely on these topics, including the edited volume ‘Concepts in World Politics’ (Sage 2017), and is currently writing a book on friendship and estrangement in international relations.

Alison Sealey (University of Lancaster)

Constructing a thematic corpus: methodological challenges

Abstract: As this event attests, ‘the key concepts that shape and frame human experience travel across registers, media, settings and genres’. Therefore, the compilation of a thematic corpus presents particular methodological challenges. To illustrate some of these, my talk will invite consideration of how non-human organisms are referenced in human language, and how this reflects the ways they feature in human experience and perception. I will draw on my experience as Co-Investigator on the project ‘People’, ‘Products’, ‘Pests’ and ‘Pets’: the discursive representation of animals (funded by the Leverhulme Trust 2013-2017), for which one of the data strands entailed the compilation of a digital corpus of language about animals. Among the challenges this raised was the identification of sources of texts from different genres that feature ‘animals’, which meant that our approach necessarily contrasted with that of many corpora compiled for discourse analysis, as these tend to use selection criteria such as genre, authorship or setting. Furthermore, while explorations of particular topics typically use a predetermined set of core linguistic items to identify texts for inclusion in the corpus, an issue we confronted was that the concepts denoted by terms such as ‘animal’ and ‘species’ are themselves part of the enquiry we embarked on. In my talk, I will explain how we set about selecting texts for inclusion in the corpus, and how we sought to avoid the potential circularity inherent in the enterprise.

Speaker bio: Alison Sealey is Professor Emerita of Applied Linguistics at Lancaster University, UK. She has published extensively on a wide range of subjects, with an emphasis on the role of discourse in representations of the social world. Among her recent publications are ‘Translation: a biosemiotic/more-than-human perspective’ (Target. International Journal of Translation Studies 31/3); ‘Animals, animacy and anthropocentrism’ (International Journal of Language and Culture 5/2); ‘“What do animals mean to you?”: naming and relating to non-human animals’ (with Nickie Charles, Anthrozoos 26/4); ‘First catch your corpus: methodological challenges in constructing a thematic corpus’ (with Chris Pak, Corpora 13/2); ‘The Discursive Representation of Animals’ (with Guy Cook, The Routledge Handbook of Ecolinguistics). She continues to research the implications of the ways we talk about the non-humans with whom we share the planet.

Henry Jones (Aston University) & Jan Buts (Trinity College Dublin)

Non-translation of political concepts in online activist discourse

Abstract: Genealogies of Knowledge was a four-year AHRC-funded project, based at the University of Manchester, UK (2016-2020). This case study reports on research exploring the project’s Internet Corpus, a thematically curated collection of English-language activist blogs, digital magazines and other online sources built to facilitate the investigation of two interconnected constellations of key political and scientific concepts (including democracy, citizenship, expertise and evidence) and the ways in which these are being contested in the internet age. The study focuses on the use of untranslated loanwords, particularly the political loanword demos, borrowed into English through Latin from the classical Greek δῆμος, asking not only who employs this word and how they deploy it, but also with what purpose they do so. Some specific uses of demos can be linked to left-wing activist discourse, and even in this restricted context the corpus data reveal intriguing discrepancies in the translations offered by different authors as glosses for demos. We also find significant differences in the meanings attributed to this word in English in comparison with how it appears to have been understood in its original Greek context. Finally, we argue that the primary rationale for this lexical choice can be linked with ongoing attempts to open up a new interpretative space for radical reinterrogations and retranslations of the concept of democracy in and for the contemporary public sphere.

Speaker bios: Henry Jones is a lecturer in translation and intercultural studies at Aston University, UK. He is a co-coordinator of the Genealogies of Knowledge Research Network and co-editor of the Routledge Encyclopedia of Citizen Media. His current research interests include corpus-based translation studies, translation history, media theory and online translating communities.

Jan Buts is a postdoctoral researcher attached to the QuantiQual project at Trinity College Dublin, and a co-coordinator of the Genealogies of Knowledge Research Network. He works at the intersection of translation theory, conceptual history, corpus linguistics, and online media.

Luis Pérez-González (University of Manchester)

Credentialed vs. Grounded Expertise in the Climate Change Blogosphere: Drawing on Thematic Corpora for the Study of an Ongoing Epistemic Shift

Abstract: Unlike general corpora compiled to facilitate the study of linguistic structures or genre-specific discursive conventions, the Genealogies of Knowledge Internet corpus is intended to yield insight into discourses of confrontation and resistance mobilised by key cultural concepts in digital culture. This presentation reports on a study of how established forms of scientific expertise are challenged in the Anglophone climate science blogosphere, drawing on the Climate Science Blogger Corpus (CSBC), a subsection of the Genealogies of Knowledge Internet corpus. Consisting of blog posts written by journalists, academics, activists, activist scientists and lobbyists, texts included in CSBC are representative of the increasing politicisation and polarisation around climate science, a site of controversy where epistemological discussions on the quality of the science are difficult to separate from questions of scientific knowledge construction. The presentation focuses on two methodological aspects of the study. First, it accounts for the adoption of Studies of Experience and Expertise as the theoretical framework driving the selection of publication outlets and authors, and ensuring that the complexity of views on scientific expertise at the heart of the climate change blogosphere is represented in CSBC. Second, it illustrates how the systemic-functional notion of dialogic engagement enables the identification of lexical markers of confrontation between holders of established forms of expertise and those advocating new epistemic frameworks of environmental governance.

Speaker bio: Luis Pérez-González is Professor of Translation Studies and Co-Director of the Centre for Translation and Intercultural Studies at the University of Manchester, UK. Between 2016 and April 2020, he was a member of the team working on the AHRC-funded project entitled Genealogies of Knowledge: The Evolution and Contestation of Concepts across Time and Space. The study that this presentation reports on was published on 15 September as part of an open-access special issue of Humanities and Social Sciences Communications that can be accessed here.


Calzada Pérez, M. (2017) ‘Corpus-based Methods for Comparative Translation and Interpreting Studies’, Translation and Interpreting Studies 12(2): 231-252.

Najmabadi, A. (2013) ‘Genus of Sex or the Sexing of Jins’, International Journal of Middle East Studies 45: 211-231.

Townsend, D. (2019) ‘On Genius: The development of a philosophical concept of genius in eighteenth-century Britain’, Journal of the History of Ideas 80(4). Available at