Genealogies of Knowledge Corpus
The Genealogies of Knowledge corpus is designed to enable researchers to trace the trajectory of key concepts as they enter different cultural and temporal spaces, predominantly but not exclusively through the mediation of various forms of translation.
The current focus of the project is on three historical lingua francas (Arabic, Latin and English) and on concepts relating to the body politic and to scientific, expert discourse. However, the research team is developing a set of resources and a range of methodologies that can support future studies involving different lingua francas (French being an obvious choice in the European context), different historical moments (perhaps the Enlightenment), and different constellations of concepts. To this end, both the corpora we are building and the software being developed to interrogate them and visualise the findings are being made accessible to the research community, to support other types of study.
Because of copyright constraints, we offer restricted access to the corpora: we aim to allow visitors to the site to run searches, expand individual concordance lines within the limits of fair use (300 words), and download the findings. We are, however, unable to offer full access to individual texts.
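To illustrate what running a search and expanding a concordance line within a word limit can look like, here is a minimal Python sketch. It is not the project's own software: the function names `concordance` and `expand`, the whitespace tokenisation, and the way the 300-word cap is applied are all assumptions made for illustration only.

```python
import re

# Cap taken from the text above; how the site actually enforces it is an assumption.
FAIR_USE_WORD_LIMIT = 300


def concordance(text, keyword, context_words=5):
    """Return (position, left, match, right) tuples for each hit of keyword."""
    tokens = text.split()
    hits = []
    for i, token in enumerate(tokens):
        # Compare case-insensitively, ignoring leading/trailing punctuation.
        if re.sub(r"^\W+|\W+$", "", token).lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - context_words):i])
            right = " ".join(tokens[i + 1:i + 1 + context_words])
            hits.append((i, left, token, right))
    return hits


def expand(text, position, requested_words):
    """Return a wider window around a hit, never longer than the cap."""
    tokens = text.split()
    total = min(requested_words, FAIR_USE_WORD_LIMIT)
    half = total // 2
    start = max(0, position - half)
    end = min(len(tokens), start + total)
    return " ".join(tokens[start:end])
```

A hypothetical usage: `concordance(text, "politic", context_words=2)` lists each hit with two words of context on either side, and `expand(text, hits[0][0], 1000)` returns at most 300 words around the first hit, however many are requested.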
Content of the Corpus
Temporally, the corpus is designed to allow the research team to examine the following processes, within specific historical and spatial locations:
- the mediation of Greek thought through translations into and commentaries in Arabic, from the eighth through to the tenth century;
- its renegotiation via translations into and commentaries in Latin, either directly or via Arabic, in the eleventh, twelfth and thirteenth centuries;
- the renegotiation of the key concepts under study in translations of key texts into English in the late nineteenth century and throughout the twentieth and early twenty-first centuries;
- the ongoing renegotiation of the concepts in question by civil society organisations and actors in the twenty-first century, particularly on the Internet.
The corpus is broadly divided into three sections, or subcorpora:
Premodern subcorpus:
- a corpus of Greek source texts;
- translations into medieval Arabic;
- medieval Arabic commentaries on Greek texts;
- translations and retranslations into Latin, from both Arabic and Greek;
- Latin commentaries on Greek texts.
Modern subcorpus:
- translations and retranslations of relevant texts into English in the nineteenth and throughout the twentieth and twenty-first centuries, primarily from Greek, Latin, French and German;
Internet subcorpus:
- Internet discourse in English produced by alternative media and news outlets, such as Indymedia, Inter Press Service, Open Democracy and ROAR (Reflections on a Revolution) Magazine, as well as civil society organisations such as The World Social Forum. This corpus draws on discourses generated and disseminated by communities that advocate and practise alternative forms of political participation and provide new platforms for the collective revision and construction of knowledge.
Supported by the powerful search and visualisation software tools developed specifically for this project, the corpus is designed to allow the research team, and the research community at large, to trace the development and mutation of key concepts that have become a core part of our academic and public life, and their contestation and renegotiation by civil society today.
Using the corpus
Guidance on how to use the corpus can be accessed here:
- Guidance on using the Premodern corpus
- Guidance on using the Modern English corpus
- Guidance on using the Internet corpus
Corpus text preparation
Documentation on the process of preparing texts for uploading to the corpus can be downloaded by clicking on the links below:
- Instructions on preparing texts for the Premodern subcorpus
- Instructions on preparing texts for the Modern subcorpus
- Instructions on preparing texts for the Internet subcorpus
- Instructions on using Regular Expressions
To view the latest version of the .dtd files used to annotate the corpus texts, please click on the following links: