Corpus design


In terms of time span, the Genealogies corpus is designed to allow researchers to examine the following processes, within specific historical and spatial locations:

  • the mediation of Greek thought through translations into and commentaries in Arabic, from the eighth through to the tenth century;
  • its renegotiation via translations into and commentaries in Latin, either directly or via Arabic, during two periods: the Classical period, including Late Antiquity, c.1st century BC to 6th century AD; and the Medieval translation movement from the thirteenth century onwards;
  • the renegotiation of the concepts under study in translations of key texts into English in the late nineteenth century and throughout the twentieth and early twenty-first centuries;
  • the ongoing renegotiation of the concepts in question by civil society organisations and actors in the twenty-first century, particularly on the Internet.

The Genealogies corpus consists of a number of subcorpora:

  • a corpus of Greek source texts (c.5th century BC to c.2nd/3rd century AD) which features authors such as Galen, Hippocrates, Plato, Aristotle and Isocrates;
  • a corpus of medieval Arabic consisting of translations of and commentaries on Greek texts, including translations by Hunayn Ibn Ishaq of texts by Hippocrates and Galen, as well as original texts by Al-Farabi, Averroes and Avicenna;
  • a corpus of Latin consisting of original texts by authors such as Cicero and Boethius, translations and retranslations from both Arabic and Greek, as well as commentaries on texts by authors such as Aristotle;
  • a corpus of modern English consisting of translations and retranslations of relevant texts in the nineteenth and throughout the twentieth and twenty-first centuries, primarily from Greek, Latin, French and German. It features several retranslations of works by Greek and Roman authors such as Plato, Thucydides and Plutarch as well as translations of works by more modern authors such as Ludwig Wittgenstein, René Descartes, Karl Popper, Karl Marx, Michel Foucault, Étienne Balibar and Hegel;
  • a corpus of Internet English produced by alternative media and news outlets. Alternative media sites on the left include outlets such as Indymedia, Inter Press Service, Open Democracy and ROAR (Reflections on a Revolution) Magazine, as well as texts from the websites of civil society organisations such as The World Social Forum. Alternative sites on the right include Newsmax and News Rescue. Blogs include I Cite and Bad Science on the left, and Guido Fawkes and Climatedepot on the right. This corpus draws on discourses generated and disseminated by communities that advocate and practise alternative forms of political participation and provide new platforms for the collective revision and construction of knowledge.

Supported by the powerful search and visualisation software tools developed specifically for this project, the corpus is designed to allow the research team, and the research community at large, to trace the development and mutation of key concepts that have become a core part of our academic and public life, and their contestation and renegotiation by civil society today.