Translational English Corpus (TEC)

The Translational English Corpus (TEC) is a corpus of contemporary translational English: it consists of written texts translated into English from a variety of source languages, European and non-European. It was set up and is currently managed by Professor Mona Baker at the Centre for Translation and Intercultural Studies. The custom-made software for processing the corpus, which is downloadable from the web, was designed by Dr. Saturnino Luz, University of Edinburgh, who is also in charge of maintaining the corpus.


What does TEC consist of?

TEC consists of four subcorpora: fiction, biography, news and inflight magazines. The overall size of the corpus is currently around ten million words. It can be accessed freely via the web, using a custom-built concordancer designed by Dr. Saturnino Luz.

[Figure: TEC – Contents]

TEC is meticulously documented in terms of extralinguistic features such as gender, nationality and occupation of the translator, direction of translation, source language, publisher of the translated text, etc. This information is held in a separate header file for each text.
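As a rough illustration of how per-text metadata of this kind can be used to select texts for a study, the sketch below models a header as a small record and filters on its fields. The class name `TextHeader`, its field names, the helper `select_texts` and all sample values are hypothetical and invented for this example; they are not the actual TEC header format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextHeader:
    """Hypothetical per-text metadata record (not the real TEC header layout)."""
    filename: str
    subcorpus: str
    source_language: str
    translator_gender: str
    translator_nationality: str
    publisher: str


def select_texts(headers, **criteria):
    """Return the headers whose fields equal every keyword criterion given."""
    return [
        h for h in headers
        if all(getattr(h, field) == value for field, value in criteria.items())
    ]


# Made-up sample records, for illustration only.
headers = [
    TextHeader("text_a.txt", "fiction", "Arabic", "female", "British", "Publisher A"),
    TextHeader("text_b.txt", "fiction", "German", "male", "American", "Publisher B"),
    TextHeader("text_c.txt", "biography", "Arabic", "male", "British", "Publisher A"),
]

arabic_fiction = select_texts(headers, subcorpus="fiction", source_language="Arabic")
print([h.filename for h in arabic_fiction])
```

Keeping the metadata separate from the text itself, as the header files do, means a selection like this can be made without opening the texts.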

[Figure: TEC – Sample Header File]

The concordancing software is designed to make the information in the header file available to the researcher at a glance.
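For readers unfamiliar with concordancing, the following is a minimal keyword-in-context (KWIC) sketch. It is not the TEC browser's implementation: the function `kwic`, the file names and the sample sentences are assumptions made for illustration. Each hit carries the identifier of the text it came from, which is the kind of link that would let a tool look up that text's header information.

```python
import re


def kwic(texts, keyword, width=30):
    """Yield (text_id, left_context, match, right_context) for each hit.

    `texts` maps a text identifier to its full text; `width` is the number
    of characters of context kept on each side of the keyword.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for text_id, text in texts.items():
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            yield text_id, left, m.group(0), right


# Invented sample texts, for illustration only.
texts = {
    "text_a.txt": "They spoke of democracy as if it were a distant country.",
    "text_b.txt": "Democracy, he said, was never the point.",
}

for text_id, left, match, right in kwic(texts, "democracy"):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{text_id:12} {left:>30} [{match}] {right}")
```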

[Figure: TEC tree - democracy]

Software tools

The TEC concordancing tool is a corpus browser that uses Java™ Web Start technology. It is freely available online at:

Alternatively, TEC can be accessed via the Genealogies project interface. Having downloaded and launched the corpus browser, go to ‘File’ -> ‘New remote corpus…’ and enter as the IP address of the new corpus server.


What type of research does TEC support?

TEC has supported a broad range of studies in two main areas: the way in which the patterning of translated text might be different from that of non-translated text in the same language, and stylistic variation across individual translators.
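One simple way to compare the patterning of two sets of texts is the type-token ratio (distinct word forms divided by total words), a common measure of lexical variety in corpus work. The sketch below is only an illustration under stated assumptions: the tokenizer is deliberately crude, the two sample strings are invented, and nothing here is taken from the TEC software or from any particular study.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Return distinct word forms divided by total tokens (0.0 if empty)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


# Invented sample strings, for illustration only.
translated = "the man went to the house and the man saw the house"
non_translated = "a stranger wandered toward the cottage and glimpsed its roof"

print(round(type_token_ratio(translated), 2))
print(round(type_token_ratio(non_translated), 2))
```

Because the ratio falls as texts get longer, comparisons are usually made on samples of equal length.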