Genealogies of Knowledge

Corpus text preparation

Documentation on preparing texts for upload to the corpus can be downloaded from the links below:

  • Instructions on preparing texts for the ancient Greek, Latin and medieval Arabic subcorpora
  • Instructions on preparing texts for the Modern English subcorpus
  • Instructions on preparing texts for the Internet English subcorpus
  • Instructions on preparing texts for the Modern Arabic subcorpus
  • Instructions on using regular expressions

To view the latest versions of the .dtd files used to annotate the corpus texts, click on the following links:

  • goktext.dtd (19 June 2017)
  • gokheader.dtd (29 Nov. 2018)

© 2023 Genealogies of Knowledge