Genealogies of Knowledge

Corpus text preparation

Documentation on preparing texts for upload to the corpus can be downloaded from the links below:

  • Instructions on preparing texts for the ancient Greek, Latin and medieval Arabic subcorpora
  • Instructions on preparing texts for the Modern English subcorpus
  • Instructions on preparing texts for the Internet English subcorpus
  • Instructions on preparing texts for the Modern Arabic subcorpus
  • Instructions on using regular expressions

To view the latest versions of the .dtd (Document Type Definition) files used to annotate the corpus texts, click on the following links:

  • goktext.dtd (19 June 2017)
  • gokheader.dtd (29 Nov. 2018)

© 2023 Genealogies of Knowledge