Genealogies of Knowledge

Corpus text preparation

Documentation on preparing texts for upload to the corpus can be downloaded from the links below:

  • Instructions on preparing texts for the ancient Greek, Latin and medieval Arabic subcorpora
  • Instructions on preparing texts for the Modern English subcorpus
  • Instructions on preparing texts for the Internet English subcorpus
  • Instructions on preparing texts for the Modern Arabic subcorpus
  • Instructions on using regular expressions

To view the latest versions of the .dtd files used to annotate the corpus texts, use the following links:

  • goktext.dtd (19 June 2017)
  • gokheader.dtd (29 Nov. 2018)



© 2023 Genealogies of Knowledge