Software


The corpus analysis software under development for this project is available for anyone to use. This software connects a modnlp-based concordance browser to the most recent update of the Genealogies corpus.

Note: While the full Genealogies corpus is only available via the downloadable desktop version of the concordance browser, the Modern English and Internet English subcorpora are also available via a newly developed web interface: https://genealogies.mvm.ed.ac.uk/webcli/ (select ‘Gok English’ via Menu > Selection tools). A user manual for the web interface is available here: https://www.shecorpus.net/user-manual-web-interface/

To download and install a copy of the desktop tool, please click one of the following links:

Download installation package (Mac version)

Download installation package (Windows version)

Windows users: To complete the installation process, you will need to unzip the downloaded folder and run the installer by double-clicking the .exe file.

Once the tool has been installed, it should be possible to launch the application in subsequent sessions simply by finding modNLP among the list of programmes installed on your machine.

Mac users: To install and run modNLP on a Mac, you will first need to modify your security permissions. Please note that this may not be possible without admin permissions on your computer.

1) Click on the button above to download the installer and extract the .app file.

2) Open a Terminal window. To do this, you can press cmd+space to open Spotlight and then type Terminal.

3) Copy or type out the following command and hit enter. You may be required to type in your password:

sudo spctl --master-disable

4) Next, copy or type out the following command. DO NOT HIT ENTER YET. Drag the modNLP installer icon into the Terminal window (alternatively, type out the path to the installer at the end of the command):

sudo chmod -R 755

For example:

sudo chmod -R 755 /Users/me/Downloads/modNLP.app

5) The downloaded app should now be ready to install. Drag it into the Applications folder.

6) Open the Applications folder and click on the installed modNLP app. A box may pop up saying the software is damaged. Don’t worry: the application just needs further permissions.

7) Open a terminal (see step 2) and copy or type out the following command. DO NOT HIT ENTER YET. Drag the modNLP app from the Applications folder into the terminal window (alternatively type out the path to the modNLP.app at the end of the command)

xattr -cr

For example:

xattr -cr /Applications/modNLP.app

8) The app should now run.

9) Pin the downloaded app to your launcher for easy access in subsequent sessions.
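
The two path-dependent commands in steps 4 and 7 can be sketched as a small shell helper. This is a minimal sketch and not part of modNLP: print_modnlp_commands is a hypothetical function name, and the two paths are simply the example locations used above. It only prints the commands for review rather than running them, and it wraps each path in single quotes, which matters if you type out a path containing spaces instead of dragging the icon into the Terminal window.

```shell
#!/bin/sh
# Sketch only: prints the commands from steps 4 and 7 so they can be
# reviewed and then run by hand. Nothing is executed with sudo here.
# print_modnlp_commands is a hypothetical helper, not part of modNLP.

print_modnlp_commands() {
  download_path="$1"   # where the extracted .app currently sits (step 4)
  installed_path="$2"  # where the .app sits after step 5 (step 7)

  # Single quotes around %s keep a path with spaces as one argument.
  printf "sudo chmod -R 755 '%s'\n" "$download_path"
  printf "xattr -cr '%s'\n" "$installed_path"
}

# Example call using the assumed locations from the steps above.
print_modnlp_commands "/Users/me/Downloads/modNLP.app" "/Applications/modNLP.app"
```

Copy each printed line into Terminal when you reach the corresponding step. A path that itself contains a single quote would need different quoting.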

SourceForge

Additionally, the software code and plugins are available for download at: https://sourceforge.net/projects/modnlp/

Should you encounter any software bugs or other technical problems when using these tools, please create a ticket detailing the nature of the issue on our SourceForge project page: https://sourceforge.net/p/modnlp/tickets/

MODNLP: Modular Suite of NLP Tools

modnlp aims to provide a modular architecture and tools for natural language processing written (mainly) in Java. These tools are being developed in connection with the Genealogies of Knowledge project.

The following modnlp modules are currently available:

idx: an API and tools for (inverted) indexing, storage and retrieval of large amounts of text, with (XML-based) handling of meta-data.
tc: an API and tools for text categorisation, including functionality for XML parsing, term set reduction (and basic keyword extraction), probabilistic classifier induction, two sample classification tools, and evaluation modules.
tec-tools (v2): consisting of tec-server, a corpus indexer and server for corpus access and analysis over the web, and tec-client, a corpus analysis client. Unlike the (now obsolete) version 1 of these tools, which was originally developed for the TEC project and written in Perl, C (server side) and Java, the version on this site (v2) is written entirely in Java.

This new version of the tools forms the basis of software support for text analysis and visualisation in the Genealogies of Knowledge project.

The modnlp/tec tools have also been used by the European Parliamentary Comparable and Parallel Corpora project (ECPC), coordinated by Dr. Calzada Pérez (Universitat Jaume I, Spain), and by the Translational English Corpus. The latter has been collected and maintained under Prof Mona Baker’s supervision at the University of Manchester, and is made available on the Internet through the Genealogies of Knowledge project website in a collaboration between The University of Edinburgh and The University of Manchester.

Also available is the documentation of the modnlp suite (for developers).