MODNLP/TC: an API and tools for text categorisation

modnlp/tc is an API and a set of tools for text categorisation, including functionality for XML parsing, term set reduction (and basic keyword extraction), probabilistic classifier induction, two sample classification tools, and evaluation modules. The software is distributed under the GNU General Public License and is fully compatible with GNU Classpath. It has been tested on a number of JVMs, including kaffe (v1.1.5), sablevm (v1.1.6), jamvm (v1.3) and JDK 1.4+. The functionality supported by the API includes:

