Text classification software

(1. Pre-Processing -> 2. Dictionary -> 3. Data Processing -> 4. Learning a model)

Pre-processing: builds a .sp file, which describes a processing chain.

e.g.:
[ORIGINAL] Bob is in the kitchen. I don't care at all!!!
[AFTER PROCESSING] Bob_NP Bob_NP_be_VBZ be_VBZ be_VBZ_in_IN ...
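
The processed output in the example interleaves lemma_TAG unigrams with the bigram that starts at each token. A minimal Python sketch of that pattern follows. It is not the software's own code: the function name `unigrams_and_bigrams` is hypothetical, and the (lemma, tag) pairs are written by hand, whereas a real processing chain would presumably obtain them from a lemmatizer and part-of-speech tagger.

```python
def unigrams_and_bigrams(tagged):
    """Interleave lemma_TAG unigrams with the bigram starting at each token."""
    units = [f"{lemma}_{tag}" for lemma, tag in tagged]
    features = []
    for i, unit in enumerate(units):
        features.append(unit)  # unigram, e.g. Bob_NP
        if i + 1 < len(units):
            # bigram made of this token and the next one, e.g. Bob_NP_be_VBZ
            features.append(f"{unit}_{units[i + 1]}")
    return features


# Hand-written (lemma, tag) pairs for "Bob is in" (assumption: no tagger is run here).
tagged = [("Bob", "NP"), ("be", "VBZ"), ("in", "IN")]
features = unigrams_and_bigrams(tagged)
print(" ".join(features))
```

The first four features produced this way match the excerpt above; what the real chain emits after that point is elided in the example, so the rest is only a guess at the pattern.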

Dictionary computation: builds a dictionary from a .sp file and a corpus of documents.
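
As an illustration of what a dictionary step can look like, the Python sketch below maps each token of an already processed corpus to an integer index. The function name `build_dictionary` and the `min_count` threshold are assumptions made for the example; the actual dictionary format used by the software is not described here.

```python
from collections import Counter


def build_dictionary(documents, min_count=1):
    """Map every token seen at least min_count times to a stable integer index."""
    counts = Counter(token for doc in documents for token in doc)
    kept = sorted(token for token, count in counts.items() if count >= min_count)
    return {token: index for index, token in enumerate(kept)}


# Toy corpus: each document is a list of processed tokens (hypothetical data).
corpus = [["Bob_NP", "be_VBZ"], ["be_VBZ", "kitchen_NN"]]
dictionary = build_dictionary(corpus)
frequent_only = build_dictionary(corpus, min_count=2)
```

Sorting the tokens before numbering them keeps the indices reproducible from one run to the next.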

Data processing: builds a numerical corpus from a .sp file, a dictionary and a corpus of documents. Each document is projected onto the dictionary to produce a bag-of-words representation.
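
The projection of a document onto a dictionary can be sketched in Python as follows. This is an assumption about the general technique (sparse bag-of-words counts), not the software's file format; `to_bag_of_words` and the toy dictionary are hypothetical.

```python
from collections import Counter


def to_bag_of_words(tokens, dictionary):
    """Count occurrences of each known token, keyed by its dictionary index."""
    counts = Counter(dictionary[t] for t in tokens if t in dictionary)
    return dict(sorted(counts.items()))  # sparse vector: {index: count}


# Hypothetical dictionary and processed document.
dictionary = {"Bob_NP": 0, "be_VBZ": 1, "kitchen_NN": 2}
document = ["be_VBZ", "Bob_NP", "be_VBZ", "unknown_XX"]
vector = to_bag_of_words(document, dictionary)
```

Tokens missing from the dictionary are simply dropped in this sketch, which is one common choice among several.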

Learning step: learns (and saves) a model associated with a labeled set.
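
The page does not say which learning algorithm the software uses. Purely to illustrate the idea of learning a model from a labeled set and saving it, the Python sketch below swaps in a simple perceptron over sparse bag-of-words vectors and stores it with `pickle`. All names (`train_perceptron`, `predict`, `model.pkl`) and the toy data are assumptions.

```python
import os
import pickle
import tempfile


def train_perceptron(examples, epochs=10):
    """Train on (sparse_vector, label) pairs, with label in {-1, +1}."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vector, label in examples:
            score = bias + sum(weights.get(i, 0.0) * v for i, v in vector.items())
            if label * score <= 0:  # misclassified or on the boundary: update
                for i, v in vector.items():
                    weights[i] = weights.get(i, 0.0) + label * v
                bias += label
    return {"weights": weights, "bias": bias}


def predict(model, vector):
    """Return +1 or -1 according to the sign of the linear score."""
    score = model["bias"] + sum(
        model["weights"].get(i, 0.0) * v for i, v in vector.items()
    )
    return 1 if score > 0 else -1


# Toy labeled set: {feature index: count} vectors with labels (hypothetical data).
labeled = [({0: 2, 1: 1}, 1), ({2: 1, 3: 2}, -1), ({0: 1}, 1), ({3: 1}, -1)]
model = train_perceptron(labeled)

# Save the model, then reload it to check that it survives the round trip.
model_path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(model_path, "wb") as handle:
    pickle.dump(model, handle)
with open(model_path, "rb") as handle:
    reloaded = pickle.load(handle)
```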

Testing a model: tests a model in console mode.

Take care

In all of the above programs, existing files are never replaced. If an output file already exists, the associated computation is skipped.
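
The behavior described above can be pictured with the following Python sketch. It only illustrates the "skip when the output exists" rule; the helper `compute_if_missing` and the file name are hypothetical and do not come from the package.

```python
import os
import tempfile


def compute_if_missing(path, compute):
    """Run compute() and write its result to path only when path does not exist."""
    if os.path.exists(path):
        return False  # existing file is left untouched, computation is skipped
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(compute())
    return True


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "dictionary.txt")  # hypothetical output name

first = compute_if_missing(target, lambda: "version 1")   # file is created
second = compute_if_missing(target, lambda: "version 2")  # skipped, file kept

with open(target, encoding="utf-8") as handle:
    content = handle.read()
```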


Download:

package with all binaries & scripts

  1. Download the file and unpack it into a directory
  2. Launch all scripts

sources & lib