Text classification software (link)

(1. Pre-Processing -> 2. Dictionary -> 3. Data Processing -> 4. Learning a model)

Dictionary

Arguments: path to the data, name of the output dictionary file, name of the SP file (see the processing step), and filter limit (the minimum number of times a word must appear to be kept in the dictionary).

Usage:

 java -jar bin/classif2012_createDico.jar -path2data=data/book/ -dicoFile=dico.txt -spFile=SPbasic.sp -nFilter=1
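
To illustrate what the filter limit does, here is a minimal Python sketch. It is not the tool's actual implementation: the function name build_dictionary, the lowercase whitespace tokenization, and the "at least n_filter occurrences" rule are assumptions made for this example.

```python
from collections import Counter


def build_dictionary(documents, n_filter=1):
    """Return the sorted words that occur at least n_filter times overall."""
    counts = Counter()
    for text in documents:
        # Assumption: lowercase + whitespace tokenization; the real tool may differ.
        counts.update(text.lower().split())
    # Assumption: a word is kept when its total count reaches n_filter.
    return sorted(word for word, count in counts.items() if count >= n_filter)


docs = ["the cat sat", "the dog sat", "a bird"]
vocab_all = build_dictionary(docs, n_filter=1)       # every word is kept
vocab_frequent = build_dictionary(docs, n_filter=2)  # only repeated words are kept
```

Under these assumptions, a filter limit of 1 (as in the command above) keeps every word found in the data, while higher values drop rare words and shrink the dictionary.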

See the download section.

The bash script script2_createDico.sh shows an example of this usage.