crm.py
Mon, 2005-06-06 17:48
This is a simple Python wrapper class for the CRM114 Discriminator (which does Bayesian filtering, amongst many other things).
The module requires the crm command to be installed and in your command path. You can download crm here, and I’ve written some simple build instructions which may help, particularly if you are trying to build it on Mac OS X.
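A quick way to confirm that requirement before going any further is to ask Python whether crm can be found on the path. This is only a sketch: the helper name crm_available is made up for illustration and is not part of crm.py, and shutil.which needs Python 3.3 or later.

```python
import shutil


def crm_available(command="crm"):
    """Return True if `command` resolves to an executable on the PATH.

    Hypothetical helper for illustration; crm.py itself does not provide it.
    """
    return shutil.which(command) is not None


if not crm_available():
    print("crm not found on the command path; install CRM114 first")
```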
The latest version of this file can be obtained from the Elegant Chaos subversion server (user=guest, pass=guest) at http://source.elegantchaos.com/projects/com/elegantchaos/libraries/python/crm.py.
The module provides a very simplified interface to crm114. It does not attempt to expose all of crm114’s power; instead, it tries to hide almost all of the gory details.
To use the module, create an instance of the Classifier class, giving it a path (where to store the data files), and a list of category strings (these are the “labels” to classify the text with).
c = Classifier("/path/to/my/data", ["good", "bad"])
To teach the classifier object about some text, call the learn method, passing in a category (one of the ones that you provided originally) and the text.
c.learn("good", "some good text")
c.learn("bad", "some bad text")
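If you have more than a couple of samples, it may be easier to loop over (category, text) pairs. The sketch below assumes nothing beyond the learn(category, text) call shown above. The train helper and the RecordingClassifier stand-in are hypothetical names used only so the snippet runs without crm114 installed; with the real module you would pass in your Classifier instance instead.

```python
def train(classifier, samples):
    """Feed (category, text) pairs to any object with a learn(category, text) method.

    Returns the number of samples passed on.
    """
    count = 0
    for category, text in samples:
        classifier.learn(category, text)
        count += 1
    return count


class RecordingClassifier:
    """Hypothetical stand-in that only records calls; it is NOT how crm.py works."""

    def __init__(self):
        self.calls = []

    def learn(self, category, text):
        self.calls.append((category, text))


samples = [
    ("good", "some good text"),
    ("bad", "some bad text"),
]

stub = RecordingClassifier()
trained = train(stub, samples)
```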
To find out what the classifier thinks about some text, call the classify method, passing in the text. The result of this method is a pair: the first item is the category that best matches the text, and the second item is the probability of the match.
(classification, probability) = c.classify("some text")
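Because the result carries a probability, you can choose to ignore weak matches. Here is a minimal sketch of that idea, assuming the probability is a number between 0 and 1 (or something float() can convert). The function name label_or_unsure, the 0.9 threshold and the "unsure" fallback are illustrative choices, not part of the module.

```python
def label_or_unsure(result, threshold=0.9, fallback="unsure"):
    """Return the category from a (classification, probability) pair,
    or `fallback` when the probability is below `threshold`.

    Assumes the probability lies between 0 and 1; this is an assumption,
    not something the module documents.
    """
    classification, probability = result
    if float(probability) >= threshold:
        return classification
    return fallback


# With the real module this would be: label_or_unsure(c.classify("some text"))
confident = label_or_unsure(("good", 0.97))
doubtful = label_or_unsure(("bad", 0.55))
```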
