Feed Me
Tue, 2005-06-07 16:55
What Is It?
Feed Me is a Bayesian filtering system for RSS news feeds.
What Does It Do?
RSS news feeds are great, but inevitably one ends up subscribing to too many of them, and drowning in a sea of unread articles. So if you've got 400 unread articles, and only time to read 10, what do you do?
Most news readers will allow you to sort unread articles alphabetically, or by date, which might let you get the 10 newest articles from your favourite feed.
That's not really what you want, though. What you really want is to pull out the 10 articles that you're most likely to find interesting, from all of the feeds you are subscribed to.
This task is what Feed Me attempts to accomplish, by allowing you to rate articles as good or bad, and then helping you to find more of the good ones.
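To make the idea concrete, here is a minimal sketch of rating articles and then ranking unread ones. It is not Feed Me's actual code: the ArticleRanker class, its method names and the simple word-count naive Bayes scoring are all assumptions made for illustration, and it ignores refinements such as class priors.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class ArticleRanker:
    """Toy two-class naive Bayes scorer (hypothetical, not Feed Me's code)."""

    def __init__(self):
        # Word counts for each rating the reader can give.
        self.counts = {"good": Counter(), "bad": Counter()}

    def rate(self, text, label):
        """Record the words of an article the reader rated 'good' or 'bad'."""
        self.counts[label].update(tokenize(text))

    def score(self, text):
        """Return the log-odds that an article is 'good' (higher is better)."""
        vocab = len(set(self.counts["good"]) | set(self.counts["bad"])) + 1
        good_total = sum(self.counts["good"].values())
        bad_total = sum(self.counts["bad"].values())
        log_odds = 0.0
        for word in tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing a probability.
            p_good = (self.counts["good"][word] + 1) / (good_total + vocab)
            p_bad = (self.counts["bad"][word] + 1) / (bad_total + vocab)
            log_odds += math.log(p_good / p_bad)
        return log_odds

    def top(self, articles, n=10):
        """Return the n unread articles with the highest scores."""
        return sorted(articles, key=self.score, reverse=True)[:n]


ranker = ArticleRanker()
ranker.rate("python bayesian filtering library", "good")
ranker.rate("celebrity gossip fashion news", "bad")
unread = ["celebrity fashion roundup", "new python filtering tricks", "weather"]
print(ranker.top(unread, n=1))
```

With only two rated articles the ranking is crude, but the shape is the same as the real problem: every rating adds evidence, and the unread pile is sorted by that evidence rather than by date or title.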
How Is It Implemented?
It is implemented in Python, as a CGI.
Where Can I Get It?
The project is in an early stage of development, and isn't ready for prime time, but you can look at the source code if you want to (user=guest, password=guest).
All comments, suggestions and code submissions are welcome!
Bayesian Filtering With Python
Tue, 2005-06-07 11:20
I've been playing around with Bayesian filtering and Python.
One of the results is this little Python library which provides a very simple wrapper for the CRM114 Discriminator.
I've also written a few additional notes on how to build CRM114 for MacOS X.
- Sam Deane's blog
crm.py
Mon, 2005-06-06 17:48
This is a simple Python wrapper class for the CRM114 Discriminator (which does Bayesian filtering, amongst many other things).
It requires the crm command to be installed and in your command path. You can download crm here, and I’ve written some simple build instructions which may help, particularly if you are trying to build it on MacOS X.
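A quick way to confirm that crm is visible on the command path before using the wrapper is sketched below. This is not part of crm.py: the find_crm helper is a hypothetical name, and it relies only on the standard library's shutil.which (Python 3.3 or later).

```python
import shutil


def find_crm(command="crm"):
    """Return the full path to the crm executable, or None if it is not on the PATH."""
    return shutil.which(command)


path = find_crm()
if path is None:
    print("crm not found; install CRM114 and add it to your command path")
else:
    print("found crm at", path)
```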
The latest version of this file can be obtained from the Elegant Chaos subversion server (user=guest, pass=guest) at http://source.elegantchaos.com/projects/com/elegantchaos/libraries/python/crm.py.
The module provides a very simplified interface to crm114. It does not attempt to expose all of crm114’s power; instead, it tries to hide almost all of the gory details.
To use the module, create an instance of the Classifier class, giving it a path (where to store the data files), and a list of category strings (these are the “labels” to classify the text with).
from crm import Classifier  # assumes crm.py is on your Python import path
c = Classifier("/path/to/my/data", ["good", "bad"])
