Nice logo, eh? It was created by Liz Manicatide, a very nice artist friend, as a
commissioned work; she's a hired-gun artist. You can hire her for
artistic, web, and user-interface work as well.
CRM114 is a system to examine incoming e-mail, system log
streams, data files or other data streams, and to sort, filter, or
alter the incoming files or data streams according to the user's
wildest desires. Criteria for categorization of data can be via
a host of methods, including regexes, approximate regexes, a
Hidden Markov Model, Orthogonal Sparse Bigrams, WINNOW, Correlation,
KNN/Hyperspace, or Bit Entropy (or by other means; it's all programmable).
Accuracy has been seen in excess of 99.9 per cent. In other words,
CRM114 learns, and it learns fast.
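To give a feel for one of the methods named above, here is a rough Python sketch of how Orthogonal Sparse Bigram features are commonly described: each token is paired with each of the few tokens before it, and the gap between them is recorded. This is an illustration only, not CRM114's own code; the function name `osb_features`, the whitespace tokenizer, the window of 5, and the tuple layout are all assumptions made for the example.

```python
import re


def osb_features(text, window=5):
    """Return (earlier_token, skipped_count, newest_token) tuples.

    Assumption: tokens are runs of non-whitespace, and each token is
    paired with up to window - 1 tokens that precede it.
    """
    tokens = re.findall(r"\S+", text)
    features = []
    for i, newest in enumerate(tokens):
        for gap in range(1, window):
            j = i - gap
            if j < 0:
                break  # ran off the start of the text
            # gap - 1 is how many tokens were skipped between the pair
            features.append((tokens[j], gap - 1, newest))
    return features


if __name__ == "__main__":
    for feature in osb_features("buy cheap pills now"):
        print(feature)
```

A classifier would then count or hash these pairs per category; that part is left out here because the details differ from one classifier to the next.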
People have been able to run CRM114 on Linux, BSD, Mac OS X, and Windows (natively and with Cygwin), and it has even been integrated with Microsoft Outlook and QUALCOMM Eudora. See the "Cool Things" link below for details. I can't help with any of these except Linux, though if you ask on the mailing list, someone might be able to assist you.
Not every user gets great results with the default classifier; that's why CRM114 has several different classifiers available. It's easy to switch classifiers and run a script to see what the tradeoffs are in terms of speed, accuracy, disk space, rate of learning, etc.
You can get at all of these exciting interconnects (including
the Outlook macros) under "Cool Things" in the wiki.
CRM114 is licensed under the GPLv2; it is WITHOUT WARRANTY of ANY KIND, and although it is now in production on many sites, it will always be in perpetual BETA because the primary mission (antispam) is chasing a moving, actively evading target.
Use at your own risk, and send me bug reports! Or even better, send me improvements! If your code is substantial, I prefer to dual-license the code (i.e. we both get full rights to it, including the right to reuse and relicense under other licenses).