


Distributed spam email filtering

A service to society!

USENET has a system -- called Spam Cancel -- where a central location (namely, a specific USENET newsgroup) maintains lists of spam posts sent to various newsgroups. Then, special USENET clients can retrieve the list of "spam cancels" from the central location, and suppress the spam posts.

Exactly how this works under USENET is not necessarily important -- but perhaps a similar sort of system can be set up for emailed spam. The basis of the idea is as follows: users who receive spam emails report them to a central location; the central location verifies that these emails are, in fact, spam, and then publicly posts information that a special mail client or mail server can use to identify each piece of spam and delete it.

Simple spam email takes the form of one message sent to many different email addresses. Each message therefore contains exactly the same text, and filtering these messages is easy.
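As a rough illustration of the exact-match case, here is a minimal Python sketch. It assumes the central location publishes a set of body digests; the names `body_digest`, `known_spam` and `is_known_spam` are made up for this example, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib

def body_digest(body: str) -> str:
    """Hash a message body after collapsing whitespace and lowercasing."""
    normalized = " ".join(body.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

# Hypothetical list of digests published by the central service.
known_spam = {body_digest("Make MONEY fast!\nClick here now.")}

def is_known_spam(body: str) -> bool:
    """True if this body's digest appears on the published list."""
    return body_digest(body) in known_spam
```

A client or server would only need to download the digests, not the spam itself, and compare each incoming message against them.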

More advanced spam mailings contain some unique strings, e.g. a unique URL in each email; however, the bulk of the text will probably remain the same, and will be sufficient for a mail server or client to identify a piece of mail as spam.
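One way to match mail that is mostly, but not exactly, the same is to compare overlapping runs of words. The sketch below is only an illustration of that idea, not a description of any existing filter: it uses three-word "shingles" and Jaccard similarity, and the function names and the shingle size are assumptions made for the example.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of overlapping k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)
```

A message would be treated as a copy of a reported spam when its similarity exceeds some cutoff. Where to put that cutoff is a tuning question; a single changed URL only disturbs the few shingles that contain it, so most of the overlap survives.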

Of course, additional criteria can be used by mail servers and clients to identify spam -- for instance, the lack of a "to" address in the headers.
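The missing "to" address check is easy to sketch with Python's standard `email` module. The function name `lacks_to_header` is invented for this example, and a real filter would presumably treat this as one signal among several rather than as proof.

```python
from email import message_from_string

def lacks_to_header(raw_message: str) -> bool:
    """True if the message has no To header, or only an empty one."""
    msg = message_from_string(raw_message)
    return not (msg.get("To") or "").strip()
```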

If we are willing to sacrifice human supervision, the entire system can be automated: items can be marked as spam automatically once enough users have submitted them, and measures can be taken against spammers flooding the submission system.
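A minimal sketch of the automated version, assuming each report arrives with some identifier for the reporter and a digest of the message. The class name `ReportTally` and the threshold of three are placeholders. Counting distinct reporters means one account resubmitting the same item does not inflate the count, though it does nothing against a spammer who controls many accounts.

```python
from collections import defaultdict

class ReportTally:
    """Marks an item as spam once enough distinct users have reported it."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        # Maps message digest -> set of reporter identifiers.
        self._reporters = defaultdict(set)

    def report(self, digest: str, reporter: str) -> bool:
        """Record one report; return whether the item now counts as spam."""
        self._reporters[digest].add(reporter)
        return self.is_spam(digest)

    def is_spam(self, digest: str) -> bool:
        """True if the number of distinct reporters has reached the threshold."""
        return len(self._reporters.get(digest, ())) >= self.threshold
```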

Of course, compliance from mail clients and/or mail servers will be necessary to make this work. Compliance in mail clients makes more sense to me, because that allows users to choose whether or not to use the spam cancel service. Furthermore, Spam Cancel-enabled clients will (hopefully) allow users to submit pieces of spam to the central anti-spam service, keeping the system going.

illya23b, Jan 06 2001

Mail Abuse Prevention System http://maps.vix.com/
A similar idea implemented at the ISP-level [confusionary, Jan 06 2001, last modified Oct 04 2004]

Mail Abuse Prevention System http://mail-abuse.org/
Baked, in several flavors. [egnor, Jan 06 2001, last modified Oct 04 2004]

Brightmail http://www.brightmail.com/
Brightmail's server-side filtering components talk to each other. [dgeiser13, Jan 06 2001, last modified Oct 04 2004]

Vipul's Razor http://razor.sourceforge.net/
Implements something like this [krelnik, Oct 19 2002, last modified Oct 04 2004]

CloudMark SpamNet http://www.cloudmark.com/
Commercialized version of the Vipul's Razor technology that runs in Outlook. [krelnik, Oct 19 2002, last modified Oct 04 2004]


       Where most of the body of a spam email is the same but a few lines are changed, maybe technology from the anti-virus world could be carried over to this application to determine when there is a "signature match".
mwburden, Jan 07 2001

       Would have to watch for things like people reporting emails as spam in an effort to get someone blocked, kind of a reverse mailbomb.   

       Read a bit of it, and that Brightmail thing looks interesting...
StarChaser, May 27 2001

       I refer you to "Vipul's Razor" and the "Distributed Checksum Clearinghouse". Two _free_ implementations of this very concept.   

       You'll find links to both on Google.
archeus, Feb 27 2002

       Many email clients and servers use Bayesian filtering, which is based on the notion that everyone has a different idea of what is or is not spam. The assumption is that any pre-existing knowledge of spam would risk false positives, destroying desired mail.   

       There are many possible definitions of spam. But there is a perfectly good universal definition of spam: "any content that is sent to addresses that are exclusively acquired by address mining spiders".   

       So the basic technique would be to set up a few new email addresses, and put them in some "spider bait" web pages. The spiders crawl in and bring the goodies back to their nest. The email content gets added to a database of pure spam, to be absorbed by a universal spam filter. The end user would use a combination of the universal filter and the personal filter.   

       One would be tempted to blacklist the senders, but one would need to figure out how to avoid blacklisting spoof victims.   

       In any case, this would be a good fully automated technique to obtain large quantities of pure spam for use in a distributed spam filtering system.
Kennard, Dec 28 2004

       What you describe, Kennard, is precisely how Brightmail's product works to eliminate spam on the server side.
krelnik, Dec 28 2004

