I’ve been thinking about this for a while and have finally decided to write about it. This is basically an open letter to the people who write email clients, spam filtering services, or spam filtering software.
Spam is an issue that affects everyone I know – even my mom sometimes calls me to ask what to do with emails when she doesn’t understand why she received them.
I’ve been looking for a good spam solution for the past couple of years, but to be honest, I haven’t really found anything that I’m 100% happy with.
I think a big part of the problem is simply the fact that all email clients that I’m aware of define spam as black or white. While it’s usually obvious to a human eye if an email is spam, to a computer program, it’ll always be shades of grey.
What usually happens when a computer tries to determine if an email is spam is that it goes through a number of “tests”. Think of it as a checklist that the computer is going through – like a 100-point inspection given to a used car (some questions are worth more than others). At the end of this process, the computer gives the email a “spam score”.
Let’s say a score of zero means that the email is for sure not spam and a score of 100 means the email is for sure spam. In reality, very few emails get a score of 100, because spammers are getting better at “passing the tests”. So the computer picks a threshold, let’s say 80, and any email that scores above it is flagged as spam and goes into your “spam box”.
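To make the checklist idea concrete, here is a minimal Python sketch. The tests, their weights, and the message fields (`subject`, `body`, `known_sender`) are all made up for illustration; a real filter would run far more tests and tune the weights carefully. Only the threshold of 80 comes from the example above.

```python
# Hypothetical checklist: (description, weight, test). The weights add up to 100.
TESTS = [
    ("mentions 'free money'", 40, lambda m: "free money" in m["body"].lower()),
    ("subject is all caps", 25, lambda m: m["subject"].isupper()),
    ("sender not in address book", 20, lambda m: not m["known_sender"]),
    ("contains three or more links", 15, lambda m: m["body"].count("http") >= 3),
]

SPAM_THRESHOLD = 80  # the example cutoff used in the text


def spam_score(message):
    """Add up the weights of every test the message trips (0 to 100)."""
    return sum(weight for _, weight, test in TESTS if test(message))


def is_spam(message, threshold=SPAM_THRESHOLD):
    """Flag the message only if its score is above the threshold."""
    return spam_score(message) > threshold


suspicious = {
    "subject": "YOU WON",
    "body": "Claim your free money now",
    "known_sender": False,
}
friendly = {
    "subject": "Lunch on Friday?",
    "body": "Are you free for lunch?",
    "known_sender": True,
}

print(spam_score(suspicious), is_spam(suspicious))
print(spam_score(friendly), is_spam(friendly))
```

The point is that the filter already has a number for every email; the yes/no answer is just that number compared against a cutoff.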
Some of the better spam services do indeed allow you to define how aggressive you want them to be – in other words, what number to use for the threshold that splits spam from non-spam.
The problem is that if you aren’t aggressive enough, too many spam emails don’t get tagged as spam (false negatives), and if you are too aggressive, you start tagging non-spam emails as spam (false positives).
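The trade-off is easy to see with a few numbers. The sketch below uses a handful of invented (score, actually spam) pairs, not real data, and counts both kinds of mistakes at three different thresholds.

```python
# Made-up (spam_score, actually_spam) pairs, purely for illustration.
LABELED = [
    (95, True), (88, True), (72, True), (65, True),
    (85, False), (60, False), (30, False), (5, False),
]


def count_errors(labeled, threshold):
    """Return (false_positives, false_negatives) for a given threshold."""
    # False positive: a real email that scored above the threshold.
    false_positives = sum(
        1 for score, spam in labeled if score > threshold and not spam
    )
    # False negative: a spam email that scored at or below the threshold.
    false_negatives = sum(
        1 for score, spam in labeled if score <= threshold and spam
    )
    return false_positives, false_negatives


for threshold in (50, 80, 90):
    fp, fn = count_errors(LABELED, threshold)
    print(f"threshold {threshold}: {fp} false positives, {fn} false negatives")
```

With this toy data, lowering the threshold to 50 catches every spam email but buries two real ones, while raising it to 90 buries nothing but lets three spam emails through. No single number gets both counts to zero.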
My proposal is to simply:
- Expose the internal spam score to the user.
- Be able to sort by the spam score column.
That way, if I receive 100 emails a day that the computer thinks are spam, I don’t really have to skim through all 100 to make sure I didn’t miss anything. I can just check the ones that are borderline and go ahead and delete the rest.
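As a rough sketch of what that could look like, assume the client stores the score on each email. The `Email` class, the `borderline` helper, and the margin of 10 points are all hypothetical names and values chosen for this example, not anything an existing client provides.

```python
from dataclasses import dataclass


@dataclass
class Email:
    subject: str
    spam_score: int  # 0 = surely not spam, 100 = surely spam


def borderline(emails, threshold=80, margin=10):
    """Flagged emails scoring within `margin` of the threshold, lowest score first."""
    flagged = [e for e in emails if e.spam_score > threshold]
    close_calls = [e for e in flagged if e.spam_score <= threshold + margin]
    return sorted(close_calls, key=lambda e: e.spam_score)


spam_box = [
    Email("CHEAP WATCHES", 99),
    Email("Invoice for your order", 82),
    Email("You have won!!!", 97),
    Email("Newsletter: March issue", 86),
]

# Only the close calls need a human look; the rest can be deleted unread.
for email in borderline(spam_box):
    print(email.spam_score, email.subject)
```

Here only the invoice and the newsletter come back for review, which are exactly the two I would not want to lose.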
Of course, this does give spammers a tool which makes it easier for them to check how “spammy” an email is, but they’d be able to check in any case.