November 2007
Monthly Archive
Spam, delivery, email and more
Posted by steve on 28 Nov 2007 | Tagged as: Blocking, Deliverability
… or Why do spam filters sometimes have some very strange ideas?
It’s been dogma for a long time that if you’re doing email marketing you should avoid using a .biz domain in your mails. Even if your main website was in .biz, you should use something different in your messages, perhaps a website you buy solely for use in email that redirects to your real .biz website. Last year I looked at why that was, and what could be done about it.
One main reason for avoiding it has been resolved (so if you’ve been avoiding .biz URLs in your mail, now might be a good time to re-test that decision). And enough time has gone by that I can share, without upsetting everyone, the ugly reasons why .biz was for so long considered a sure sign of spam without good reason.
The simple reason was SpamAssassin. SpamAssassin is very widely used to filter mail, both in its open source version and buried anonymously deep inside countless commercial spam filters and filtering appliances. Not only that, but SpamAssassin is readily available, so most people looking to do pre-mailing content checks or looking at why content-based filters are objecting to a particular email will use SpamAssassin as their model. It’s very widely deployed, and influential far beyond the size of its deployed base.
SpamAssassin is a score-based spam filter - it checks an email against hundreds of rules, adds up the scores of each rule that matches and, in typical setups, decides the mail is spam if the total score is five or more. Pretty reasonable, but here are a few of the rules and scores (from the 2006 version of SpamAssassin)
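As a rough illustration of the "add up the scores and compare to five" idea described above, here is a minimal Python sketch. The rule names, patterns and scores are all made up for the example; they are not SpamAssassin's real rules or scores, and real SpamAssassin rules are far more involved.

```python
import re
from typing import Callable, List, NamedTuple, Tuple

class Rule(NamedTuple):
    name: str                       # hypothetical rule name
    score: float                    # made-up score, for illustration only
    matches: Callable[[str], bool]  # returns True if the rule fires

SPAM_THRESHOLD = 5.0  # "five or more" in typical setups

# Hypothetical rules; none of these names or scores come from SpamAssassin.
EXAMPLE_RULES: List[Rule] = [
    Rule("EXAMPLE_BIZ_URL", 3.0,
         lambda m: re.search(r"https?://[^\s/]+\.biz\b", m) is not None),
    Rule("EXAMPLE_ALL_CAPS_SUBJECT", 1.5,
         lambda m: re.search(r"^Subject: [^a-z\n]+$", m, re.M) is not None),
    Rule("EXAMPLE_FREE_MONEY", 2.5,
         lambda m: "free money" in m.lower()),
]

def score_message(message: str,
                  rules: List[Rule] = EXAMPLE_RULES) -> Tuple[float, List[str]]:
    """Return the total score and the names of the rules that matched."""
    hits = [rule for rule in rules if rule.matches(message)]
    return sum(rule.score for rule in hits), [rule.name for rule in hits]

def is_spam(message: str, threshold: float = SPAM_THRESHOLD) -> bool:
    """A message is treated as spam once its total reaches the threshold."""
    total, _ = score_message(message)
    return total >= threshold
```

The point of the sketch is that no single rule decides the outcome, but one rule with a large score uses up most of the budget before anything else about the message is considered.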
You can’t quite treat the scores as SpamAssassin’s measure of the “spamminess” of a message (“a .biz URL is 23% spammier than hardcore porn” … “The URL microsoft.biz is about as spammy as From: Ignatious T. Aardvark <success@sdfghjkl.com>”) but it’s pretty clear that using a .biz domain in your mail had a huge effect on your SpamAssassin score, and was a bad risk to take if you could easily avoid it.
So, was .biz really that spam-ridden? I recall it being pretty bad when it first launched, so it’s reasonable that SpamAssassin had that rule, but was it still bad by 2006? Bad enough to merit a score quite that high? That’s hard to measure, but a reasonable metric is the percentage of domains in each top level domain (.com, .net, .biz etc) that had been flagged as a definite sign of spam by the folks at SURBL.

So .biz looks just fine - comparable with .com or .net, and certainly a lot better than .info. Why was SpamAssassin still treating it as so spammy?
SpamAssassin developers measure and develop their scores based on several corpuses of recently received email, hand categorised into spam mail and non-spam (“ham”) mail. Like many other spam filters, they stay fairly vague about where exactly these corpuses come from (to avoid people gaming the system) but they seem to be based mostly on the personal mailboxes of developers. Of the five corpuses SpamAssassin were using in 2006, four saw almost no .biz spam, but one saw quite a lot (graph of .biz URLs in spam). More importantly, though, none of them saw more than a tiny number of .biz URLs in non-spam (graph of .biz URLs in non-spam).
The algorithm that SpamAssassin uses to assign scores to the rules is complex, but loosely speaking if a rule helps to correctly classify one of the mails in the spam corpus as spam, then the score of that rule will tend to be increased, while if a rule helps to wrongly classify non-spam as spam then the score for that rule will tend to be decreased. In the test corpuses used, .biz URLs hardly ever appear in non-spam, so there’s no pressure to reduce the score assigned to that rule.
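To make that "no pressure to reduce the score" point concrete, here is a toy Python sketch. It is a deliberately simplified perceptron-style update, not SpamAssassin's actual score-assignment algorithm, and the rule names, step size and corpus are invented for the example.

```python
from typing import Dict, List, Set, Tuple

THRESHOLD = 5.0  # total score at which a message is called spam
STEP = 0.5       # arbitrary adjustment size, chosen for the example

# Each corpus entry: (set of rule names that fired, True if the mail is spam)
Corpus = List[Tuple[Set[str], bool]]

def train(corpus: Corpus, rule_names: List[str], epochs: int = 20) -> Dict[str, float]:
    """Nudge rule scores up on missed spam and down on misclassified ham."""
    scores = {name: 0.0 for name in rule_names}
    for _ in range(epochs):
        for hits, is_spam in corpus:
            total = sum(scores[name] for name in hits)
            if is_spam and total < THRESHOLD:
                # Spam slipped through: rules that fired get a higher score.
                for name in hits:
                    scores[name] += STEP
            elif not is_spam and total >= THRESHOLD:
                # Ham was caught: rules that fired get a lower score.
                for name in hits:
                    scores[name] -= STEP
    return scores

# Hypothetical corpus: the .biz rule fires on spam but never on ham,
# while the HTML rule fires on both.
toy_corpus: Corpus = [
    ({"EXAMPLE_BIZ_URL"}, True),
    ({"EXAMPLE_HTML_ONLY"}, True),
    ({"EXAMPLE_HTML_ONLY"}, False),
    (set(), False),
]

scores = train(toy_corpus, ["EXAMPLE_BIZ_URL", "EXAMPLE_HTML_ONLY"])
```

In this toy run the rule that also shows up in ham keeps getting pulled back below the threshold, while the .biz rule climbs until it can flag a message all by itself, because nothing in the ham corpus ever pushes back.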
So the final answer to the question in the title is:
This leads to a vicious circle where legitimate mailers don’t use .biz as SpamAssassin would punish them for doing so, and SpamAssassin continues to punish anyone using .biz URLs because they’re not used by legitimate mailers. SpamAssassin eventually broke this particular circle by removing the rule from their latest release, but not until it had had a major effect on use of .biz URLs that still persists.
The .biz issue has since been resolved, but there’s a broader deliverability conclusion to draw from this story. While on a branding and image level you want your messages to stand out from all your competitors’ messages, on a technical level you want your mails to be similar to those of other legitimate mailers. That way, if there’s an oddity in a content filter that makes it classify your mail as spam it’ll likely be classifying lots of other legitimate mail as spam too, and be fixed fairly quickly (probably before it’s deployed into production).
That includes things like the way you use HTML and MIME, the way you register the domain names you use and the way you use them as URLs in messages and a bunch of other things. Being aware of the sort of things that content-filters like SpamAssassin look at is a good place to start.
Posted by laura on 26 Nov 2007 | Tagged as: Deliverability, Legal
I’ve been working on a document discussing laws relevant to email delivery and have found some useful websites about laws in different countries.
US Laws from the FTC website.
European Union Laws from the European Law site.
Two documents on United Kingdom Law from the Information Commissioner’s Office and the Data Protection Laws
Canadian Laws from the Industry Canada website.
Australian Laws from the Australian Law website
Posted by laura on 19 Nov 2007 | Tagged as: News Articles
Things have been insanely busy the last few days so blogging has been light. I do have links to a few news articles though. ClickZ has a report on the benefits they saw when switching to a professional email service provider. ReturnPath talks about changes to the email landscape as we enter the holiday shopping season. Terry Zink talks about how he measures the effectiveness of filters. A commenter on this blog asked about how to improve delivery to AOL, and I should have an answer to that in a few days.
Posted by laura on 14 Nov 2007 | Tagged as: Deliverability, Relevancy
I had a call with a potential client recently asking me what was the best day to send mail. It’s a question that I did not have a good answer to. Email Insider does have an answer to that question: there is no one day to mail to get the best response.
Even if there were one universal best day to send email, it wouldn’t make sense to send your email the same day as everyone else. In the world of direct mail, a truism was that January was a bad month to mail, being just after the holidays, etc. We had great success with January mailings, precisely because of this thinking. We didn’t have much competition in our customers’ mailboxes.
What you’re trying to find is that magic moment when your customer is online with spare time and mental bandwidth. Think of your own experience. Can you tell me that there is a specific day or time of day when you predictably reach this marketer’s nirvana, week after week? Of course not. You may be bored on the weekend and do work-related browsing, or take a break at work for some well-deserved shopping.
This fits with the data I have seen from clients over the years. And, really, if everyone mailed at the exact same time on the same day, then recipients really would be overwhelmed and not answer any of the mail.
Posted by laura on 13 Nov 2007 | Tagged as: ISP, Industry, Meta
I added a few blogs to my blogroll today.
Terry Zink works at Microsoft handling spam blocking issues for one of their platforms. His posts offer insight into how recipient administrators view spam filtering. He has a long, information dense series of posts on email authentication.
E-mail, tech policy, and more is written by John Levine, a general expert on almost everything internet, especially spam and abuse issues. He posts somewhat irregularly about interesting things he sees and hears about spam, abuse, internet law and other things.
Justin Mason’s blog contains information from the primary SpamAssassin developer. Like Terry’s blog, it gives readers some insight into the thought process of people creating filters.
Al Iverson’s blogs have been on my blogroll for a while now. His DNSBL resource contains information about various DNSBL and how they work against a single, well defined mail stream. His spam resource blog provides information about delivery and email marketing from someone who has been in the industry as long as I have.
Email Karma is Matt Verhout’s blog and contains a lot of useful delivery information.
No man is an iland provides practical information on marketing by email. Some of the information is delivery related, a lot more of it is solid marketing information. Mark often points to useful studies and information posted around the net.
MonkeyBrains always has entertaining and informative articles about delivery, email marketing and practical ways to make your email marketing more effective.
Posted by laura on 12 Nov 2007 | Tagged as: Confirmed (double) opt-in, Permission
Ben over at MailChimp writes about spam filters that follow links in emails, resulting in people being unsubscribed from lists without their knowledge. I strongly suggest clients use a two-step unsubscribe system that does not require any passwords or information. The recipient clicks on a link in the email and then, once they get to the unsubscribe webpage, confirms that they do want to be unsubscribed.
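A minimal Python sketch of that two-step flow might look like the following. It assumes that a link-following filter only fetches the URL (a GET request) and does not submit the confirmation form; the function name, token scheme and response text are all hypothetical, and a real system would sit behind a web framework.

```python
from typing import Dict, Set, Tuple

def handle_unsubscribe(method: str, token: str,
                       tokens: Dict[str, str],
                       subscribed: Set[str]) -> Tuple[int, str]:
    """Return (HTTP status, page text) for a request to the unsubscribe URL.

    tokens maps the opaque token in the emailed link to an address, so the
    recipient never has to type a password or any other information.
    """
    address = tokens.get(token)
    if address is None:
        return 404, "Unknown unsubscribe link."
    if method == "GET":
        # Step 1: just show a confirmation page. Nothing changes here, so a
        # filter that merely follows the link cannot unsubscribe anyone.
        return 200, f"Unsubscribe {address}? Press the button to confirm."
    if method == "POST":
        # Step 2: the recipient pressed the confirm button on the page.
        subscribed.discard(address)
        return 200, f"{address} has been unsubscribed."
    return 405, "Method not allowed."
```

The design choice is simply that following the link never changes anything; only the explicit confirmation does.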
Even more concerning for me is the idea that people could be subscribed to emails without their knowledge. For some subset of lists, using confirmed (double) opt-in is the best way to make sure that the sender really has permission from the recipient. Now we have a spam filter that is rendering “click here to opt-in” completely useless. I am sure there are ways to compensate for the stupidity of filters. As usual, though, the spammers are doing things which push more work off onto the end user and the legitimate mailers.
Posted by laura on 08 Nov 2007 | Tagged as: Blocking, ISP, Industry, Yahoo
Over the last couple days multiple people have asserted to me that Yahoo is greylisting mail. The fact that Yahoo itself asserts it is not using greylisting as a technique to control mail seems to have no effect on the number of people who believe that Yahoo is greylisting.
Deeply held beliefs by many senders aside, Yahoo is not greylisting. Yahoo is using temporary failures (4xx) as a way to defer and control mail coming into their servers and their users.
I think much of the problem is that the definition of greylisting is not well understood by the people using the term. Greylisting generally refers to a process of refusing email with a 4xx response the first time delivery is attempted and accepting the email at the second delivery attempt. There are a number of ways to greylist: per message, per IP or per From address. The defining feature of greylisting is that the receiving MTA keeps track of the messages (or IPs or addresses) that it has rejected and allows the mail through the second time the mail is sent.
This technique for handling email is a direct response to some spamming software, particularly software that uses infected Windows machines to send email. The spam software will drop any email in response to a 4xx or 5xx response. Well-designed software will retry any email receiving a 4xx response. By rejecting anything on the first attempt with a 4xx, the receiving ISPs can trivially block mail from spambots.
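Here is a minimal Python sketch of greylisting as defined above: the receiver remembers what it has already turned away and accepts the retry. The class and function names are hypothetical, the key is whatever the receiver chooses to track (message, IP or From address), and real implementations usually add minimum delays and expiry times that are left out here.

```python
from typing import Hashable, Set

TEMPFAIL = 451  # a 4xx temporary failure
ACCEPT = 250    # a 2xx success

class Greylist:
    """Toy greylist: defer the first attempt for a key, accept later ones."""

    def __init__(self) -> None:
        self._seen: Set[Hashable] = set()

    def check(self, key: Hashable) -> int:
        if key in self._seen:
            return ACCEPT      # second or later attempt: let it through
        self._seen.add(key)    # remember what we turned away
        return TEMPFAIL        # first attempt: temporary rejection

def deliver(greylist: Greylist, key: Hashable, max_attempts: int) -> bool:
    """Simulate a sender that retries after a 4xx, up to max_attempts tries."""
    for _ in range(max_attempts):
        code = greylist.check(key)
        if 200 <= code < 300:
            return True
    return False

greylist = Greylist()
# A well-designed sender retries after the temporary failure and gets through.
legit_ok = deliver(greylist, ("192.0.2.1", "news@example.com"), max_attempts=3)
# A spambot that drops mail on any 4xx never makes the second attempt.
bot_ok = deliver(greylist, ("198.51.100.7", "bot@example.net"), max_attempts=1)
```

The memory in `_seen` is what makes this greylisting. A receiver that hands out 4xx responses without tracking what it rejected, and without reliably accepting the retry, is deferring mail but not greylisting.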
Where does this fit in with what Yahoo is doing? Yahoo is not keeping track of the mail it rejects and is not reliably allowing email through on the second attempt. There are a couple reasons why Yahoo is deferring mail.
In the first case, the shedding of load means nothing more than that Yahoo is shedding load. There is not really anything the sender can do to compensate for this, nor is there anything the sender is doing (except possibly sending mail to Yahoo at the same time as the rest of the world) to precipitate the blocking.
In the second case, these are more specific refusals and there are things senders can do to minimize the deferrals.
Even the best mailers sometimes see deferrals at Yahoo. However, because Yahoo is using a temporary rejection, unless there are significant problems with your mailings, the mail will get through.
Posted by laura on 07 Nov 2007 | Tagged as: Humor, Industry
Ken Magill has a post up mentioning the top 40 companies in email marketing. Some highlights:
- Goodmail: This firm has been under fire and in the news so often that it has helped me make more deadlines than any other company on this list.
- SubscriberMail: They’re in Chicago.
- ExactTarget: They’re in Indianapolis.
Editor’s note: If your company did not appear on The Magilla Marketing List of Top 40 Fastest-Growing (and some not-so-fast-growing) E-mail Marketing Related Companies that Came to Mind Randomly while Swilling Vodka Martinis list and you think it should have, e-mail us. We’ll do another list. Heck, we’ll do ‘em all year! But you have to put out a release. Hmmm … I smell a business model.
Posted by laura on 05 Nov 2007 | Tagged as: AOL, ISP, Industry, Juno/Netzero/UOL, MSN/Hotmail, RoadRunner, Yahoo
A number of ISPs have email information and postmaster sites available. I found myself compiling a list of them for a client today and thought that I would put up a list here.
Posted by laura on 02 Nov 2007 | Tagged as: Industry, News Articles
DMNews interviewed Charles before he left AOL about the state of spam and the challenges for ISPs and how that affects senders. The article was published this week. In it he talks about
All in all a good article and worth a read for someone interested in what goes on behind the scenes at AOL.