

Useful bits of Cryptography – Hashes

More than just PGP

Cryptography is the science of securing communication from adversaries. In the email world its most obvious use is in tools like PGP or S/MIME that are used to encrypt a message so that it can only be read by the intended recipient, or to sign a message so that the recipient can be sure of who it came from. There are quite a few other aspects of sending email where a little cryptography is useful or essential, though – bounce management, suppression lists, unsubscription handling, DKIM and DMARC, amongst others.

One-way hashes

A hash function converts any text you give it into an opaque string of gobbledygook called a digest. As an example, one commonly used hash function is “md5” and the md5 hash of “Word to the Wise” gives “a1606a9079b1c15a521c6d04344dfb62” as a digest. Using the same hash function on the same string will always give the same digest, and different strings will always[1] give a different digest. They’re like a fingerprint. And you can’t “decode” a digest back into the original string[1].
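As a minimal sketch of those properties, here is how a digest might be computed with Python's standard hashlib module. The helper name md5_digest is just for illustration, and the UTF-8 encoding is an assumption (a hash function works on bytes, so both sides have to agree on how text is encoded).

```python
import hashlib


def md5_digest(text: str) -> str:
    """Return the hex MD5 digest of a string (encoded as UTF-8, an assumption)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


first = md5_digest("Word to the Wise")
second = md5_digest("Word to the Wise")
other = md5_digest("word to the wise")

print(first)            # a string of 32 hexadecimal characters
print(first == second)  # the same input always gives the same digest
print(first == other)   # changing even one character changes the digest
```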

Hash functions are used for all sorts of things, from checking when a file has changed to securely storing passwords. Because of that they’re easily available from pretty much all programming languages and from a windows or unix command line.

One thing hashes are useful for is sharing information when you don’t entirely trust the person you’re sharing it with. Say you have a suppression list of email addresses and you want someone else to remove them from their database, but you don’t want to actually share the list of email addresses with them. You can go through your suppression list and generate the digest for each email address, then send the other party the list of digests. They can go through their database and generate the digest for each of their email addresses. If that digest is in the list you sent them they know that email address needs to be suppressed. But they can’t find out the other email addresses on your suppression list. (The advanced version of this would have you share a salt with the other party.)

More generally, they allow two people to identify which email addresses they have in common, without revealing to the other the email addresses they don’t possess.
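The exchange might look something like the sketch below. It rests on a few assumptions that are not part of the description above: addresses are trimmed and lowercased before hashing (both parties have to normalize the same way or nothing will match), SHA-256 is used in place of md5, and the salt value and all function names are hypothetical.

```python
import hashlib


def address_digest(address: str, salt: str = "") -> str:
    """Digest of a normalized address, optionally prefixed with a shared salt."""
    normalized = address.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def digest_list(addresses, salt=""):
    """What the list owner sends: digests only, never the addresses."""
    return {address_digest(a, salt) for a in addresses}


def addresses_to_suppress(my_addresses, their_digests, salt=""):
    """What the other party does: hash their own addresses and look for matches."""
    return [a for a in my_addresses if address_digest(a, salt) in their_digests]


SALT = "hypothetical-shared-salt"

# The list owner's side
suppression_list = ["steve@example.com", "laura@example.com", "nobody@example.net"]
shared = digest_list(suppression_list, SALT)

# The other party's side: they learn only about addresses they already hold
database = ["Laura@Example.com", "someone.else@example.org"]
print(addresses_to_suppress(database, shared, SALT))
```

The other party ends up with the one address they have in common with the list, and learns nothing about the remaining entries beyond how many there are.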

Another thing they’re useful for is making something resistant to tampering. Say you’re using something like “bounce-547-steve=wordtothewise.com@bounce.example.com” as your email’s return path, as part of VERP-based bounce management, where the number represents the customer and the rest of it represents my email address. This makes handling bounces fairly simple, as when you receive mail to that address you just need to go into your database and note that my email address is bouncing mail for customer 547 and may need to be suppressed.

But what if I’m having a fight with someone else on that mailing list, and I send a fake bounce to “bounce-547-laura=wordtothewise.com@bounce.example.com”. As far as your bounce management automation is concerned, that means that mail to laura is bouncing and mail to her should be suppressed.

If you’d rather not be open to that sort of attack then you could use a hash function to cheaply “sign” the bounce address, so that it can’t be faked in that way. You choose a secret word, perhaps “12marmalade”. Each time you send an email you take the local part of your old-style return path and your secret word and mush them together to give a string like “bounce-547-steve=wordtothewise.com12marmalade” and then you use the md5 hash function to get the digest “a782eac364cc5a1dcbab2705495fe7a7”. You use the digest to create a new return path “bounce-547-steve=wordtothewise.com-a782eac364cc5a1dcbab2705495fe7a7@bounce.example.com”.

Now when you get a bounce message sent to that address you can take the customer ID and email address part of it, add your secret word and find the md5 digest of that string. If it matches the one in the address you know it was a valid bounce, and you should suppress mail to the address. If it doesn’t match, it’s a forgery and you shouldn’t. Now I can’t fake up bounces from Laura and get her bounced off of mailing lists. If the full 32 character digest seems a bit excessive you can just use a substring of it – even 8 characters is plenty to protect against attacks.
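Here is a rough sketch of that signing and checking, following the recipe described here: local part plus secret word, hashed with md5. The function names, the DIGEST_CHARS constant and the parsing of the address are illustrative assumptions, not a description of any particular mail system.

```python
import hashlib
import hmac

SECRET = "12marmalade"   # the example secret word from the text; pick your own
DIGEST_CHARS = 32        # full md5 digest; shorten (e.g. to 8) if you prefer


def _digest(payload: str) -> str:
    """md5 of the unsigned local part with the secret word appended."""
    return hashlib.md5((payload + SECRET).encode("utf-8")).hexdigest()[:DIGEST_CHARS]


def sign_return_path(customer_id, recipient, domain="bounce.example.com"):
    """Build a signed VERP return path for one customer and one recipient."""
    payload = f"bounce-{customer_id}-{recipient.replace('@', '=')}"
    return f"{payload}-{_digest(payload)}@{domain}"


def verify_return_path(address):
    """Return (customer_id, recipient) for a valid address, None for a forgery."""
    local, _, _domain = address.rpartition("@")
    payload, sep, digest = local.rpartition("-")
    if not sep or len(digest) != DIGEST_CHARS:
        return None
    # compare_digest avoids leaking how many characters matched
    if not hmac.compare_digest(digest.encode("utf-8"), _digest(payload).encode("utf-8")):
        return None
    parts = payload.split("-", 2)
    if len(parts) != 3 or parts[0] != "bounce":
        return None
    name, eq, dom = parts[2].rpartition("=")
    if not eq:
        return None
    return parts[1], f"{name}@{dom}"


signed = sign_return_path(547, "steve@wordtothewise.com")
forged = signed.replace("steve", "laura")
print(verify_return_path(signed))  # the customer ID and address
print(verify_return_path(forged))  # None: the digest no longer matches
```

The expected digest length is fixed in the code rather than taken from the incoming address, so a forger can't make the check easier by supplying a shorter digest.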

You can use the same sort of approach to add some basic security to things like unsubscription links, opt-in confirmation links and so on. Using a hash-based signature to protect those URLs not only allows you to be sure that the person clicking the confirmation link has the confirmation mail you sent out, it makes it easier to demonstrate if you later need evidence of that.
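For links, a sketch along the same lines might look like this. Note that it swaps in HMAC-SHA256 from Python's hmac module rather than the simple concatenate-and-md5 recipe above; HMAC is the standard construction for this kind of keyed signature. The URL, parameter names and key are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET_KEY = b"hypothetical-secret-key"            # assumption: kept server-side only
BASE_URL = "https://www.example.com/unsubscribe"   # hypothetical endpoint


def _signature(list_id: str, address: str) -> str:
    """HMAC-SHA256 over the values the link is allowed to act on."""
    message = f"{list_id}:{address}".encode("utf-8")
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_unsubscribe_url(list_id: str, address: str) -> str:
    """Build the link that goes into the outgoing mail."""
    query = urlencode(
        {"list": list_id, "email": address, "sig": _signature(list_id, address)}
    )
    return f"{BASE_URL}?{query}"


def check_unsubscribe_url(url: str):
    """Return (list_id, address) if the link is one we generated, else None."""
    params = parse_qs(urlsplit(url).query)
    try:
        list_id = params["list"][0]
        address = params["email"][0]
        sig = params["sig"][0]
    except KeyError:
        return None
    expected = _signature(list_id, address)
    if hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return list_id, address
    return None


link = make_unsubscribe_url("42", "steve@example.com")
tampered = link.replace("steve", "laura")
print(check_unsubscribe_url(link))      # the list ID and address
print(check_unsubscribe_url(tampered))  # None: signature doesn't cover laura
```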

[1] Yes, I’m simplifying slightly. Read Applied Cryptography if you want the longer version.



Confirming addresses in the wild

A lot of marketers tell me “no sender confirms addresses” or “confirming addresses is too hard for the average subscriber.” I find both these arguments difficult to accept. Just today I subscribed to a mailing list that had a confirmation step. The subscription form was pretty simple.

Note the option to receive emails optimized for mobile.

I entered my email address into a webform, hit submit and was taken to another page.

The cut off text says "you'll need to click on the link we sent you to complete the process."

Once I confirmed, I was taken to a thank you page and given the option to modify my mail preferences.

Of course, it’s possible this particular sender is more sophisticated than the average marketer. Take the link labeled “add to address book.” When I clicked this link it downloaded a .vcf card, opened up my address book and set me up to be able to trivially add their sending address to my address book.

Clearly “no one does it” is a poor argument. I don’t sign up for many lists at all. But if I can find examples of companies using confirmation, it can’t be that rare.



Things Spammers Do

Much like every other day, I got some spam today. Here’s a lightly edited copy of it.

Let’s go through it and see what they did that makes it clear that it’s spam, which companies helped them out, and what you should avoid doing to avoid looking like these spammers…

Received: from [213.144.59.132] (114.sub-75-210-142.myvzw.com [75.210.142.114]) by m.wordtothewise.com (Postfix) with SMTP id DEA552EAE2

This tells me it was sent from Verizon wireless network space – which means it’s almost certainly spam, as legitimate mail doesn’t come directly from cellphones or cellular access points, it comes from smarthosts. And it also tells me that the spammer is lying about who they are, claiming to be “[213.144.59.132]” when they’re really not.
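If you wanted to pull those pieces out of a header automatically, a sketch might look like this. The regular expression is an assumption that only fits Postfix-style Received lines like the one above; other mail servers format this header differently, so treat it as an illustration rather than a general parser.

```python
import re

# Postfix-style layout: from HELO (reverse-dns [ip]) by receiving-host ...
RECEIVED_RE = re.compile(r"from (\S+) \((\S+) \[([0-9.]+)\]\)? by (\S+)")


def parse_received(header: str):
    """Split a Received header into what the sender claimed and what we saw."""
    match = RECEIVED_RE.search(header)
    if match is None:
        return None
    helo, rdns, ip, receiver = match.groups()
    return {
        "claimed": helo,      # what the sending machine said it was
        "rdns": rdns,         # reverse DNS of the connecting IP address
        "ip": ip,             # the IP address the connection came from
        "receiver": receiver,
        # an IP literal in the HELO that isn't the connecting IP is a lie
        "claim_matches_ip": helo.strip("[]") == ip,
    }


header = (
    "Received: from [213.144.59.132] "
    "(114.sub-75-210-142.myvzw.com [75.210.142.114]) "
    "by m.wordtothewise.com (Postfix) with SMTP id DEA552EAE2"
)
info = parse_received(header)
print(info["claimed"], info["ip"], info["claim_matches_ip"])
```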

X-Spam-Status: No, score=1.7 required=7.0 tests=HTML_EXTRA_CLOSE,HTML_MESSAGE, RCVD_IN_PBL,RDNS_DYNAMIC autolearn=disabled version=3.2.5

This line was added by SpamAssassin running on my mailserver. HTML_MESSAGE isn’t very interesting – it just says there was some HTML in the mail – but the others are fairly strong signs that it’s spam. HTML_EXTRA_CLOSE is one of many SpamAssassin rules based on the HTML content of the message being malformed in some way, suggesting it was created by badly written software such as spamware.

RCVD_IN_PBL and RDNS_DYNAMIC are both really strong signs that no email from this Verizon IP address is legitimate, but in different ways. RDNS_DYNAMIC shows that Verizon hasn’t done anything special with the IP address to suggest it might be a legitimate server – it’s in the vast wasteland of consumer IP addresses that nobody really cares about, and not somewhere you should expect legitimate mail from. RCVD_IN_PBL is much more specific – it tells us that Verizon explicitly told Spamhaus that no email should ever be emitted from here (a provider that cared about spam might actually block traffic on port 25 from that sort of space, but we’ll take what we can get). If you ever see either of these on mail, it’s spam.
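To look for those rule names across a pile of mail, the X-Spam-Status header can be picked apart with a few lines of code. This is a sketch that assumes the header layout shown above (verdict, score, required, tests, then other key=value fields); the function name and the set of rules to flag are choices made for this example.

```python
import re

STRONG_SIGNS = {"RCVD_IN_PBL", "RDNS_DYNAMIC"}  # the two rules discussed above


def parse_spam_status(header: str):
    """Extract the verdict, scores and rule names from an X-Spam-Status header."""
    value = header.partition(":")[2]
    verdict = value.split(",", 1)[0].strip()
    score = float(re.search(r"score=(-?\d+(?:\.\d+)?)", value).group(1))
    required = float(re.search(r"required=(-?\d+(?:\.\d+)?)", value).group(1))
    # the tests list runs until the next key=value field (or the end of the header)
    tests_match = re.search(r"tests=(.*?)(?:\s+\w+=|$)", value)
    tests = [t.strip() for t in tests_match.group(1).split(",") if t.strip()]
    return verdict, score, required, tests


header = (
    "X-Spam-Status: No, score=1.7 required=7.0 "
    "tests=HTML_EXTRA_CLOSE,HTML_MESSAGE, RCVD_IN_PBL,RDNS_DYNAMIC "
    "autolearn=disabled version=3.2.5"
)
verdict, score, required, tests = parse_spam_status(header)
flagged = sorted(STRONG_SIGNS.intersection(tests))
print(verdict, score, required)
print(flagged)
```

Even though the overall score was below the threshold here, the individual rule names still tell you a lot about where the mail came from.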

From: “Tom Joelson” <Noreply234239-389512@qmail.com>

Legitimate mail would have a company name, or maybe a personal name I’d recognize in the “friendly from”. Strike one. Legitimate mail wouldn’t have the word “noreply” anywhere in it – telling your recipients you don’t want to hear from them is rather disrespectful. Strike two. Random numerics in the From field are really bad: as well as looking like you’re trying to pull a fast one they’d make it impossible for a recipient to whitelist your mail. That sort of thing is fine in the return path, as part of VERP encoding, but not in the From address that’s visible to the recipient. Strike three. Qmail.com is an Asian freemail provider – legitimate bulk mail never claims to be from someone it isn’t, and is never from a freemail provider. Strike four.

… check out the attached brochure for more information …

There’s very seldom a legitimate reason to have an attachment in bulk email, for several reasons. The email should stand on its own, giving the recipient the information they need in a form that’s immediately visible in their mail client. Links to your web page, sure, but the mail should make sense on its own, with the links part of a call to action. If you’re sending out mail to existing customers it might occasionally be useful to attach a PDF copy of a catalogue or somesuch, but the content of the email should still stand on its own (and given the security flaws in PDF that allow it to be used as a payload for viruses I’d be wary about doing even that).

Click This Link to Stop Future Messages =
<mailto:listservices@gmx.com?subject=3DUnsubscribe%3A%20myemail@mydomain>

Sure, you should have an unsubscription link in the messages you send. But it should be to an unsubscription page on your webserver, not a mailto link that sends mail anywhere, let alone to a dubious freemail provider (I’m prepared to believe gmx.com has legitimate users, but I’ve never seen it used anywhere other than in spam). And the clumsy phrasing looks like an attempt to avoid naive content filters.

All these things told me, and would have told a decent spam filter, that this wasn’t legitimate mail. Let’s dig down further and see how the spammer tried to avoid being identified.

The attachment is an HTML document, and it’s been base64 encoded. There’s never a good reason to use base64 encoding for English language attachments, unless you consider hiding the content of your email from naive spam filters a good reason. Less naive spam filters will decode the attachment and look inside it anyway. And they might consider the dishonest use of base64 encoding a bad enough sign in itself.

We can easily decode the base64 by hand, either by using a web based decoder or from a random unix-ish commandline by typing “openssl enc -d -base64”, hitting return, pasting in the encoded text and hitting ctrl-d.
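The same decoding can be done in a couple of lines of Python with the standard base64 module. In this sketch the HTML string is a made-up stand-in for an attachment, encoded first so that there is something to decode; with real mail you would paste in the encoded text from the message instead.

```python
import base64


def decode_attachment(encoded_text: str) -> str:
    """Decode base64 text as pasted from a message (line breaks are ignored)."""
    raw = base64.b64decode(encoded_text)
    return raw.decode("utf-8", errors="replace")


# Stand-in attachment: encode some HTML the way a mail client would, wrapped
# into short lines, so the example has input to work with.
original = "<html><head><title>brochure</title></head><body>hello</body></html>\n" * 3
encoded = base64.encodebytes(original.encode("utf-8")).decode("ascii")

print(encoded)                     # what a naive content filter sees
print(decode_attachment(encoded))  # what the recipient's browser sees
```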

That gives us this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head>
<META HTTP-EQUIV="Refresh" CONTENT="0; URL=http://cts.vresp.com/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416">
</head></body></html>

What that snippet of HTML will do when you open it is immediately redirect to the URL given in the middle. I recognize cts.vresp.com as VerticalResponse‘s clickthrough redirector, so it looks like a spammer created a test account at VerticalResponse in order to be able to abuse their redirector to hide the final destination. Naughty spammer.
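If you'd rather not read the HTML by eye, the redirect target can be pulled out with Python's standard html.parser module. This is only a sketch: the class name is made up, and it handles just the common "delay; URL=target" form of the Refresh content attribute.

```python
from html.parser import HTMLParser


class MetaRefreshFinder(HTMLParser):
    """Collect the target URLs of any <meta http-equiv="refresh"> tags."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "0; URL=http://..." (delay, then the target)
        _delay, _, rest = (attributes.get("content") or "").partition(";")
        rest = rest.strip()
        if rest.lower().startswith("url="):
            self.targets.append(rest[4:].strip())


attachment = """<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html><head>
<META HTTP-EQUIV="Refresh" CONTENT="0; URL=http://cts.vresp.com/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416">
</head></body></html>"""

finder = MetaRefreshFinder()
finder.feed(attachment)
print(finder.targets)
```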

If I didn’t recognize the URL as belonging to VerticalResponse, though, I’d visit the obvious webpages to see who it is. http://cts.vresp.com/ just tells me “Forbidden”. Bad.  http://vresp.com/ just says “hola”, which isn’t a good sign either. I’m not sure whether http://www.vresp.com/ is better or worse – it doesn’t mention the real company name and claims it’s “a domain that sends permission-based emails”. That’s really fishy, and looks just like many, many dedicated spammer domains. The lesson to learn is that if you use a domain in your email, then there should be a webserver at any of the related hostnames, it should tell anyone visiting it what the domain is used for, the real name of the company that’s operating it and provide a link to their corporate website.

Let’s see where the VerticalResponse redirector sends us to. This is pretty easy to do using telnet from a unix commandline or a windows command prompt. We’re looking at the URL http://cts.vresp.com/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416, which I’m going to split into the host “cts.vresp.com” and the path “/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416”. You just have to type the telnet command, the GET line and the Host: line, and remember to hit return twice after the Host: line.

steve@ubuntu:~$ telnet cts.vresp.com 80
Trying 74.116.90.234...
Connected to cts.vresp.com.
Escape character is '^]'.
GET /c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416 HTTP/1.1
Host: cts.vresp.com

HTTP/1.1 302 Found
Date: Mon, 14 May 2012 18:36:58 GMT
Server: Apache
Location: http://www.christiancafe.com/guests/join/indexc.jsp?id=11416
P3P: policyref="https://cts.vresp.com/w3c/p3p.xml", CP="CAO DSP COR IVAo IVDo OUR STP PUR COM NAV"
Cache-Control: max-age=0, no-store, no-cache, must-revalidate
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html

(If you try and do this yourself you’ll discover that VerticalResponse have already shut down this redirector in response to an abuse report. Thanks, VR.)
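The telnet session can also be scripted. The sketch below uses Python's standard http.client, which sends the request and hands back the response without following the redirect, so you get to see the Location header just as in the session above. The function names are made up for this example, and since the redirector has been shut down the request itself is left as a commented-out usage line.

```python
import http.client
from urllib.parse import urlsplit


def split_url(url: str):
    """Split a URL into the host and the path-plus-query to send in the GET line."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.netloc, path


def redirect_target(url: str, timeout: float = 10.0):
    """Fetch a plain http URL and return (status, Location header or None)."""
    host, path = split_url(url)
    connection = http.client.HTTPConnection(host, timeout=timeout)
    try:
        connection.request("GET", path)  # http.client adds the Host: header
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


url = "http://cts.vresp.com/c/?ChristianCafe/207bd98dc8/TEST/025270d267/id=11416"
host, path = split_url(url)
print(host)
print(path)

# Usage, when you actually want to make the request:
# status, location = redirect_target(url)
```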

And christiancafe.com is our spammer.

There’s more I could say about how they’re hosting their website (on Amazon EC2 Web Services, with suspiciously short DNS TTLs) but I think this is more than enough for one blog post.



Fickle recipients

One of the tenets of good delivery is know your recipients. Woot.com seems to know their recipients.

 

Happy Friday.



Email is different

OMI responded to my post about data cleansing yesterday. She asked an interesting question:

Why do so many in this industry feel that the email channel should be somehow held to a higher standard than other direct marketing channels?

There are a lot of reasons why the email channel is held to a higher standard. The big one is actually that the consumers have a big enough stick (in the form of ISPs and filters) to wield against senders that annoy them. This actually boils down to who owns the channel.

In many cases of advertising, marketers own the channel. Direct postal mail, banner ads, radio and TV ads: those channels were all developed for the use of marketers. Marketers can use the channel as long as they pay the owner: the TV station, the billboard company, the radio station, the website.

In all those marketing channels there is some monetary cost to increasing frequency and some non-marketer-controlled limit on how frequently you can touch the target. There are only so many minutes available for marketing in a TV or radio hour and they cost real dollars. There’s only so much page space available for press. Billboards cost real money and you can’t just put a billboard up anywhere.

But email is very different. First off, the channel wasn’t built with the idea that it would be funded by marketing. Secondly, the recipient (or their proxies in the form of the ISPs) owns the email channel. This changes not only the economics, but also the constraints.

Because it costs so little for marketers to send more mail, there are no real constraints on the amount they can send. On the recipient end, though, there are major constraints on the amount of attention they can give to mail. The more marketing mail they get from any source, the less ability they have to focus on any one offer.

Email is different because it is not solely a marketing channel.

Email is different because the recipient has more control.

Email is different because marketers don’t pay the full cost of transmission.

Email is different because recipients pay for part of the marketing.

Marketers are held to a higher standard because email marketing is subsidized by recipients and recipient ISPs.



Data Cleansing part 2

In an effort to get a blog post out yesterday before yet another doctor’s appointment I did not do nearly enough research on the company I mentioned selling list cleansing data. As Al correctly pointed out in the comments they are currently listed on the SBL. And when I actually did the research I should have done it was clear this company has a long term history of sending unsolicited email.

Poor research and a quickly written blog post led to me endorsing a company that I absolutely shouldn’t have. And I do apologize for that.

With all that being said, Justin had a great question in the comments of yesterday’s post about data cleansing.

Isn’t this contrary to the good habits we are always preaching? If we send *email people want* to an engaged, opted-in group of people who want our mail, why would there ever be a need to clean our lists?

Yes, a lot of list cleaning services are used to take non-permissioned lists and turn them into lists that don’t cause delivery problems.  But there are other reasons to clean lists and even clean permission lists.

I fully believe that mail should be sent to people who ask for the mail. I strongly believe the recipient should have some measure of control over what advertising and commercial email they receive. I also believe the recipient is the final arbiter of whether a mail is wanted or unwanted. I believe a legitimate sender must respect the recipient’s time and attention.

With those principles clearly stated, when might list cleaning be an appropriate process? List remediation is the big one.

We’re hitting the point where some email lists or customer databases with email addresses have been around for almost a decade. There’s a lot of cruft that can accumulate in a database in 10 years. There are going to be addresses with no audit trail. Even newer databases can have a lot of entries without full audit trails.

Some databases have addresses that aren’t mailed regularly. I’ve certainly had clients that would segment enough that some addresses wouldn’t be mailed more than once or twice a year. These types of databases aren’t always kept up as well as we might hope or like.

For these databases, a list cleaning process is good and even necessary. Bad addresses accumulate on lists. One of the things I do with clients is help them separate out good addresses from bad addresses. But each case is unique and requires individualized treatment. Sure, you can run a list against a database of 300 million addresses and remove some bad ones, the ones that might get you into delivery trouble. But not all bad marketing creates delivery problems. Sometimes bad marketing is just bad. Mail gets into the inbox, sure. The source or the content isn’t blocked. But I think marketers can do more than just get mail into the inbox.

Data cleansing is not just about removing spam traps and bouncing addresses. Data cleansing should be about identifying those people who are going to buy from you. And not everyone who was interested in your product a few years ago is going to be interested in your product now. People change, their wants and needs change. They are not static, but rather fluid. Just removing problem addresses isn’t going to find those customers as effectively as searching for the good addresses in your list.



Data Cleansing

According to Ken, Outward Media has productized a database of 300,000,000 email addresses that should never be mailed.

OMI’s Clean-Send Suppression Database can help to protect your email sender reputation and save you valuable marketing dollars.
In a nutshell, OMI Clean-Send is a database consisting of approximately 300 million negative email records (spam traps, foreign IP’s, hard bounces, and other negative email data). Due to the erosive nature of consumer email, we have joined with a consortium of email partners who share the OMI philosophy that data quality is far more important than data quantity.

It’s an interesting idea. I certainly have a lot of clients who have come to me looking for ways to clean old lists and data of unknown provenance. This might be worth looking at.



Why so many domains

There’s a company that advertises a lot on TV. The ads are well done, they tell a clear story in the 30 seconds. They feature a pretty and happy young woman dancing around. There is a great catchy tune. From all appearances it’s a successful ad campaign.

The point of the ad campaign is to drive traffic to a website where the domain owner can collect a lot of information and sell it on to advertisers. Every month or so, the landing URL changes. In watching this campaign over the last year or two, I’ve seen at least half a dozen different URLs used in the television ads. Now, it’s perfectly possible that this is part of an overall strategy, but I am not sure. The initial website is highlighted so clearly in the catchy tune, I can’t believe it is part of their marketing strategy.

Which leads me to wonder if there is a bigger problem with their advertising. Do they change domains so frequently because they’re seeing domain based blocking?



You opted in

One thing I get in some of the comments here and in some of the discussions I have with email senders is that no commercial emailer ever sends unsolicited email. That, clearly, at some point the recipient opted in to receive mail and if that person doesn’t want mail they shouldn’t ever give out their email address.

I have an old Yahoo address that’s used primarily as my Flickr account login. I don’t believe I’ve ever given out the address to anyone or opted in to anything. Anything’s possible, though: this address was created sometime in 2006 or 2007 and I may have tossed it into a form to test something. It’s certainly not an address I ever actually use.

Earlier this week I checked mail on the account. There were almost 700 messages in there. It was pretty amazing how much garbage this unused, unshared address collected. Notice the “clever” use of foreign alphabets and the number of legitimate companies who have acquired this address or hired people to mail me on their behalf. I’m sure some of it is phishing, too.

Inbox picture

And this is the view from behind some very aggressive filters.

All in all, though, this is a prime example of how many companies are not following best practices and are actively sending spam.



AOL improving

I’m hearing from lots of folks that they’re seeing some improvement in delivery to AOL accounts.

As everyone can imagine, the AOL situation has been a common thread of discussion on many delivery lists. One person even commented on how fragile the AOL mail server seems. My own thoughts are a little different. The AOL mail system is notoriously complex and integrated. Many of the folks who built it have been laid off or otherwise moved on to other companies. I know there are still smart, competent people riding herd on the AOL mail servers, but I expect they don’t have the resources to do the ongoing maintenance and the fire fighting and all the other tasks that a mailserver handling billions of emails needs.

What this means is that the AOL mail system has been suffering from bit rot for at least 2 years. It is to the original designers’ credit that it’s taken this long before there were major problems like we’ve seen over the last week.




