Gmail deploys image proxy servers

This afternoon Justin Foster of LiveClicker posted to the OnlyInfluencers list asking about Gmail rewriting links.

Sometime very recently (last 24-48 hours), we are seeing that Google made a change to Gmail such that all image URLs in the email content are replaced by a call to Google’s content caching service googleusercontent.com.
For example, an image with the src: “http://mysite.com/i.jpg” will be replaced by Gmail with a URL something like this: https://ci3.googleusercontent.com/proxy/…#http://mysite.com/i.jpg
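If the rewritten URLs really do carry the original address after a "#", as in the example above, the original can be recovered by reading the URL fragment. This is a minimal sketch under that assumption only; the function name original_image_url is made up for illustration, the "…" is left elided exactly as in the example, and the format Google actually uses may differ or change.

```python
from urllib.parse import urlsplit


def original_image_url(rewritten_url):
    """Return the part of the URL after '#', or None if there is none.

    Assumes the original image URL is carried in the fragment, as in the
    example quoted above. That is an observation, not a documented format.
    """
    fragment = urlsplit(rewritten_url).fragment
    return fragment or None


# The "…" stands for the part of the proxy path elided in the example.
example = "https://ci3.googleusercontent.com/proxy/…#http://mysite.com/i.jpg"
print(original_image_url(example))
```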

After some investigation, testing and talking with people at various ESPs, I can confirm that Google is rewriting image links. This rewriting appears to be happening during the delivery process. Older messages that are currently in mailboxes aren’t showing this rewriting.
Many marketers are concerned about this. The first concern is always about open tracking and how this will affect engagement metrics.
Normal open tracking happens when a user opens an email and loads images into their mail client. Each email address is given a unique image name so that the sender knows who loaded the image. Every time a user opens the email, the image is reloaded from the image server.
In the new Google setup, the first time an image is opened, Google downloads the image from the image server and caches it on a Google managed proxy. This means that the first image load can be tracked by the sender, but any subsequent image loads will not be tracked.
For senders, this means that only the first open of any individual image will be recorded. When someone opens a mail, Google checks whether each image is in its cache. If it isn’t, Google follows the link, loads the image and puts it in the cache. Any later request for that same image, whether from the same or a different recipient, is served from the cache.
For images shared across all recipients, this means the images are pulled from the server only once, when the first user opens the mail. In the case of tracking images, every image file name is unique, so every new recipient’s open will cause Google to grab that uniquely named image. The result is that senders can track the first open, but no subsequent opens.
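The effect on tracking can be illustrated with a toy model. This is a sketch of a generic URL-keyed cache, not Google's implementation: the class names, the example URLs and the assumption that cached entries never expire are all made up for illustration.

```python
class OriginServer:
    """Stands in for the sender's image server; records every request."""

    def __init__(self):
        self.hits = []

    def fetch(self, url):
        self.hits.append(url)
        return b"image-bytes"


class CachingProxy:
    """Toy cache keyed by URL. Assumes entries never expire."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, url):
        # Only a cache miss reaches the sender's server.
        if url not in self.cache:
            self.cache[url] = self.origin.fetch(url)
        return self.cache[url]


origin = OriginServer()
proxy = CachingProxy(origin)

# A shared image: two recipients open the mail, one request reaches the server.
proxy.get("http://mysite.com/logo.png")
proxy.get("http://mysite.com/logo.png")

# Uniquely named tracking images: one per recipient.
proxy.get("http://mysite.com/open/a.gif")  # recipient A, first open
proxy.get("http://mysite.com/open/a.gif")  # recipient A again, served from cache
proxy.get("http://mysite.com/open/b.gif")  # recipient B, first open

print(origin.hits)
# ['http://mysite.com/logo.png', 'http://mysite.com/open/a.gif', 'http://mysite.com/open/b.gif']
```

In this model the sender's server sees each recipient's first open and nothing after it. A real cache that expires entries would let some repeat opens through.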
We identified the following log entry from an open at Google:

66.249.84.36 - - [05/Dec/2013:13:00:55 -0800] "GET /zimbabwe.png
HTTP/1.1" 200 2867 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1;
de; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (via ggpht.com)"

which is a Google proxy string.
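A sender who wants to separate these requests from direct image loads could look for that marker in their web server logs. The sketch below assumes the common "combined" access log layout shown above and assumes that the "via ggpht.com" text stays in the User-Agent; both are based only on the one entry we saw, and the function names are made up for illustration.

```python
import re

# Combined log format: ip, identity, user, [time], "request", status, size,
# "referer", "user-agent".
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Marker seen in the entry above; an assumption that it remains stable.
PROXY_MARKER = "via ggpht.com"


def parse_line(line):
    """Return the log fields as a dict, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def is_google_proxy(entry):
    """True if the User-Agent carries the proxy marker."""
    return PROXY_MARKER in entry["agent"]


line = (
    '66.249.84.36 - - [05/Dec/2013:13:00:55 -0800] "GET /zimbabwe.png '
    'HTTP/1.1" 200 2867 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; '
    'de; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (via ggpht.com)"'
)

entry = parse_line(line)
print(entry["path"], is_google_proxy(entry))
```

Because the request comes from Google and not from the recipient, the IP address and User-Agent in such an entry say nothing about the recipient's location or mail client.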
Images aren’t just used for open tracking, however. There are a number of services which provide geo-specific images depending on where the images are opened from. This new proxy is going to break that, because the image request now comes from Google’s servers rather than from the recipient. I’m also hearing of at least one email service provider that is seeing no opens from Google today, possibly because of how their images are interacting with the proxy server.
In any case, this is an issue we’ll be keeping a close eye on.

23 comments

  • I didn’t test whether there was any per-user aspect to the caching – it’s possible that if userA@gmail loads an image then they won’t see the identical image cached by userB@gmail, but I doubt it.
    Both uniquely named images, and images using unique CGI-style parameters (e.g. img.gif?user=42) are being passed through to the original webserver on first load, when the email is opened and the image viewed.

  • And we’ve now tested it and can definitively say that if a second user opens the same image link, there is no reload from the image.

  • Research so far shows that this has been happening since December 3rd. You can see that calls to your servers will spike from Mountain View, CA.
    That’s when they started pulling images. So far we have not seen any issues with open tracking; however, geolocation data is a mess, as everyone appears to be coming in from CA.
    But we are recording multiple opens for the same image and same user, so tracking for us at Campaign Monitor at least is completely unaffected. The geolocation is a concern, however…

  • … and there is nothing distinctive to the recipient in the request from the Google proxy to the original webserver other than the URL.
    The total traffic (from a wireshark trace) looks like this:
    GET /path/to/image.png?user=42 HTTP/1.1
    Host: host.name
    Connection: Keep-alive
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; de; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7 (via ggpht.com)
    Accept-Encoding: gzip,deflate
    That User-Agent string isn’t the recipient’s User-Agent – it’s a (presumably fake) string being sent by Google.

  • Interesting, Andrew. Steve sent me an email with the same image he’d already opened and there was no hit on our web server. But you are seeing multiple opens for some images?

  • Hi Laura
    We are seeing multiple opens for some images… I have been looking specifically at our tracking pixels.
    Not sure how long the TTL is on the caching of their images, or whether they call tracking images every time an email is opened. Behaviour for 1×1 pixel images may well be different from the rest of the content... so many variables! :/

  • It’s not going to cache an image forever. If it’s cached for ten minutes and someone opens the mail twice, fifteen minutes apart, you’ll see two loads.
    It’s just a cache, so it’s allowed to expire cached files using whatever heuristic it likes without changing the behaviour seen by the end user. So if you see a single image retrieved ten times in a day then it was viewed at least ten times. Could have been ten. Could have been fifty.

  • Hi Steve
    Agree completely on the first part of your comments about the caching element. As I said, I have no idea what their TTL is on images, or whether they treat tiny images (1×1 pixel) differently from others in an email.
    Where I do not see the same as you: in my tests, open tracking images are recording multiple opens within a minute. On that basis, if I record 10 opens I believe it to be 10 opens, with slim to no chance of the real figure being 50 opens.

  • Have been doing some testing with competing EDM platforms, and it seems not all of them are able to record multiple opens in a small time frame.
    One notable example shows only one open recorded over a 5 minute window, despite multiple opens. However, changing device results in a new open being recorded.
    I think the differences in how EDM platforms are adding tracking images etc. are likely to provide others with clues as to how to get round the Gmail cache issue.

  • @Steve. Don’t bet against Google caching images forever, whatever they are doing so far. As with file lockers, if the same image (or two images with identical content) is loaded twice then only one copy of the data needs to be stored, so the data requirements are modest. If the intention is to speed up Gmail, then caching forever (or at least for a very long time) would achieve this best.

  • Hi Guys,
    We’re over in the UK and seeing the same thing on our side: a big swing in geolocation information over to the US, and our device analytics is showing the Google proxy as a big email client.

  • With this change, every time I embed an image within my email body, there is no image. It is just a cross sign, telling me the image is not there.
    Please suggest how to fix this.

  • Been looking for articles about Google’s image proxy and just had to give you a thumbs up for zimbabwe.png … just because I’m from Zimbabwe, a tiny country which probably only gets mentioned because it’s the last name in the list of countries.

By laura
