Static hostname hashing in Pound

1 Nov

WordPress.com just surpassed its 300th server today. How do we distribute requests to all those servers? We use Pound, of course. For those of you not familiar with Pound, it is an open source software load balancer that is easy to set up and maintain, flexible, and fast!

In general, we do not stick individual sessions to particular backend servers because WordPress uses HTTP cookies to keep track of users and is therefore not dependent on server sessions. Any web server can process any request at any given point in time and the correct data will be returned. This is important since we serve traffic in real time across three data centers.

There is one exception to this rule, however, and it has to do with the way we serve images. As Demitrious explained in his detailed post, when a request for an image is made, Pound sends the request to a cache server running Varnish. How does it decide which server to send the request to? Well, it looks at the hostname of the request, hashes it, and then assigns that to a particular cache server. By default Pound supports sessions based on any HTTP header, so we could easily use the hostname as the determining factor, but the mapping is not static. In other words, when we restart Pound, all the hostname assignments are reset and we effectively invalidate a large portion of our cache.

To circumvent this problem, please see the following patch. The patch statically hashes hostnames so that a given hostname is sent to the same server all the time, even across restarts. If the backend server happens to go down, the requests will be sent to another server in the pool until the server is back up, at which point the requests will be sent to the original server again. This allows us to restart Pound without invalidating our image cache. We have been using this in production for a couple of months now and everything is working great. The patch is written against Pound 2.3.2, and to use the static mapping you would add the following to the end of the Service directive in your Pound configuration file:

Session
Type hostname
End
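
To make the behavior concrete, here is a minimal Python sketch of the idea. It is not the patch itself: the hash function (MD5), the forward-probing failover order, and the names pick_backend and is_alive are all assumptions made for illustration.

```python
import hashlib

def stable_hash(hostname):
    # MD5 is used only as a stable, well-mixed hash (an assumption, not
    # necessarily what the patch uses). Unlike Python's built-in hash(),
    # it gives the same value in every process and across restarts.
    digest = hashlib.md5(hostname.lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def pick_backend(hostname, backends, is_alive):
    """Return the backend for a hostname, skipping backends that are down."""
    n = len(backends)
    home = stable_hash(hostname) % n
    # Start at the hostname's home slot and probe forward, so traffic
    # returns to the home backend as soon as it is alive again.
    for offset in range(n):
        candidate = backends[(home + offset) % n]
        if is_alive(candidate):
            return candidate
    return None  # every backend is down

backends = ["cache1", "cache2", "cache3", "cache4"]  # hypothetical names
print(pick_backend("barry.files.wordpress.com", backends, lambda b: True))
```

Because the mapping depends only on the hostname and the backend list, restarting the balancer does not change it.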

One thing to keep in mind is that if you add or remove servers from the Service definition, you will change the mapping, so I would recommend listing a few more BackEnd directives than you need right away to allow for future growth without complete cache invalidation. For example, we currently have 4 caching servers, but 16 BackEnds listed (4 instances of each server). This will allow us to add more cache servers and only invalidate a small portion of the cache each time.
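
The effect of over-provisioning BackEnd entries can be sketched in Python. This is an illustration under assumptions, not a description of Pound's internals: it assumes the slot is chosen as hash modulo the number of entries, that a new server is added by handing it some existing entries so the count stays at 16, and it uses made-up hostnames and server names.

```python
import hashlib

def slot_for(hostname, slot_count):
    # Stable hash modulo the number of listed backends (assumed scheme).
    digest = hashlib.md5(hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % slot_count

def moved_fraction(hostnames, before, after):
    """Fraction of hostnames whose backend differs between two backend lists."""
    moved = sum(
        1 for h in hostnames
        if before[slot_for(h, len(before))] != after[slot_for(h, len(after))]
    )
    return moved / len(hostnames)

# Hypothetical sample of image hostnames.
hostnames = [f"blog{i}.files.wordpress.com" for i in range(10000)]

four = ["cache1", "cache2", "cache3", "cache4"]

# Naive growth: list each server once and append the new one (4 -> 5 entries).
naive = moved_fraction(hostnames, four, four + ["cache5"])

# Over-provisioned: 16 entries (each server 4 times); the new server takes
# over one entry from each of three existing servers, count stays at 16.
sixteen = four * 4
grown = list(sixteen)
for i in (0, 1, 2):
    grown[i] = "cache5"
slotted = moved_fraction(hostnames, sixteen, grown)

print(f"naive: {naive:.0%} of hostnames moved, slotted: {slotted:.0%}")
```

With a uniform hash, changing the entry count from 4 to 5 should move roughly 80% of hostnames, while reassigning 3 of 16 entries should move roughly 3/16 of them.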

Of course this works for us because each blog has a unique hostname from which images are served (mine is barry.files.wordpress.com). If all of your traffic is served from a single domain name, this strategy won’t do you much good.

6 Responses to “Static hostname hashing in Pound”

  1. James Byers November 13, 2007 at 6:08 pm #

    Thanks for open-sourcing that patch, it’s a nice addition to pound.

    I wonder if you could comment sometime on how you handle routing traffic between datacenters and how that’s evolved over time? I know at one point WordPress.com was using something from netli for this, now it looks like Akamai. I’m curious if these products really get you timely failover and if they’re worth the cost.

  2. Barry November 16, 2007 at 5:22 am #

    James, we are in the process of changing how we handle datacenter failover for WordPress.com, so I will write a post about that soon.

  3. Steve December 1, 2010 at 7:14 pm #

    Pound ftw ;]

Trackbacks/Pingbacks

  1. Small world gets smaller « Andys Techie Blog - November 1, 2007

    [...] when someone from America who is involved with one of the biggest blogging platforms on the planet (just added their 300th server!) can be complimentary about my local [...]

  2. Donncha’s Friday Links at Holy Shmoly! - November 2, 2007

    [...] reveals all, about how WordPress.com serves files and pages that [...]

  3. Load Balancer Update « Barry on WordPress - April 29, 2008

    [...] server We are currently using Nginx 0.6.29 with the upstream hash module  which gives us the static hashing we need to proxy to varnish.  We are regularly serving about 8-9k requests/second  and about [...]
