I’m stoked that our clients are seeing more and more success. I think a lot can be said for our Marketing department, who have helped them get more visibility on the web. As a developer, however, a client’s success also brings a few interesting challenges, such as finding ways to squeeze more power from the same servers. Caching is something you can’t leave home without, and avoiding cache stampedes is just as important as the caching itself.
A cache stampede occurs when a cached item expires and multiple clients attempt to repopulate the cache at the same time. Take, for example, one of your page caches expiring just as a few thousand people request that page: every one of them tries to regenerate the HTML. That’s a lot of instant load hitting the database, re-saving the cache and so on. It’s a lot of extra processing that can bring a website to its knees.
One of the many ways to keep this from happening is to apply pseudo-locks to a cache (which happens to work great with Memcache). It’s essentially one routine calling dibs on re-populating the cache while everyone else waits for that routine to complete. As it turns out, the added overhead of checking for a cache lock before attempting to fetch the actual cache is very small, especially compared to allowing multiple clients to fight over a cache refresh. Here’s a little snippet of what I’m talking about:
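/** @var $lockCache Zend_Cache_Core */
/** @var $pageCache Zend_Cache_Frontend_Output */
$cacheId = 'Page_' . crc32($_SERVER['REQUEST_URI']);
$lockId = $cacheId . '_lock';

// If another request holds the lock, poll up to 3 times, half a second apart.
if ($lockCache->test($lockId)) {
    $i = 0;
    while ($i < 3) {
        // sleep() only takes whole seconds, so use usleep() for half a second.
        usleep(500000);
        if (!$lockCache->test($lockId)) {
            break;
        }
        $i++;
    }
}

// start() echoes the cached page and returns true on a hit; on a miss it
// starts output buffering and returns false.
if (!($pageCache->start($cacheId))) {
    // Call dibs on the refresh with a short-lived (10 second) lock.
    // Storing an array assumes automatic_serialization is enabled on $lockCache.
    $lockCache->save(array('locked' => true), $lockId, array(), 10);
    // Run your application here.
    $pageCache->end();
    $lockCache->remove($lockId);
}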
In the above code, we’re checking whether a lock exists for the $cacheId we’re accessing and, if it does, waiting for a limited number of short intervals until it’s no longer there. The wait time and interval are things you should tune to best fit your application. The routine that refreshes the cache creates the lock entry with a short (10 second) lifetime and removes it once the job is done. The short lifetime means a refresh that dies halfway through can’t hold the lock forever.
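To make the wait time and interval easier to tune, the polling loop could be pulled into a small helper. This is only a rough sketch: the function name waitForLock and its parameters are made up for illustration and are not part of Zend_Cache.

```php
<?php
/**
 * Poll until the lock clears or the attempts run out.
 * Hypothetical helper, not part of Zend_Cache.
 *
 * @param callable $isLocked     Returns true while the lock entry exists.
 * @param int      $maxAttempts  How many times to pause before giving up.
 * @param int      $intervalUsec Pause between checks, in microseconds.
 * @return bool True if the lock cleared, false if we gave up waiting.
 */
function waitForLock(callable $isLocked, int $maxAttempts = 3, int $intervalUsec = 500000): bool
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        if (!$isLocked()) {
            return true;
        }
        usleep($intervalUsec);
    }
    // One last look after the final pause.
    return !$isLocked();
}
```

With the objects from the snippet above, the callable would wrap the lock check, something like function () use ($lockCache, $lockId) { return $lockCache->test($lockId) !== false; }, and the two numbers become the knobs you tune per application.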
While the example code is for output caching, you could apply this to your data access objects or really anything that needs caching. It works well for time-invalidated caches, as opposed to caches invalidated by actions, like an author publishing a news article.
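As a rough idea of how the same pattern might look around a data lookup instead of page output, here is a minimal sketch. ArrayCache is a hypothetical in-memory stand-in so the example runs on its own, and fetchWithLock and its parameters are invented for illustration. With a real backend you would swap in your own cache object. The lock is taken with an add-style call that only succeeds when the key is absent, which is how Memcached's add behaves.

```php
<?php
/** Hypothetical in-memory stand-in for a real cache backend. */
final class ArrayCache
{
    private array $items = [];

    public function get(string $key)
    {
        if (!isset($this->items[$key])) {
            return null;
        }
        [$value, $expiresAt] = $this->items[$key];
        if ($expiresAt < microtime(true)) {
            unset($this->items[$key]); // entry has expired
            return null;
        }
        return $value;
    }

    public function set(string $key, $value, float $ttl): void
    {
        $this->items[$key] = [$value, microtime(true) + $ttl];
    }

    /** Stores only if the key is absent; returns false otherwise. */
    public function add(string $key, $value, float $ttl): bool
    {
        if ($this->get($key) !== null) {
            return false;
        }
        $this->set($key, $value, $ttl);
        return true;
    }

    public function delete(string $key): void
    {
        unset($this->items[$key]);
    }
}

/** Fetch $key from the cache, rebuilding it via $loader under a short-lived lock. */
function fetchWithLock(ArrayCache $cache, string $key, callable $loader, float $ttl = 300, float $lockTtl = 10)
{
    $lockKey = $key . '_lock';

    // Someone else is rebuilding: wait up to 3 x 0.5 seconds for them to finish.
    for ($i = 0; $i < 3 && $cache->get($lockKey) !== null; $i++) {
        usleep(500000);
    }

    $value = $cache->get($key);
    if ($value !== null) {
        return $value;
    }

    // Call dibs on the rebuild; if the wait timed out we rebuild anyway.
    $gotLock = $cache->add($lockKey, 1, $lockTtl);
    try {
        $value = $loader();
        $cache->set($key, $value, $ttl);
    } finally {
        if ($gotLock) {
            $cache->delete($lockKey);
        }
    }
    return $value;
}
```

Calling fetchWithLock($cache, 'article_7', $loader) twice in a row should only run $loader once. The second call is served from the cache, and the lock entry is gone by then.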