4 March 2010

High performance Grails with memcached

This article is the second in my series about fast-loading web pages. The first article dealt with the Django framework, while this one is about Grails, which I have recently adopted as my preferred rapid web application framework. Ever since I switched from Django to Grails, one issue has bothered me: the loss of the ability to serve complete pre-rendered pages from a distributed cache to the end user while bypassing the application server.

The solution described in this article is in use on the production servers powering mmogle.com where it ensures page response times below 100ms for cached content.

Let me summarize the concept for readers not familiar with my first article: the goal is to greatly decrease the load on the application server (container) by storing the raw HTML output of rendered pages in a (distributed) cache, to be picked up by a frontend web server without bothering the application server behind it for the current request. Since we would need something like Edge Side Includes (ESI) to deal with personalized pages, we are going to limit ourselves to anonymous users.

The picture below shows the server setup:

[Figure: nginx-memcached server setup]

Nginx acts as the frontend web server, where all requests initially arrive. Requests for static resources are always handled by nginx. Dynamic requests are inspected, and once a request is determined to be cacheable, a cache key is computed from the request URI, cookies etc. Nginx then asks memcached whether one of the application servers has deposited content under that key. If so, the response from memcached is returned directly to the client and the application servers are bypassed. It is worth noting that the application server (Tomcat) is only responsible for storing the content, whereas nginx, acting as frontend server, is responsible for retrieving it. This decoupling is what sets this technique apart from the Grails Cache Filter Plugin, which both stores and retrieves cached content.
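To make the flow concrete, here is a minimal Java sketch of the lookup path described above. It is a model only, not nginx or memcached code: the class name `FrontendFlowSketch`, the in-memory map standing in for memcached, and the function standing in for the application server are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Illustrative model of the frontend lookup flow; not nginx or memcached code. */
public class FrontendFlowSketch {
    // Stand-in for memcached: cache key -> rendered HTML.
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the application server: request URI -> rendered HTML.
    private final Function<String, String> appServer;
    // Counts how often the application server had to do the work.
    private int appServerHits = 0;

    public FrontendFlowSketch(Function<String, String> appServer) {
        this.appServer = appServer;
    }

    /** Application-server side: deposit rendered content under a key. */
    public void store(String key, String html) {
        cache.put(key, html);
    }

    /** Frontend side: answer from the cache if possible, otherwise forward. */
    public String handle(String key, String requestUri) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;                    // hit: application server is bypassed
        }
        appServerHits++;                      // miss: forward to the application server
        return appServer.apply(requestUri);
    }

    public int getAppServerHits() {
        return appServerHits;
    }
}
```

Note that `handle` never writes to the cache. Only `store`, the application-server side, does, which mirrors the decoupling between storing and retrieving.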

Let’s continue with the actual implementation. When I decided to port the technique I described in the aforementioned article from Python/Django to Groovy/Grails, I thought it was going to be a very easy task. All that’s needed is to grab the rendered page contents within a Grails afterView Filter, put it into memcached and be done with it, right? Wrong! Imagine my surprise when I noticed the rendered output never had a layout applied. After lots of trying I gave up, realising that there is simply no way to get my hands on the complete HTML (emphasis on complete) for a rendered Grails page from within the application itself. The reason for this is Grails layouts are based on Sitemesh which is implemented as a ServletFilter. Grails applies the Layout to pages through its GrailsPageFilter ServletFilter which is executed further down in the pipeline after any Grails Filters.

That left me with only one option: a custom ServletFilter that executes before GrailsPageFilter. So I checked out the source of the Grails Cache Filter Plugin, stripped it of anything not necessary for our purpose, and MemcachedFilter was born. You can download the source here.

MemcachedFilter is written in Groovy and needs to be configured to execute before GrailsPageFilter. That means its filter-mapping element must appear before the filter mapping for GrailsPageFilter in web.xml. Here’s a sample web.xml snippet:

<filter>
  <filter-name>memcached</filter-name>
  <filter-class>com.banshee.servlet.MemcachedFilter</filter-class>
</filter>
 
<!-- memcached filter -->
<filter-mapping>
  <filter-name>memcached</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
 
 
<filter-mapping>
  <filter-name>charEncodingFilter</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>
 
<filter-mapping>
  <filter-name>sitemesh</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

MemcachedFilter has a dependency on a bean named “memcachedClient”. Please refer to this article for configuration instructions.

Now that the filter is configured and active we actually want to put it to use – which is quite easy. The filter scans all Responses it encounters for the presence of two special headers: “X-Memcached-Filter-Cache-Key” and “X-Memcached-Filter-Cache-Timeout” (the latter is optional). If the “X-Memcached-Filter-Cache-Key” is detected the filter will store the response content in memcached using the header value as key. The expiration timeout will be either a default value or the value of the “X-Memcached-Filter-Cache-Timeout” header. It’s that simple. Here’s a usage example:
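The header protocol can be modelled in a few lines of plain Java. This is a sketch under assumptions, not the actual MemcachedFilter source: the class `HeaderProtocolSketch`, the map standing in for the memcached client, and the default timeout of 60 seconds are hypothetical (the article does not state the real default). Only the two header names come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model of the header protocol; the real MemcachedFilter may differ. */
public class HeaderProtocolSketch {
    public static final String KEY_HEADER = "X-Memcached-Filter-Cache-Key";
    public static final String TIMEOUT_HEADER = "X-Memcached-Filter-Cache-Timeout";
    // Assumption: the real default timeout is not stated in the article.
    public static final int ASSUMED_DEFAULT_TIMEOUT_SECONDS = 60;

    /** A stored entry: the response body plus its expiry in seconds. */
    public static final class Entry {
        public final String body;
        public final int timeoutSeconds;

        Entry(String body, int timeoutSeconds) {
            this.body = body;
            this.timeoutSeconds = timeoutSeconds;
        }
    }

    // Stand-in for the memcached client.
    private final Map<String, Entry> cache = new HashMap<>();

    /** Stores the body if the key header is present; returns true when stored. */
    public boolean process(Map<String, String> responseHeaders, String body) {
        String key = responseHeaders.get(KEY_HEADER);
        if (key == null || key.isEmpty()) {
            return false;                     // no key header: response is not cached
        }
        int timeout = ASSUMED_DEFAULT_TIMEOUT_SECONDS;
        String raw = responseHeaders.get(TIMEOUT_HEADER);
        if (raw != null) {
            try {
                timeout = Integer.parseInt(raw.trim());
            } catch (NumberFormatException e) {
                // Malformed timeout header: keep the assumed default.
            }
        }
        cache.put(key, new Entry(body, timeout));
        return true;
    }

    public Entry lookup(String key) {
        return cache.get(key);
    }
}
```

In this model a response without the key header passes through untouched, so caching stays strictly opt-in per controller action.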

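import com.banshee.servlet.MemcachedFilter
import org.apache.shiro.SecurityUtils
import org.codehaus.groovy.grails.commons.ConfigurationHolder

class FooController
{
  def index =
  {
    if(!SecurityUtils.subject.principal)      // store response only for anonymous users (using Shiro plugin)
    {
      // cache key: app prefix + context-relative request URI
      response.setHeader(MemcachedFilter.MEMCACHED_FILTER_X_CACHE_KEY, ConfigurationHolder.config.app_prefix + (request.forwardURI - request.contextPath))
      // header values are strings, so the timeout is passed as '120' rather than as an integer
      response.setHeader(MemcachedFilter.MEMCACHED_FILTER_X_CACHE_TIMEOUT, '120')       // cache for two minutes
    }
  }
}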

This stores the response content for FooController’s ‘index’ action in memcached for two minutes, using the context-relative request URI as cache key, as long as the requesting user is not logged in. It is of paramount importance that you never do this for authenticated users, or bad things may happen. You don’t want John Doe to access information intended for the boss just because the boss happened to be the first user to access a certain page, do you? To make sure that our frontend server never tries to fetch content from the cache for authenticated users, it needs a way to tell whether a request comes from an anonymous or an authenticated user. To do this I wrote a simple Grails Filter that ensures that authenticated users are tagged with a cookie:

import javax.servlet.http.Cookie
import org.apache.shiro.SecurityUtils

class CacheFilters
{
  CacheService questionCacheService
 
  final static String MEMCACHED_USER_IS_AUTH_INDICATOR_COOKIE = 'user_id'
 
  def filters =
  {
    /** Sets or deletes a cookie that signals nginx that the current user is authenticated - not used for permission checks */
    setNginxUserIsAuthenticatedIndicator(controller: '*', action: '*')
    {
      afterView =
      {
        def cookie = request.cookies.find { it.name == MEMCACHED_USER_IS_AUTH_INDICATOR_COOKIE }
 
        // is the user authenticated (by login or cookie?)
        if(SecurityUtils.subject.principal)
        {
          // yes, set the cookie if needed
          if(!cookie)
          {
            cookie = new Cookie(MEMCACHED_USER_IS_AUTH_INDICATOR_COOKIE, User.get(SecurityUtils.subject.principal).id.toString())
            cookie.setPath(request.getContextPath())
            response.addCookie(cookie)
          }
        }
 
        else
        {
          // delete the cookie if necessary
          if(cookie)
          {
            cookie.setMaxAge(0)
            cookie.setPath(request.getContextPath())
            response.addCookie(cookie)
          }
        }
      }
    }
  }
}

Rest assured that the sole purpose of the cookie is to signal the frontend server that a user is authenticated in order to disable any cache retrieval attempts. No security checks are ever performed against the cookie.

Now that we’ve taken care of storing the content, someone has to retrieve it. As already mentioned, we are using Nginx as the frontend server. I’m a big fan of it because of its small footprint, great performance and integrated support for memcached. Below is an Nginx virtual host configuration that implements the remaining bits and pieces:

upstream tomcats
{
  server 127.0.0.1:8080 weight=1;
}
 
server
{
  listen 80;
  server_name example.com;
  access_log /var/log/nginx/example.log;
 
  include ua_ban_list.conf;
 
  proxy_set_header Host $http_host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
 
  default_type text/html;
  charset utf-8;
 
  proxy_redirect http://tomcats/ http://example.com/;
 
  location /
  {
    # only consider GET requests for caching
    if ($request_method != GET)
    {
      proxy_pass http://tomcats;
      break;
    }
 
    # never cache auth requests
    if ($request_uri ~* (^/auth/.*$))
    {
      proxy_pass http://tomcats;
      break;
    }
 
    # detect cookies that indicate an authenticated user
    if ($http_cookie ~* "(rememberMe|user_id)")
    {
      # don't try cache lookup for authenticated user - NEVER
      proxy_pass http://tomcats;
      break;
    }
 
    # compute cache key from app prefix + request_uri
    # if the computed key does not match the key computed by the application
    # server when storing content, we will get nothing but memcached misses
    # and NO speedup
    set $memcached_key example_$request_uri;
    memcached_pass memcached;
    error_page 404 = @cache_miss;
  }
 
  location @cache_miss
  {
    internal;
    proxy_pass http://tomcats;
  }
}

Please pay special attention to the line where the cache key is computed. A key prefix “example_” is used. To get the example to work you would have to add a config variable ‘app_prefix’ to Config.groovy which needs to have the value “example_”.
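Because a mismatch between the two keys fails silently, it can help to spell out both sides of the rule set in one place. The following Java sketch mirrors the nginx conditions and both key computations. The class `CacheRulesSketch` and its method names are hypothetical, and the code is an illustration of the rules rather than part of the setup.

```java
import java.util.regex.Pattern;

/** Illustrative Java mirror of the nginx rules above; not part of the original setup. */
public class CacheRulesSketch {
    // Same patterns as in the nginx config; "~*" means case-insensitive matching.
    private static final Pattern AUTH_URI =
            Pattern.compile("^/auth/.*$", Pattern.CASE_INSENSITIVE);
    private static final Pattern AUTH_COOKIE =
            Pattern.compile("(rememberMe|user_id)", Pattern.CASE_INSENSITIVE);

    /** True when the frontend would attempt a memcached lookup. */
    public static boolean lookupAllowed(String method, String requestUri, String cookieHeader) {
        if (!"GET".equals(method)) {
            return false;                     // only GET requests are considered
        }
        if (AUTH_URI.matcher(requestUri).find()) {
            return false;                     // never cache auth requests
        }
        if (cookieHeader != null && AUTH_COOKIE.matcher(cookieHeader).find()) {
            return false;                     // cookie signals an authenticated user
        }
        return true;
    }

    /** Key as nginx builds it: prefix + $request_uri (which includes any query string). */
    public static String frontendKey(String prefix, String requestUri) {
        return prefix + requestUri;
    }

    /** Key as the controller builds it: prefix + (forwardURI minus first contextPath match). */
    public static String applicationKey(String prefix, String forwardUri, String contextPath) {
        String relative = forwardUri;
        int idx = contextPath.isEmpty() ? -1 : forwardUri.indexOf(contextPath);
        if (idx >= 0) {
            // Groovy's String minus operator removes the first occurrence.
            relative = forwardUri.substring(0, idx)
                    + forwardUri.substring(idx + contextPath.length());
        }
        return prefix + relative;
    }
}
```

One thing worth checking in your own setup: nginx’s `$request_uri` contains the query string, whereas `request.forwardURI`, as far as I know, does not. If that holds, a request for `/foo?page=2` is looked up under `example_/foo?page=2` but stored under `example_/foo`, so pages whose content depends on query parameters need a key that includes them on both sides.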

Finally, I should point out that this approach isn’t tied to Grails applications. Basically any Java web application could use it if someone ported the MemcachedFilter from Groovy to Java, which should be a piece of cake considering that it was developed from Java sources in the first place.

That’s it for now. I hope you enjoyed the ride.
