We've got to maintain a certain level of 'street-cred'.

Memcached and Session Performance

Our Revenue Forecasting project hit some performance problems in the last release. The problems were not exactly new; they had been building as new features were implemented. In the spirit of not optimizing too early, we let them go until it became obvious that something had to be done.

As we added new features, more data had to be retrieved for many of our pages. Not only were we retrieving new kinds of data, but the existing data continued to grow with use. This placed more and more load on the database and on the code that processes the data before returning it to users. The project has been built on CakePHP since early 2007, and in the past we used Cake's built-in cache functions to help where needed.

The first thing you should do is cache ALL bottleneck points. Cache at different layers as well. We have many controller functions and views that rely on some of the same data, so we cached the view results, cached the controller's processed data where it could be reused, and cached model results where other reuse was feasible. This way a view result can be returned immediately if it is already cached. If it is not, rendering can hopefully use already cached preprocessed data or, at the least, cached model data. Layering caches can seem wasteful in some scenarios, but intuition can be deceptive here, so test it before ruling it out. Implementing a cache at any of these layers should be trivial if your code is properly factored.
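
The layering described above can be sketched in a few lines. This is a language-neutral illustration written in Python, not CakePHP code: the `Cache` class, the key names, and the three functions are hypothetical stand-ins for a cache backend, a model query, a controller calculation, and two views. The database call is simulated with a hard-coded list.

```python
from typing import Any, Callable, Dict, List


class Cache:
    """Tiny in-memory stand-in for a cache backend (hypothetical, not Cake's Cache class)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def remember(self, key: str, compute: Callable[[], Any]) -> Any:
        # Return the cached value; on a miss, compute it, store it, and return it.
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]


cache = Cache()
# Counts how often each layer actually does work instead of hitting the cache.
calls = {"query": 0, "process": 0, "render": 0}


def model_rows(account_id: int) -> List[int]:
    # Layer 1: raw model results (the database query is simulated).
    def query() -> List[int]:
        calls["query"] += 1
        return [100, 250, 400]

    return cache.remember(f"model:{account_id}", query)


def forecast_total(account_id: int) -> int:
    # Layer 2: controller-level processed data, reusable by several views.
    def process() -> int:
        calls["process"] += 1
        return sum(model_rows(account_id))

    return cache.remember(f"controller:{account_id}", process)


def render_summary(account_id: int) -> str:
    # Layer 3: a rendered view fragment.
    def render() -> str:
        calls["render"] += 1
        return f"<p>Forecast: {forecast_total(account_id)}</p>"

    return cache.remember(f"view:summary:{account_id}", render)


def render_chart_label(account_id: int) -> str:
    # A second view that misses its own cache but reuses the layer 2 result.
    def render() -> str:
        calls["render"] += 1
        return f"Total {forecast_total(account_id)}"

    return cache.remember(f"view:chart:{account_id}", render)
```

Calling `render_summary(1)` twice and then `render_chart_label(1)` runs the simulated query and the processing step only once each. The second view misses at the view layer but is served from the cached controller result, which is the case where layering pays off.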

The next step towards wringing more performance from your architecture is to implement a memory-based cache system. We chose memcached for its stability and speed. The built-in file-based caching in CakePHP just does not perform as well as most other solutions. It is great for simple caching and for getting you going, but eventually you will need to move to something faster as load increases. You might do well with file caches on a RAM-backed filesystem, but memcached was extremely easy to implement, and CakePHP has supported it since version 1.2.

Our system makes a lot of AJAX calls on many of our pages, along with requests for custom chart and graph images. Sometimes this adds up to eight to ten calls on page load, not counting static resources. We noticed that these pages were not getting the performance we expected, especially when a cache miss occurred. The cause was our use of PHP's file-based session handling. When a request opens the session file, the file is locked so that concurrent requests cannot change the session in different ways with only one version persisting to disk. As a result, requests that share a session are processed one at a time. We do very few session manipulations after the user signs on to the application, so we moved session handling to memcached as well. We understand that keeping sessions in memcached could lead to random evictions, but in our case sessions use a separate memcached server instance, and it has served us well. You get the speed of memory access and the concurrency as well. You just have to pay particular attention to session concurrency management.
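
To make the cost of the session lock concrete, here is a small simulation in Python. It is not PHP and does not touch real sessions: a `threading.Lock` stands in for the lock on the session file, `nullcontext()` stands in for a session store that does not lock, and the 0.05 second request time and the count of eight requests are assumed values chosen only for illustration.

```python
import threading
import time
from contextlib import nullcontext

REQUEST_SECONDS = 0.05  # simulated work per AJAX request (assumed value)


def handle_request(session_lock) -> None:
    # With file-based sessions the lock is held for the whole request,
    # so the simulated work happens while the lock is held.
    with session_lock:
        time.sleep(REQUEST_SECONDS)


def page_load(request_count: int, session_lock) -> float:
    # Fire all requests for one page at once and time until the last one finishes.
    threads = [
        threading.Thread(target=handle_request, args=(session_lock,))
        for _ in range(request_count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    locked = page_load(8, threading.Lock())  # requests queue up behind one lock
    unlocked = page_load(8, nullcontext())   # requests overlap freely
    print(f"with session lock:    {locked:.2f}s")
    print(f"without session lock: {unlocked:.2f}s")
```

With the lock, the eight requests run back to back, so the page takes roughly eight times the single-request time. Without it they overlap and the page takes about as long as one request. The trade-off is the one noted above: once nothing serializes the requests, concurrent writes to the same session have to be managed by the application.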

So what is the actual impact of these changes? Different users see different data, so it varies, but the worst pages were loading in about 48 seconds and now load in 3 to 15 seconds, depending on which caches are warmed up. Typical load times for heavily dynamic pages were around 8 seconds and are now 2 to 5 seconds. Even more importantly, the web server and the database server spend much less time working on the current load and can therefore scale to many more users.