September 22, 2010

Introduction to Drupal Caching and Performance Optimization

Because Drupal is a content management system, most of its data lives in a database rather than in static files on disk, as it would on a site that is not built on a CMS.  Under heavy load this kind of system can become slow, because querying a database and building each page on request is slower than serving a ready-made file.

This is where caching comes in.  From Wikipedia, cache “is a component that improves performance by transparently storing data such that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If requested data is contained in the cache (cache hit), this request can be served by simply reading the cache, which is comparably faster. Otherwise (cache miss), the data has to be recomputed or fetched from its original storage location, which is comparably slower. Hence, the more requests can be served from the cache the better the overall system performance is.”

In a non-cached environment the workflow goes something like this:

  1. You save a piece of content that gets stored in the database
  2. A visitor hits a page on your website, and Drupal queries the database to return menus, blocks and content
  3. The web server serves up the content and the visitor is presented with the page
  4. Each time a visitor hits the page, steps 2 and 3 are repeated

In a cached environment, the workflow changes so that:

  1. You save a piece of content that gets stored in the database
  2. A visitor hits a page on your website, and Drupal queries the database to return menus, blocks and content
  3. The first time this page is hit, the caching system stores a rendered copy of the page built from what the database returned (the cached version of the page).  Drupal core keeps this copy in a cache table in the database by default; other caching backends can keep it in memory or on disk instead
  4. The web server serves up the content and the visitor is presented with the page
  5. The second and subsequent times a visitor hits the same page, the cached version is served directly, without rebuilding the page from the database
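
The cached workflow above can be sketched in a few lines.  This is a language-neutral illustration written in Python, not Drupal's actual implementation: the PageCache class, the render_from_database function and the "/about" page are all made up for the example.  It also shows one step the list leaves implicit, which is that the cached copy has to be discarded when the content is saved again.

```python
class PageCache:
    """Toy page cache: maps a path to its rendered HTML (hypothetical, not Drupal code)."""

    def __init__(self, render_from_database):
        self._render = render_from_database  # the slow path
        self._pages = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        # Cache hit: return the stored copy without touching the database.
        if path in self._pages:
            self.hits += 1
            return self._pages[path]
        # Cache miss: build the page the slow way, then store it for next time.
        self.misses += 1
        html = self._render(path)
        self._pages[path] = html
        return html

    def invalidate(self, path):
        # Drop the stored copy, for example after the content is edited.
        self._pages.pop(path, None)


# A dict stands in for the database; render_calls records every slow render.
database = {"/about": "About us"}
render_calls = []


def render_from_database(path):
    render_calls.append(path)
    return "<html><body>" + database[path] + "</body></html>"


cache = PageCache(render_from_database)
first = cache.get("/about")   # miss: rendered from the "database"
second = cache.get("/about")  # hit: served from the cache

# Saving new content invalidates the cached page, so the next visit re-renders it.
database["/about"] = "About our team"
cache.invalidate("/about")
third = cache.get("/about")   # miss again, now with the new content
```

Only the first visit and the visit after the edit reach the slow render function; every other request is answered from the stored copy.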

For our high-load websites we use a combination of caching technologies to maximize performance.  These include:

APC – a free, open, and robust framework for caching and optimizing PHP intermediate code.

Authcache – a caching system that serves cached pages to both anonymous visitors and authenticated (logged-in) users.

Memcached – a caching daemon designed especially for dynamic web applications to decrease database load by storing objects in memory.

CacheRouter – a caching system for Drupal that allows you to set individual cache tables to various cache technologies.
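
The idea of sending individual cache tables to different backends can be pictured with a small sketch.  This is a conceptual illustration in Python and not CacheRouter's real configuration or API: the DictBackend and BinRouter classes are invented for the example, and the choice of which table goes to which backend is arbitrary.  Only the table names cache_page and cache_menu are taken from Drupal itself.

```python
class DictBackend:
    """Stand-in for a real backend such as APC, Memcached or the database."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        self.store[key] = value


class BinRouter:
    """Sends each cache table to its configured backend, else to the default."""

    def __init__(self, default, routes):
        self._default = default
        self._routes = routes

    def backend_for(self, table):
        return self._routes.get(table, self._default)

    def get(self, table, key):
        return self.backend_for(table).get(key)

    def set(self, table, key, value):
        self.backend_for(table).set(key, value)


# Assumed setup: rendered pages go to an in-memory backend, everything else
# stays in the database.
db = DictBackend("database")
memory = DictBackend("memcached")
router = BinRouter(default=db, routes={"cache_page": memory})

router.set("cache_page", "/about", "<html>About us</html>")
router.set("cache_menu", "primary", ["Home", "About"])
```

Because the routing is decided per table, a busy table such as the page cache can be moved to a faster backend without changing how the rest of the site is cached.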

You can mix and match these technologies based on the needs of your site.  Other tricks to improve the performance of your Drupal site include tweaking the Drupal performance settings in admin/settings/performance:

  • Enable page compression
  • Enable block cache
  • Optimize CSS files
  • Optimize JavaScript files

Additional Resources

Drupal High Performance Group
This group is dedicated to solutions and approaches for high traffic, high performing Drupal sites. As such, it will deal with a lot of information around the rest of a typical Drupal “stack” — the operating system, web server, database, and PHP tweaks that combine to support the Drupal application.

Drupal Server Tuning Considerations
