MCache: Web Session Clustering
By Mark L. Woodward
February 27, 2007
Copyright © Mohawk Software, 2001, 2007
MCache is somewhat unique in what it does. It was developed for a
specific class of problem that is poorly served by existing tools: a
shared cache system that isn't necessarily tied to a specific data
source. To understand what MCache does, you first need to understand
the construction of a classic high performance web site.
A typical high volume website design is based on a logical
separation of tasks into multiple levels and servers, where each
successive level is more specialized and thus harder and/or more
expensive to replicate.
Load Balancer
The load balancer accepts requests from the internet and, using
various algorithms, directs each request, invisibly to the web
browser, to a web server. It could be a Local Director, an Alteon, or
even a Linux LVS system (http://www.linuxvirtualserver.org).
Web Server - Tier-1
The web servers are just any old web servers: IIS, Apache, etc.
They can be stand-alone or in a cluster; they are designed not to
know or care how they are configured. The same web code that runs on
one machine should be able to run across 100 machines.
The purpose of the web servers is to provide presentation and
incidental functions for the end user.
SQL Database and Application Servers - Tier-2
The back-end servers provide the actual purpose of the site. They
perform banking transactions, record the contents of your shopping
cart when you order, and so on.
Classic N-Tier Model
An N-Tier model, as shown in the above web site example, is
designed so that the higher the level, the less critical and more
plentiful the resource. It is the job of higher levels to isolate the
lower levels and perform work that is more easily distributed. This
design works best when the amount of work at the lower levels is much
reduced from that of the higher levels. An architecture may have more
tiers than simply two, but each additional tier can add latency to
requests so there is often a point of diminishing returns. Also,
items like application servers may, themselves, be an N-Tier system.
While this is typically called “scalable,” the design is
limited by the utilization of the lower tiers. A proper architecture
factors this in by caching data acquired from a lower level in the
level requesting the data. In the above example, the web servers
should cache as much data as they can at the web server level and
only request data from the lower level when absolutely needed.
False Scalability
The big problem with this design is false scalability when it is
used incorrectly. Many web designers think of databases and business
logic servers as sources of infinite performance. Whenever they need
data, they think nothing of hitting a lower tier. When they want to
store data, no matter how temporary, they use the database without a
second thought.
Unfortunately, the resources are not infinite. No matter how many
web servers you have, if you don't regulate or limit access to the
lower levels, your site will fall apart under heavy load. Sure, you
can add all sorts of additional servers, but that just adds cost and
complexity to your system. Proper design reduces complexity and cost
and increases reliability by removing unneeded systems.
The unfortunate truth is that any coherent system, no matter how
scalable, is limited by some logical bottleneck. A good design
balances the tiers as best as possible, but regardless of what you do,
every design has a limit. Scalability is not something you can
add after the fact with extra machines; it is something that happens
BEFORE you start coding.
Sessions and Performance
Web server environments like Java, IIS, and PHP all support the
notion of a web “session.” Put simply, a web session is a block
of data that hangs around from one HTTP request to another. Using web
sessions can dramatically improve site performance and should not be
underestimated.
In the PHP environment, a session is represented by an
alphanumeric string of random characters. The string is used to
access data stored locally on the server; if you use the PHP
defaults, this is typically a file stored in the “/tmp”
directory. The session reference string is stored as a cookie in the
user's web browser. Each time the user returns to the site, the
cookie is sent, the session data is retrieved from the file in “/tmp”,
and the web service can continue from where it left off.
By putting data in the session that otherwise would have had to be
retrieved from a database or business logic server, you reduce the
load in the lower tiers. This should ultimately allow your site to
handle more traffic. The session file acts like an elementary caching
mechanism.
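For example, a user profile that would otherwise be fetched from the
database on every page view can be fetched once and kept in the
session. Here is a minimal sketch, assuming a hypothetical “users”
table and a user id already stored in the session:

session_start();

# Fetch the profile from the database only on the first request,
# then serve it out of the session on later requests.
if(!isset($_SESSION['profile']))
{
    $db = pg_connect("dbname=mysite");   # hypothetical connection
    $result = pg_query($db, "SELECT name, email FROM users WHERE id = " .
        (int)$_SESSION['user_id']);
    $_SESSION['profile'] = pg_fetch_assoc($result);
}
$profile = $_SESSION['profile'];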
Cache Coherency
Nothing comes for free. Once you start using sessions as a cache,
you run into another problem: cache coherency. Put simply, cache
coherency is ensuring that what is in the cache accurately represents
the data it is supposed to.
With a single machine, this is easy: the session file is most
likely correct. If you have multiple web servers and each web server
has its own copy of the session, who has the accurate copy? The
answer is: you don't know. I believe this is why most web developers
fail to create web sites that scale properly. Since the “session”
systems don't seem to work across multiple machines, they just ignore
the session system and use a SQL database. The architect should not
allow this; there are several options for getting a number of web
servers to implement a single site with coherent session information.
Sticky Sessions
The easiest solution is the notion of “sticky sessions.” Most
load balancers can “pre-process” and understand HTTP requests, and
can be configured to recognize various cookie variables or TCP/IP
addresses and send repeat sessions to the same servers.
This is a really simple solution to a potentially difficult
problem. However, it has a few issues: it limits the ability of the
load balancer to properly balance the site, because it ties a
particular user or group of users, by TCP/IP address, to a specific
server; it creates extra load on the load balancer, which has to
parse the HTTP requests; and if anything happens to the web server
(i.e. it gets hacked), those users lose their session data. As the
capacity of the site increases, the load balancer can become the
limiting system, or “bottleneck.”
Network File System Based Sessions
The next solution is the most logical
one: use a network file system to share session files across multiple
machines. Any web server in the cluster can use any session, leaving
the load balancer free to perform properly; the load balancer doesn't
have to “pre-process” the web request.
The problems with network file systems
are many, and NFS in particular has a couple of issues: NFS's file
locking is inadequate and can allow corruption of the session data,
and NFS is slow, generates a lot of network traffic, and can impact
the back end networking infrastructure.
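If you do take this route, at least the change on the web servers is
small. With PHP's default file-based handler, every server in the
cluster simply points its session save path at the shared mount (the
mount point below is hypothetical):

# In php.ini on every web server:
#   session.save_path = "/mnt/nfs/sessions"
# Or at runtime, before the session is started:
ini_set('session.save_path', '/mnt/nfs/sessions');
session_start();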
Database Session Table
The next common solution is using a database to store session
information. (Since most web sites have a database, this is usually
easy to do.) Although it isn't as scalable as sticky sessions or as
simple as using an NFS share, it doesn't share their problems either.
The problem with database session managers is the database itself.
Databases just aren't intended to hold simple temporary “cache”
data. They can, of course, but it is not the best utilization of
resources. A good database is designed to preserve data integrity
at all costs. They are designed to analyze and optimize queries and
use the best methods to create relational joins. The amount of
overhead for the simple “select” and “update” of a session
table is like using a sledgehammer to kill a fly.
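For reference, this is roughly what a database-backed session manager
looks like in PHP, using session_set_save_handler(). It is a minimal
sketch, assuming a hypothetical “sessions” table with “session_key”
and “session_data” columns, and it omits the row locking discussed
later under “False or 'Phantom' Updates”:

$db = pg_connect("dbname=mysite");   # hypothetical connection

function sess_open($save_path, $name) { return true; }
function sess_close()                 { return true; }

function sess_read($id)
{
    global $db;
    $r = pg_query_params($db,
        "SELECT session_data FROM sessions WHERE session_key = $1",
        array($id));
    $row = pg_fetch_row($r);
    return $row ? $row[0] : "";
}

function sess_write($id, $data)
{
    global $db;
    # Try an UPDATE first; fall back to INSERT for a new session.
    $r = pg_query_params($db,
        "UPDATE sessions SET session_data = $2 WHERE session_key = $1",
        array($id, $data));
    if(pg_affected_rows($r) == 0)
        pg_query_params($db,
            "INSERT INTO sessions (session_key, session_data) VALUES ($1, $2)",
            array($id, $data));
    return true;
}

function sess_destroy($id)
{
    global $db;
    pg_query_params($db,
        "DELETE FROM sessions WHERE session_key = $1", array($id));
    return true;
}

# A real handler would expire old sessions here.
function sess_gc($maxlifetime) { return true; }

session_set_save_handler('sess_open', 'sess_close', 'sess_read',
                         'sess_write', 'sess_destroy', 'sess_gc');
session_start();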
What's even worse are the characteristics of database performance
in a highly volatile data environment. There are basically two types
of database (three if you count the really broken ones): the first
are the simplistic databases that use locking mechanisms during
access, and the second, more advanced ones, use some sort of MVCC
(Multi-Version Concurrency Control).
Locking Databases
A locking database, something like
MySQL with its default MyISAM tables, is one in which every request
to the database results in some sort of lock. A SQL “SELECT”
results in a read lock on the table. Any number of SELECTs may be
running at a time, but while they are, no INSERTs, UPDATEs, or
DELETEs can run. When an INSERT, UPDATE, or DELETE is being executed,
no SELECT statements (or other INSERTs, UPDATEs, or DELETEs, for that
matter) can run.
These types of databases are a very bad
choice for session managers because they tend to perform worst
exactly when they are needed most. The higher the volume of traffic
to your site, the more table locking gets in the way.
MVCC Databases
An MVCC database doesn't lock this way;
it typically uses “row versioning.” With a locking database, when an
update is executed, the table is locked, the row to be modified is
found, the data is modified and saved back to disk in place of the
old data, and then the table is unlocked. With an MVCC database, the
latest version of the row to be modified is found, and it is modified
and saved to disk as a new version of the row. The old version and
its values remain. No locking happens.
MVCC databases perform much better in a
volatile data environment than locking databases do, but they are not
perfect. The PostgreSQL database, for instance, has a pronounced
problem in a web session scenario. Since previous versions of a row
exist simultaneously with newer versions, each query must search
through the older versions to find the correct one. Each UPDATE made
to a row increases the amount of work required to find the correct
version of the row for the next query. In PostgreSQL, the VACUUM
command cleans out the old, dead versions of rows, but again, the
next update causes a new version of the row to be added, once again
adding processing to find the latest version of the row.
To see this behavior in PostgreSQL,
create a table and insert a row. Now, write a quick program to update
that row ten or twenty thousand times. By the end of it, you'll see a
lot of disk activity and a huge reduction in performance. Now, update
the row once more and note how long it takes and how much disk access
happens. Run the VACUUM command. Now try to update the row again.
You'll see performance has been restored.
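Here is a rough sketch of that experiment using PHP's PostgreSQL
functions and a hypothetical scratch table:

$db = pg_connect("dbname=test");
pg_query($db, "CREATE TABLE bloat_test (id integer, val integer)");
pg_query($db, "INSERT INTO bloat_test VALUES (1, 0)");

# Hammer the same row; every UPDATE leaves a dead row version behind.
for($i = 0; $i < 20000; $i++)
    pg_query($db, "UPDATE bloat_test SET val = $i WHERE id = 1");

# Time one more update against the bloated table...
$start = microtime(true);
pg_query($db, "UPDATE bloat_test SET val = -1 WHERE id = 1");
printf("before vacuum: %.4f seconds\n", microtime(true) - $start);

# ...then clean out the dead versions and time it again.
pg_query($db, "VACUUM bloat_test");
$start = microtime(true);
pg_query($db, "UPDATE bloat_test SET val = -2 WHERE id = 1");
printf("after vacuum:  %.4f seconds\n", microtime(true) - $start);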
This may seem like an extreme case,
and in some ways it is, but given a busy site with a few thousand
active users, each viewing between 5 and 10 pages per session on
average, this could easily result in dramatic session table clutter
and slowdown in PostgreSQL. On a busy site, you will need to run
VACUUM quite frequently.
While PostgreSQL's pathological
behavior is more pronounced, most MVCC systems suffer the same
problem; they just hide it better. The telltale sign is disk I/O.
False or “Phantom” Updates
Regardless of whether you use one or one hundred web servers,
phantom updates will get you if you are not careful. What is a
“phantom update”? Imagine your website is built using HTML frames,
and your top level FRAMESET calls out three FRAMEs. The user hits the
top level page, the browser parses the FRAMESET and issues three
simultaneous HTTP requests, one for each frame page. Each HTTP
request gets its own process or thread on the web server (or
servers), each gets the same session cookie, and each retrieves the
same session information. When each completes its task, it saves its
session state and finishes.
That may sound like the correct operation, but it is broken and
can lead to lost data or data corruption. Imagine each page
increments a counter in the session. In PHP this may look something
like this:
$_SESSION['count']++;
When the three frame pages are fetched, they each get the same
copy of the session. Let's assume the value of 'count' is “1.” Each
process sees a value of “1,” each will increment its value, and
each increment will result in a value of “2.” At the end of its
processing, each process will save its session data. The result in
the session will be that of the last web process to finish: the value
of 'count' will be “2.” This is, in fact, an error. Starting with
“1,” three frame pages each incrementing the counter should result
in “4.”
What happened? This is the phantom update problem, and it occurs
when data is modified without regard to other processes that may also
be modifying the data. If you were writing a multi-threaded program,
you'd use a mutex, semaphore, or critical section to serialize access
to the data. Surprise! Web pages can get bitten by the same problems
as applications. The PHP “mod_files” session module handles this by
locking the session files in the “/tmp” directory with flock().
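The same flock() technique is available to your own code for any
file-based shared data. A minimal sketch of a counter that cannot
lose updates:

# Open the counter file (creating it if necessary) and take an
# exclusive lock; flock() blocks until no other process holds it.
$fp = fopen('/tmp/counter', 'a+');
flock($fp, LOCK_EX);

rewind($fp);
$count = (int)fgets($fp);

# Rewrite the file with the incremented value, then release the lock.
ftruncate($fp, 0);
fwrite($fp, (string)($count + 1));
flock($fp, LOCK_UN);
fclose($fp);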
This is another bug that a lot of web sites have without even
knowing it. Many simply use a SQL database to handle the session
information, assuming that the database will handle the problem,
unaware that they need to lock session information. It isn't the
database's fault; to properly handle sessions using a SQL database,
you need to use something like “SELECT * FROM sessions WHERE
session_key = 'foo' FOR UPDATE”, which will lock the row against
other processes.
Shared Variable Access
The previous “phantom update” problem alludes to a much larger
problem that seldom occurs to the web developer, or even to some of
the more experienced architects. A web site, especially one with
multiple web servers, shares all of the problems of a parallel
processing environment, where just about all non-local data accessed
by the code has the potential to be accessed simultaneously by two or
more processes.
This problem is deeper and more insidious than it may first appear.
Take, for instance, the common “internet poll.” Some number of
options will be presented for a vote. For the sake of example, let's
use three types of beer: Sam Adams, Bud Light, or Pabst Blue Ribbon.
The straightforward approach is almost always wrong.
Frighteningly, the following pseudo code is actually implemented in a
few CMS systems I have seen:
Step 1, get the poll information from the database with a simple
query that returns all the poll items and their counts.

$results = $sql->query("SELECT * FROM poll_table WHERE poll_name = 'beer_poll'");

Step 2, increment the counter that represents the user's choice.

$newcount = $results[$selection]['count'] + 1;

Step 3, write it back.

$sql->update("UPDATE poll_table SET count = $newcount WHERE poll_name = 'beer_poll' AND item = '$selection'");
What happens here is the “phantom update” all over again. Any
number of users could retrieve the same data, increment the same
item, and update the table and lose other poll results. Last one to
execute the update wins.
The only way to implement this poll correctly in an active
website, using a SQL database, is to use a locking mechanism like
“SELECT ... FOR UPDATE” followed by the UPDATE and a COMMIT. Not
even the common “UPDATE poll_table SET count = count + 1 WHERE
poll_name = 'beer_poll' AND item = 'samadams'” works unless your
transaction is set to “serializable.” (The flip side of the
“phantom update” is the “phantom read.” It is basically the
same problem, only seen from the perspective of the reader instead
of the writer.)
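Against PostgreSQL, the corrected vote might look like the following
sketch, using the hypothetical poll_table from above:

pg_query($db, "BEGIN");

# Lock just this poll row; concurrent voters queue up here instead
# of silently overwriting each other's counts.
$result = pg_query_params($db,
    "SELECT count FROM poll_table
      WHERE poll_name = 'beer_poll' AND item = $1 FOR UPDATE",
    array($selection));
$row = pg_fetch_assoc($result);

pg_query_params($db,
    "UPDATE poll_table SET count = $1
      WHERE poll_name = 'beer_poll' AND item = $2",
    array($row['count'] + 1, $selection));

pg_query($db, "COMMIT");   # the row lock is released here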
MCache
MCache is designed to address many of the challenges of a high
volume website. It is sort of a cross between a session manager, a
database server, and a distributed lock manager.
Cluster Web Session Manager
The first thing MCache can do for you is make setting up two or
more web servers behind a load balancer a snap. By using the MCache
extension for PHP, multiple web servers can share and access session
information seamlessly with little or no modification to your web
code and with no fussing with load balancers, NFS file servers, or
SQL databases.
MCache takes care of “phantom updates” and “cache coherency”
at the session level with no additional coding by the web
application.
Improve Performance with Caching
With a single web server or a moderately sized cluster of web
servers, MCache improves performance by maintaining an in-memory
cache of session information.
With multiple web servers, MCache improves overall performance by
reducing the load on a back end database, and it typically
outperforms SQL databases on the same hardware because it is not a
general purpose application; it is designed to perform a very
specific number and type of operations.
Manages Shared Variable Access
Shared variables are one of the more difficult constructs in a web
site, and they are easily managed with MCache. Simply by using the
MCache API, almost all of the concurrency issues one needs to deal
with are handled by MCache. For instance, the previous “beer_poll”
example can be implemented in PHP this easily:
$pcreated = mcache_getdata('beer_poll');
if(!$pcreated)
{
    # Create poll object if not there
    mcache_create('beer_poll', 'poll_class', '');
    mcache_setdata('beer_poll', 'created');
}

# Voting in poll
if($selection)
    mcache_inc('beer_poll', $selection);

# Display poll
$polldata = mcache_get_array('beer_poll');
display_var("Beer Poll", $polldata);
The above code creates a “beer_poll” object and then
facilitates voting. It really is that easy, and it works, on one
machine or many, quickly and reliably.
Sharing Across Application Bases
While not a completely new or unique idea, MCache can facilitate
the sharing of data across application code bases in much the same
way that a SQL database would. While PHP is well supported, the API
is provided as a set of C function calls, so almost any application
that can use a C library can use MCache.