Is the Glass Half Full or Half Empty?
Threads, Amdahl, and a very big machine
By: Peter Holditch
Feb. 18, 2006 10:30 AM
In this column over the years, I have spent a considerable amount of time talking about contention and locking in the database tier. At the end of the day, the endless conversations about scaling the application tier boil down to less than a bag of beans if a scaled application can't go any faster because the database has hit a limit.
This is all well-understood stuff at the database tier. A lot of effort has gone in over the years to remove lock contention wherever possible. In particular, optimistic locking has become a well-established technique for reducing data contention. With optimistic locking, the database gives its locks out to multiple accessors simultaneously, on the assumption that they will not make conflicting updates to the data. When a transaction commits, if it turns out that two accessors did actually trample on each other, one of the transactions is rolled back with an exception: the database's optimism was unfounded, and the access really needed to be serialized in the more traditional pessimistic way.
The success of this technique in increasing throughput is not magic. It works because what is being contended for is usually only the lock, not the data itself (this being the key optimistic assumption). If the actual data were being contended for a majority of the time, optimistic locking would result in a net slowdown, brought about by the endless collisions and retries. So why am I rambling at high speed through this résumé of transaction design and database lock management?
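The commit-time check described above can be sketched in a few lines of Java. This is a minimal in-memory illustration, not a database API: the class `OptimisticAccount`, its `Snapshot` type, and the methods `tryCommit` and `depositWithRetry` are hypothetical names, and the version counter is an assumed stand-in for whatever a real database uses to detect conflicting updates.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical in-memory stand-in for a row guarded by optimistic locking.
public class OptimisticAccount {
    // Immutable snapshot: the balance plus the version it was read at.
    static final class Snapshot {
        final long balance;
        final long version;

        Snapshot(long balance, long version) {
            this.balance = balance;
            this.version = version;
        }
    }

    private final AtomicReference<Snapshot> row =
            new AtomicReference<>(new Snapshot(0, 0));

    // Readers never block; they simply take the current snapshot.
    Snapshot read() {
        return row.get();
    }

    // "Commit": succeeds only if nobody else committed since `seen` was read.
    boolean tryCommit(Snapshot seen, long newBalance) {
        return row.compareAndSet(seen, new Snapshot(newBalance, seen.version + 1));
    }

    // On a collision, re-read and try again; returns the number of attempts.
    int depositWithRetry(long amount) {
        int attempts = 0;
        while (true) {
            attempts++;
            Snapshot seen = read();
            if (tryCommit(seen, seen.balance + amount)) {
                return attempts;
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        OptimisticAccount account = new OptimisticAccount();
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) {
                    account.depositWithRetry(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        Snapshot end = account.read();
        System.out.println("balance=" + end.balance + " version=" + end.version);
    }
}
```

Every deposit lands exactly once, however the threads interleave. The cost of contention shows up only as extra attempts in the retry loop, which is the "endless collisions and retries" slowdown when the data really is fought over.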
"You cannot change the laws of physics!" So, you say, apps are full of singletons; Java in general and Java application servers in particular have a capacity for running lots of threads through them, so why hasn't every highly threaded java application ground to a halt long ago? Amdahl's law (the particular "law of physics" in question), after all, states that you can only parallelize an application to the degree that the threads can be run independently - what gives! The answer turns out to be in the hardware that is running the software virtual machine. In a typical environment, eight CPUs constitute a pretty large machine, and it is more common to have deployed many smaller servers, with one or two CPUs each. Therefore, in this environment, if you code a Java application that spawns a number of threads that all access a single hashmap as part of their work, you will not see much difference to the throughput if you compare it with and without the hashmap as you increase the number of threads from, say 50 to 5,000. The reason for that is that even if you spawned 5,000 threads, they are still getting scheduled over the eight CPUs (or however many you have), so the level of parallelism you are achieving is much less than you expected - only eight threads can really physically run at once. This points up a fundamental scaling issue with Java on traditionally architected machines. The "thing there is not much of" - the place where contention will limit your throughput - is not the hashmap, it's the CPU itself. This is one of the reasons why Java applications typically end up deploying over a pretty large set of machines - that way you get enough CPUs to really physically run things in parallel and have a hope of meeting your throughput goals and other SLAs. It is a shame about the running costs of the resulting complex, partitioned system. 
This partitioning problem is one of those that Azul is addressing by manufacturing 384-processor SMP boxes that allow truly big Java virtual machines to run, removing the complexity of having to artificially break the application up into small pieces. "AHA!" you cry - now that lock thing in the singleton will kill you! You now have many truly concurrent threads all contending for those singletons.
It turns out that there is a solution to this problem supported in the hardware (here's an advantage of designing hardware explicitly to execute virtual machines), and it's called Optimistic Thread Concurrency, or OTC for short. As the name suggests, the system's behavior using OTC is the same as (or at least analogous to) the behavior of a database doing optimistic locking: the virtual machine allows many threads to pass through a lock on the assumption that bad things won't happen (that's the optimism part). If bad things do happen and data (rather than just the lock) is actually contended for, then the contending thread is reverted to the state it was in before it acquired the lock and it proceeds forward again - none the wiser, with the integrity of the data intact, and all with no change to the application code itself.
The trick is knowing which locks to be optimistic about. If a particular lock protects a piece of data that is nearly always written to, all that rolling back will hurt, not help, performance. For data that is read very frequently but seldom written, such as reference data, optimism will be the best policy.
In order to tune the OTC system for these different eventualities (which will doubtless coexist in a single app), there are three types of lock implemented internally by Azul's virtual machine. Thick locks are the usual pessimistic locks through which access will be serialized, and thin locks are the cheapest form of lock.
If a thin lock experiences contention, it must be promoted to either a thick lock or a speculative lock. A speculative lock sits between thin and thick: it allows multiple threads through, using the hardware to check that there is no contention and, importantly, to drive the rollbacks if contention occurs. Throughout the life of a virtual machine, data is kept about all the locks. Locks that are frequently contended will be thick, those that are never contended will be thin, and those that are "mostly" uncontended will be speculative. As usage patterns change, heuristics in the VM move individual locks between these three states, thus providing optimum parallelization, and hence throughput, for the application at run time.
Not only does this scheme improve parallelism with no code changes, it also allows simpler code to be used. Often, programmers in search of the ultimate throughput end up coding complex nested locking schemes, which are difficult to code and even more difficult to debug and maintain. With OTC, the locking can be coded in a coarse-grained manner (and therefore can be simple to code and maintain) without impacting throughput - for once the developers get to have their cake and eat it too.
So, fill your glass with OTC. You may find that that old glass of yours holds more than you think, if you deploy it on Azul! For more details about OTC, you can download the whitepaper from the Azul Web site, at www.azulsystems.com/products/whitepaper_abstracts.html.
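For readers who want to see the "pass through optimistically, fall back if a write intervened" idea in ordinary Java, the sketch below uses `java.util.concurrent.locks.StampedLock`, a standard JDK class available since Java 8. It is only a software analogue and not Azul's OTC: the class `ReferenceRate` and its field are hypothetical, and here the programmer writes the validation and fallback by hand, whereas the column describes OTC as doing the equivalent work in the VM and hardware with no change to application code.

```java
import java.util.concurrent.locks.StampedLock;

// Software analogue only: read-mostly reference data guarded by a StampedLock.
public class ReferenceRate {
    private final StampedLock lock = new StampedLock();
    private double rate = 1.0; // hypothetical, rarely updated reference value

    // Optimistic path: read without blocking, then check that no write intervened.
    double read() {
        long stamp = lock.tryOptimisticRead();
        double seen = rate;
        if (!lock.validate(stamp)) {
            // A writer got in: discard the value and redo the read pessimistically.
            stamp = lock.readLock();
            try {
                seen = rate;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return seen;
    }

    // Writers always take the exclusive lock.
    void update(double newRate) {
        long stamp = lock.writeLock();
        try {
            rate = newRate;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public static void main(String[] args) {
        ReferenceRate r = new ReferenceRate();
        r.update(1.25);
        System.out.println(r.read());
    }
}
```

As with the speculative locks described above, this pays off when writes are rare. If `update` were called constantly, most optimistic reads would fail validation and the fallback path would dominate.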