[ Team LiB ] |
3.2 ConcurrencyTransaction management is a big enough problem worrying only about consistency, atomicity, and durability. Perhaps one of the biggest issues is isolation—how do you manage the concurrent access to database resources by many different users? The mechanism your database uses is generally hidden from the software architect and database programmer. A data architect, however, should be familiar with how their database of choice manages isolation. 3.2.1 Isolation LevelsBefore talking about the various mechanisms to support transaction isolation, it is useful to examine the concept in more detail. As I mentioned earlier, isolation is the I in ACID. When a second transaction attempts to read the data being modified by a first transaction, that second transaction will not see any changes the first transaction has made. For example, if my wife read a joint savings account balance at the same time I was transferring money from savings to checking, the following events might occur:
My wife should see $1,000 as the balance in the savings account even though the system has already debited the savings account. She reads the old value because the transfer transaction is in an inconsistent state at the time of her request. So far in this chapter, we have looked at transaction isolation only in the context of full transaction isolation—where a transaction can act on the database as if it were the only thread operating on that database. In the real world, databases must support concurrent access by multiple transactions. Business applications in particular have significant concurrency demands. Because of the performance penalty of full transaction isolation, it is rarely practical for most systems. ANSI SQL identifies four distinct transaction isolation levels that enable you to balance performance needs with data integrity needs. In order to understand the different transaction isolation levels, however, you should first understand a few terms relating to the interaction of concurrent transactions:
The four ANSI SQL transaction isolation levels are:
Assuming your database engine supports these levels of transaction isolation, a Java database application can control its transaction isolation level through the JDBC Connection interface using the setTransactionIsolation( ) method: Connection conn = dataSource.getConnection( ); conn.setTransactionIsolation( Connection.TRANSACTION_READ_COMMITTED); // perform your transaction To find out what transaction isolation levels your database engine and driver support, use the supportsTransactionIsolationLevel( ) method in DatabaseMetaData. Table 3-1 shows the transaction isolation constants in java.sql.Connection. You may also want to check whether the database supports transactions at all (mSQL, for example, does not support them). The method supportsTransactions( ) answers this question.
As an application architect, your job is to balance database integrity with application performance. You therefore want to select the lowest level of transaction isolation that will protect the integrity of the data as you have it structured in your data model.
3.2.2 LockingThe way in which your database engine manages concurrency is specific to your database engine. As long as it maintains ACIDity, it does not matter from a standards perspective how it does it. Nevertheless, database engines all use some form of locking. Conceptually, concurrency at the database level can be understood using Java as an example. Java, of course, uses locks on objects to manage concurrent access to synchronized blocks of Java code. Consider the following three methods: public class MyClass { public synchronized void first( ) { // do something } public void second( ) { synchronized( this ) { // do something else } } public void third( ) { synchronized( otherObject ) { // do yet a third thing } } } If two threads call first( ) at the same time, one will be allowed to execute the method to completion before the other can start. This sequencing occurs because the synchronized keyword indicates that you are locking against a monitor. In the default case, the monitor is always this. Thus the monitor in first( ) and second( ) is this and the monitor for third( ) is otherObject. Only one thread at a time can execute code monitored by a common object. In this example, first( ) and the synchronized section of second( ) cannot be executed concurrently by multiple threads. However, first( ) and third( ) can be concurrently executed by two separate threads since they are guarded by different monitors. Databases use a different concurrency mechanism, but the Java concepts are relevant. A database allows for locks to be taken on different scopes of data. A database can lock a table at any of the following levels:
Depending on your database engine, your lock level can also be modified as read, update, exclusive, shared, or any other number of attributes. SQL Server, for example, supports the following kinds of locks:
You need a nice spreadsheet and an understanding of your database of choice to appreciate how two kinds of locks interact. For example, many different threads can acquire a shared lock. These shared locks may—depending on the database engine—prevent another thread from acquiring an update lock until the shared locks are released. Because interaction varies by database, a discussion is beyond the scope of this book. The easiest lock level to understand is a table-level lock. In this case, only one thread can touch the locked table during the lifetime of the lock. This approach is much like using MyClass.class as the monitor for access to all methods in your MyClass class. In other words, no two threads would be able to access any MyClass instance concurrently. As you can imagine, such a trick would not perform well in a Java application. Similarly, table-level locking performs horribly for databases. Page-level locking has no Java analogy. In short, it prevents concurrent access to all of the rows in a single page of memory. If, for example, you have rows 1-3 in one page of memory and 4-6 in another, access to data from rows 1-3 with page-level locking prevents other threads from accessing any of those three rows. A concurrent thread can, however, access the data in rows 4-6. Row-level locking is like making every nonstatic method in a Java class synchronized. It prevents concurrent access by two threads to the same row in the database. Two threads may, however, access two different rows concurrently. On one hand, locking is necessary to guarantee proper transaction isolation in a concurrent environment. On the other, locking slows down database applications. Any transaction that has a long duration and a large number of resources associated with it is going to drag down any application. Later in the chapter, I introduce some approaches to concurrency that enable you to strike a balance between isolation and performance. |
[ Team LiB ] |