Sunday, September 03, 2006

Oracle Performance::Locks
How Oracle Uses Locks to Resolve Contention
Zeinab

Data Integrity

We use the term data integrity to denote the dependability of data. It is an overall measure of the quality of your data. It stresses the fact that the data should be both accurate and consistent.

In terms of multiple users, data integrity implies that
  • Changes made by one user do not overwrite changes made by another user based on the same set of criteria (i.e., to the same data) unknowingly.
  • Data returned to a user will represent a consistent set of data at any one point in time ("snapshot").
Perhaps the best feature of Oracle when compared to other databases is that it provides a 100 percent concurrency to its users all the time. What your users are getting is consistent data, all the time.

Concurrent User Integrity Problems

Ideally, every user would like to have the undivided attention of the database server ("isolation"). However, this is impossible in a data server, as its very reason for existence is to act as a central repository enforcing business logic and to share data among many users. Users are constantly accessing the resources (e.g., data) from a database. In a large system with multiple users, it may so happen that users are accessing the same type of resource. This might create different types of problems. There are three classic cases.
  • Lost Update
  • Nonrepeatable Read
  • Phantom Read
We shall look at these three cases.

Case 1: Lost Update


Fig 1: Lost Update
Fig 1: Lost Update

In the first case, the following things happen.
  • User A and User B both are interested in the same record (row) of data. A queries row 123 first.
  • At about the same time, but after A, B also queries the same row.
  • B updates the row and COMMITs.
  • After B has committed, A updates the same row and COMMITs.
  • The changes made by B are lost.
This is the case of the lost update. This is a common problem in OLTP systems using Java, VB, etc where the developer has to include code for handling this sort of a situation.

Case 2: Nonrepeatable Read


Fig 2: Non Repeatable Read

Fig 2: Nonrepeatable Read

To illustrate this case also, we make use of two users accessing the same type of data. Quite simply, it is the case of data being modified by another user so that summary calculations (such as the result of aggregate functions like average, sum, etc) are affected.
  • User A issues a SELECT query for a particular item for which the quantity ordered > 50. Row 123 is returned, as quantity ordered = 60.
  • User B reads the same row and updated quantity ordered = 40 and COMMITs.
  • Somewhat later, after performing some manipulations, A repeats his query, only to find that Row 123 is not returned (as its value has been modified).

Case 3: Phantom Read

A phantom read can occur when a user selects data with a limiting WHERE clause. Consider the following figure.

Fig 3: Phantom Read
Fig 3: Phantom Read

The following things happen.
  • A issues a SELECT statement with a where clause, and gets 998 rows. He performs some manipulations based on this result.
  • B now inserts a row 999 which qualifies for A's WHERE condition and COMMITs.
  • A issues his SELECT statement again. Now he gets 999 rows. His computations are possibly ruined.
Phantom reads and nonrepeatable reads could change the value of aggregate functions during the course of a transaction. This would seriously affect data integrity.

Locks

Locks are the resources used by Oracle to implement isolation in the database. Locks are used to isolate activity during a transaction and are automatically released when a transaction is concluded. there are basically two types of lock.
  • A shared lock is usually used wen you read data. Many shared locks can be placed on an individual row. Shared locks prevent the problem of non-repeatable reads.
  • Write locks, or exclusive locks, are used to prevent more than one user from writing to a row at a time. A database cannot issue an exclusive lock for a resource that already has a shared lock on it.
When you request data to read, a shared lock is used; when you write data, an exclusive lock is used. If a user is reading a row, no one else can update that row; similarly if someone is updating a row, no one else can update or read that row.

Contention and How Oracle Solves It

Locks, as we have seen, allow some users to access data while preventing some other users from accessing the same data. On an online database system, hundreds of users are interacting with the data, and there is constant contention for the resources. A system of locks can cause performance degradation when contention increases. Thus, we see that the two desirable qualities of an online database - number of users and performance - do not necessarily scale together. When usage increases, contention also increases, but performance degrades.

Oracle is perhaps the only database that has sorted out the issue cleanly and logically, through its Multiversion Read Consistency approach. This is how it works.
  • Whenever a transaction changes data in the database, Oracle creates a new version for the changed row.
  • Rollback buffer stores previous versions of rows.
  • Each row is marked with the SCN of the transaction that changed the values. The SCN is an absolute serial record of the transaction.
  • When a transaction is retrieving rows to satisfy a query, it checks the SCN of the latest version of the row. If the SCN is lower than the SCN associated with the (calling) transaction, the data from the row is retrieved. If higher, then the higher version row data is not used, and Oracle digs into the rollback buffers to retrieve an earlier version of the row with an appropriate SCN.
Multiversion Read Consistency approach gives each user a specific view of their data, at the point of time their transaction began. Hence, Oracle does not need read locks. This leads to two distinct advantages.
  • Write operations do not block read operations. The version of the data that has been read will be stored as a version in the database.
  • More importantly, read operations do not block write operations. A transaction can grab a write lock whenever it needs to, whether anyone is reading the data or not.
Okay, this means well, but as a developer you have a distinct advantage with Oracle. You need to spend less time writing code to deal with the possibility of contention, or its side effects. For this reason, Oracle, though horribly expensive, is perhaps the easiest and most developer friendly database to work with.

Though we can't very well dwell on the topic of how Oracle manages locks at this point in time, it suffices to say that Oracle does not manage locks in memory, as some databases do. This means that performance does not take a hit by the locking mechanism. All things considered, Oracle has an exceptionally robust and well-designed locking mechanism in place to allow as many users to have a meaningful and dependable experience on an online data system.