Friday, May 23, 2008

Double Checked Locking

I still get a lot of questions about whether double-checked locking works in Java, and I should probably post something to clear it up. And I'll plug Josh Bloch's new book, too.

Double Checked Locking is this idiom:

// Broken -- Do Not Use!
class Foo {
  private Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      synchronized(this) {
        if (helper == null) {
          helper = new Helper();
        }
      }
    }
  return helper;
}

The point of this code is to avoid synchronization when the object has already been constructed. This code doesn't work in Java. The basic principle is that compiler transformations (this includes the JIT, which is the optimizer that the JVM uses) can change the code around so that the code in the Helper constructor occurs after the write to the helper variable. If it does this, then after the constructing thread writes to helper, but before it actually finishes constructing the object, another thread can come along and read helper before it is finished initializing.
See the double-checked locking is broken declaration for a much more detailed explanation.

Anyway, the question I always get is "does making the helper field volatile fix this?" The answer is yes. If you make the helper field volatile, the actions that happen before the write to helper in the code must, when the program executes, actually happen before the write to helper — no sneaky reordering is allowed.

Having said this, there are often much better ways of lazily initializing a singleton. One of the items in Josh Bloch's new revision of Effective Java (lgt amazon) deals exclusively with this problem. This book is chock-full of useful Java knowledge, and is highly recommended for all Java programmers. He has revised it for Java 6; if you only have the old edition, pick this up, because he deals with things like generics, the new threading stuff in JDK5, autoboxing, and how these things all work together to make your life unpleasant.

Sunday, May 11, 2008

Which Lock is Which?

I was at JavaOne last week, and attended Bill Pugh's Defective Java talk. Bill had an important point about what kind of locks you want to be using and what kind you don't, and that point is worth repeating.

Bill is, with Dave Hovemeyer, the guy behind FindBugs, which everyone should be using religiously. He was also my graduate advisor; we worked together on the new Java Memory Model. If there are two things Bill knows, they are concurrency and bug patterns.

There is a tremendously useful defensive locking discipline that I use all of the time, and recommend that other people use, too:

class Foo {
private final Object lock = new Object();
private Bar lockProtectsMe;
...
public void blah() {
synchronized (lock) {
// accesses LockProtectsMe...
...
}
}
}

This has the following benefits:
  • If you try to acquire the lock on the Foo object instead of the lock, you run the risk that some code outside the class obtains that lock. Perhaps forever.

  • Using the Foo object also has the downside that it is coarse-grained locking; if you use that object to protect some other field called, say, lockProtectsMeToo, then you won't be able to access both lockProtectsMe and lockProtectsMeToo at the same time, even if they have nothing to do with each other.

  • If you try to acquire the lock on the lockProtectsMe object, you run the risk that someone changes that field. Since you acquire locks on objects, not on fields, if this happens, you will be acquiring a completely different lock.

There are reasons not to use this pattern, of course, and if you are curious, I can fill you in (ask in the comments), but that's not really the point of this post.

Anyway, Bill noticed that quite a few people were using the following pattern to lock objects:

private final String myString = "LOCK";
private Object lockProtectsMe;

...
public void blah() {
synchronized (myString) {
// accesses lockProtectsMe
...
}
}

This is awfully similar to the pattern above, but should never be used. Why? It turns out that when you make a String final and declare it in the source code, that String will be interned. This means that there will be one canonical instance of that String for the entire JVM.

Now, if you are likely to use this pattern in one place, you are likely to use this pattern in lots of places...

class Bar {
private final String lock = "LOCK";
...
}

class Baz {
private final String lock = "LOCK";
...
}

All of these lock fields will be interned. They will all share one common instance. If you run the code:

class Bar {
static final String lock = "LOCK";
}

public class Baz {
static final String lock = "LOCK";
public static void main(String [] args) {
System.out.println(Baz.lock == Bar.lock);
}
}

It will print out true, because, as a result of the interning, those will be the same object. Since we synchronize on objects, not fields, any code that synchronizes on them will be using the same lock. This is potentially disastrous; imagine if you had this code somewhere:

Thread 1:

synchronized (a) {
synchronized (Baz.lock) {
// ...
}
}

Thread 2:

synchronized (Bar.lock) {
synchronized (a) {
// ...
}
}

We've now turned perfectly inoffensive code into a potential deadlock. And that's an ugly one, because it isn't going to be obvious why it happened.

You should be equally careful with synchronizing on autoboxed / unboxed types. The reason for this is similar, and left as an exercise for the reader.