Friday, April 25, 2008

Immutability in Java

Another topic that comes up again and again in the questions people ask me is, "How does immutability work in Java?" Immutability is a godsend for concurrent programmers: you don't have to do lots of sticky reasoning about which threads are updating which variables when, and you don't have to worry about cache thrashing or a whole class of related problems. When I write concurrent code (which is reasonably often), I try to make as many things immutable as possible.

Now, in common parlance, immutability means "does not change". Immutability doesn't mean "does not change" in Java. It means "is transitively reachable from a final field, has not changed since the final field was set, and a reference to the object containing the final field did not escape the constructor".
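The "did not escape the constructor" clause is the easiest one to break by accident. Here is a minimal single-threaded sketch of what an escape looks like. The class names (EscapeDemo, Leaky, Safe) are made up for illustration. It only shows that a leaked reference can observe a final field before it is assigned; across threads the problem is worse, because the final-field guarantee no longer applies to that object.

```java
import java.util.function.Consumer;

final class EscapeDemo {
    /** Hypothetical class that leaks 'this' before its final field is assigned. */
    static final class Leaky {
        private final int value;

        Leaky(Consumer<Leaky> listener) {
            listener.accept(this); // 'this' escapes here, mid-construction
            this.value = 42;
        }

        int value() { return value; }
    }

    /** Hypothetical class that finishes construction before handing itself out. */
    static final class Safe {
        private final int value;

        private Safe() { this.value = 42; }

        static Safe create(Consumer<Safe> listener) {
            Safe s = new Safe();  // construction is complete at this point
            listener.accept(s);   // only now does the reference leave
            return s;
        }

        int value() { return value; }
    }

    /** Returns what a listener sees when called from inside Leaky's constructor. */
    static int seenByLeakyListener() {
        int[] seen = new int[1];
        new Leaky(l -> seen[0] = l.value());
        return seen[0];
    }

    /** Returns what a listener sees when called after Safe's constructor. */
    static int seenBySafeListener() {
        int[] seen = new int[1];
        Safe.create(s -> seen[0] = s.value());
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(seenByLeakyListener()); // prints 0
        System.out.println(seenBySafeListener());  // prints 42
    }
}
```

Only the second pattern, where the reference is handed out after the constructor returns, meets the definition above.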

If those conditions are not all met, then even if a given field is never mutated, the Java memory model requires some form of synchronization before a thread can be sure that it sees the correctly constructed object the first time it reads it. That synchronization can be the use of volatile, static initialization, synchronized blocks, any of the java.util.concurrent collections, or a java.util.concurrent.atomic.AtomicFoo object (AtomicReference, say). Subsequent reads of the object by that same thread don't require additional synchronization.
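As a rough sketch of one of those mechanisms, the example below publishes an object with a deliberately non-final field through an AtomicReference. The names (SafePublicationDemo, Settings, publishAndRead) are hypothetical, and a volatile field or a synchronized block would serve the same purpose.

```java
import java.util.concurrent.atomic.AtomicReference;

final class SafePublicationDemo {
    /** Hypothetical payload with a deliberately non-final field. */
    static final class Settings {
        String name; // not final, so the final-field guarantee does not apply

        Settings(String name) { this.name = name; }
    }

    /** Publishes a Settings object to a reader thread and returns what it saw. */
    static String publishAndRead() {
        AtomicReference<Settings> shared = new AtomicReference<>();
        String[] seen = new String[1];

        Thread reader = new Thread(() -> {
            Settings s;
            // The get() that observes the set() below is the synchronization point.
            while ((s = shared.get()) == null) {
                Thread.yield();
            }
            seen[0] = s.name;
        });
        reader.start();

        // set() has volatile-write semantics, so the constructor's write to
        // 'name' is visible to any thread whose get() returns this object.
        shared.set(new Settings("prod"));

        try {
            reader.join(); // join() also makes the write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead()); // prints "prod"
    }
}
```

If the AtomicReference were replaced with a plain field, nothing in the memory model would oblige the reader to see the reference at all, or to see name set once it did.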

So, a correctly written version of HashMap that was immutable and thread-safe would look like this:

public class ImmutableHashMap<K, V> implements Map<K, V> {
    private final Map<K, V> map;

    public ImmutableHashMap(Map<K, V> map) {
        this.map = new HashMap<K, V>(map);
    }

    @Override
    public V get(Object key) {
        // And similarly all other accessors
        return map.get(key);
    }

    @Override
    public V put(K key, V value) {
        // And similarly all other mutators
        throw new UnsupportedOperationException();
    }
}

ETA: This is essentially how Collections.unmodifiableMap() works, except that it wraps the map it is given rather than copying it.

Because of the special meaning of the keyword "final", instances of this class can be shared with multiple threads without using any additional synchronization; when another thread calls get() on the instance, it is guaranteed to see the entries that the constructor copied into the map. You should probably still use something that is thread-safe to perform the handoff between threads (a LinkedBlockingQueue, for example), but if you forget to do this, then you still have the guarantee.
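For what that handoff might look like, here is a small sketch using a LinkedBlockingQueue. FrozenLookup is a hypothetical cut-down stand-in for ImmutableHashMap (a final field, a copy in the constructor, and a single accessor) so that the example stays short; the other names are made up as well.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class HandoffDemo {
    /** Hypothetical stand-in for ImmutableHashMap: copy once, read only. */
    static final class FrozenLookup<K, V> {
        private final Map<K, V> map;

        FrozenLookup(Map<K, V> source) { this.map = new HashMap<>(source); }

        V get(K key) { return map.get(key); }
    }

    /** Hands a FrozenLookup to a consumer thread and returns what it read. */
    static String handOff() {
        BlockingQueue<FrozenLookup<Integer, String>> queue = new LinkedBlockingQueue<>();
        String[] seen = new String[1];

        Thread consumer = new Thread(() -> {
            try {
                // take() blocks until the producer has put the lookup on the queue.
                seen[0] = queue.take().get(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        Map<Integer, String> source = new HashMap<>();
        source.put(1, "foo");

        try {
            queue.put(new FrozenLookup<>(source));
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff()); // prints "foo"
    }
}
```

The queue already provides the necessary synchronization here, so the final field is a second line of defense rather than the only one.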

There are two major points to make about this kind of "immutability":
  1. It's not immutable. So, I've completely misled you with mutable immutability. The following code is perfectly legal:

    HashMap<Integer, StringBuilder> map =
        new HashMap<Integer, StringBuilder>();
    StringBuilder builder = new StringBuilder();
    builder.append("foo");
    map.put(1, builder);
    ImmutableHashMap<Integer, StringBuilder> immutableMap =
        new ImmutableHashMap<Integer, StringBuilder>(map);
    builder.append("bar");
    System.out.println(immutableMap.get(1));

    I think we all know that that println() method is printing "foobar", not "foo". So, even if we call this an "immutable" hash map, its values (and keys!) can still be mutated. Mutating them is a bad idea, of course: other threads are not guaranteed to see the updates you make to supposedly immutable objects (at least, not without additional synchronization).

  2. The final field is absolutely necessary for the thread-safety guarantee. I recently saw an implementation of an ImmutableHashMap that looked more like this:

    public class ImmutableHashMap<K, V> extends HashMap<K, V> {
        public ImmutableHashMap(Map<K, V> map) {
            super(map);
        }

        @Override
        public V put(K key, V value) {
            // And similarly all other mutators
            throw new UnsupportedOperationException();
        }
    }

    This has the great virtue of avoiding the extra indirection of the delegation-based version, and also has the great virtue of being shorter (because you don't have to rewrite all of the accessors). The flip side is that if you share instances of this ImmutableHashMap with other threads, then you absolutely have to use synchronization, because it does not get the special guarantees that the final field provides. If another thread calls get() without that synchronization, it can actually get the wrong value out. It isn't likely to happen in practice right now, but compiler writers are allowed to perform optimizations that would expose it.
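Going back to the first point: one hedge against mutable values is to snapshot them into an immutable type when the map is built. The sketch below is only an illustration of that idea. The class and method names are made up, and it uses Collections.unmodifiableMap() over a fresh copy instead of the ImmutableHashMap above, just to stay short.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class DeepCopyDemo {
    /** Returns { value seen through the shallow copy, value seen through the snapshot }. */
    static String[] shallowVersusDeep() {
        StringBuilder builder = new StringBuilder("foo");
        Map<Integer, StringBuilder> source = new HashMap<>();
        source.put(1, builder);

        // Shallow: the map structure is copied, but the StringBuilder is shared.
        Map<Integer, StringBuilder> shallow =
            Collections.unmodifiableMap(new HashMap<>(source));

        // Deep: each value is converted to an immutable String at copy time.
        Map<Integer, String> snapshot = new HashMap<>();
        for (Map.Entry<Integer, StringBuilder> e : source.entrySet()) {
            snapshot.put(e.getKey(), e.getValue().toString());
        }
        Map<Integer, String> deep = Collections.unmodifiableMap(snapshot);

        // Mutate the original value after both maps have been built.
        builder.append("bar");

        return new String[] { shallow.get(1).toString(), deep.get(1) };
    }

    public static void main(String[] args) {
        String[] result = shallowVersusDeep();
        System.out.println(result[0]); // prints "foobar"
        System.out.println(result[1]); // prints "foo"
    }
}
```

Whether a snapshot like this is worth the cost depends on the value type; with values that are already immutable (String, Integer, and so on) the shallow copy is enough.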

So, the moral of the story is:
  • Use final fields whenever you can, and

  • Immutability is a funny thing.

That's all I wanted to say.

If you liked this post, read the followup: Immutability in Java, Part 2.