Sunday, February 7, 2010

Garbage Collection, [Soft]References, Finalizers and the Memory Model (Part 2)

In which I ramble on at greater length about finalizers and multithreading, as well as applying that discussion to SoftReferences. Read Part 1 first. It explains why finalizers are more multithreaded than you think, and why they, like all multithreaded code, aren't necessarily going to do what you think they should do.

I mentioned in the last entry that a co-worker asked me a question about when objects are allowed to be collected. My questioner had been arguing with his colleagues about the following example:

Object f() {
    Object o = new Object();
    SoftReference<Object> ref = new SoftReference<Object>(o);
    return ref.get();
}

He wanted to know whether f() was guaranteed to return o, or whether the new Object could be collected before f() returned. The principle he was operating under was that because the object reference was still on the stack, the object would not be collected. Those of you who read the last entry will know better, of course. To put it in the words of the Java Language Specification (section 12.6.1), "optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable."

In practice, what this means is that the VM can:
  1. Decide the variable o is no longer used after the last time it appears in the program,

  2. Clear the SoftReference immediately after it is constructed, and

  3. Return null
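
To make those three steps concrete, here is a hand-written stand-in for what the transformed method could look like. This is only a sketch: the class name ClearedEarlySketch is mine, and the explicit call to clear() is an assumption that plays the part of the collector. No VM literally rewrites your code this way; the point is that the observable result is one the spec permits.

```java
import java.lang.ref.SoftReference;

final class ClearedEarlySketch {
    // Hand-written stand-in for what an optimizing VM is permitted to do to f().
    static Object f() {
        // Step 1: o has no later use, so the only path to the new object is through ref.
        SoftReference<Object> ref = new SoftReference<Object>(new Object());
        // Step 2: the explicit clear() stands in for the collector clearing the reference.
        ref.clear();
        // Step 3: get() on a cleared reference returns null.
        return ref.get();
    }

    public static void main(String[] args) {
        System.out.println(f()); // prints "null"
    }
}
```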

Those of you who read my entry on how HotSpot decides to clear SoftReferences will know that this is extremely unlikely in HotSpot, where a SoftReference always survives for at least one collection. But in other VMs? I have no idea.

Anyway, the obvious question is: how do you prevent the collector from collecting this object? There are a few ways:

  1. The obvious answer is to return o instead of ref.get() (this was harder in the original code, which I've simplified to make my point).

  2. Another way is to make o reachable from a static field. That makes the object rather obviously reachable.

  3. A third way is to give the object a finalizer that synchronizes on the object, and then to synchronize on that object whenever you want it to stay alive.
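
The first two options are easy to sketch. The names below (KeepReachableSketch, cache, pinned) are made up for illustration and are not from the original code. Note that the static field in the second method pins the object until something overwrites the field, so it trades the early-collection problem for a possible leak.

```java
import java.lang.ref.SoftReference;

final class KeepReachableSketch {
    static SoftReference<Object> cache; // hypothetical soft cache slot
    static Object pinned;               // option 2: a strong root in a static field

    // Option 1: hand back the strong reference instead of going through ref.get().
    static Object returnStrong() {
        Object o = new Object();
        cache = new SoftReference<Object>(o);
        return o; // o is used by the return, so the object is still reachable here
    }

    // Option 2: the object is reachable from a static field, so the
    // SoftReference cannot be cleared while that field still points at it.
    static Object viaStaticField() {
        pinned = new Object();
        SoftReference<Object> ref = new SoftReference<Object>(pinned);
        return ref.get();
    }
}
```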

That last one probably requires a few more words. We wrote that into the spec so that the Finalizer Guardian idiom would work. The Finalizer Guardian idiom is described in Josh Bloch's invaluable book Effective Java. (Seriously, if you don't have a copy, go get one.)

Anyway, we put a nice little exegesis on the link between the Finalizer Guardian and this rule into the language spec, so just go ahead and read it there.
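
If you would rather see the shape of the third option before reading the spec, here is a minimal sketch. The class names LockKeepsAliveSketch and Tracked are hypothetical, and the empty finalizer body is a placeholder for whatever cleanup the real object needs. It relies on the rule that an object whose finalizer can synchronize on it must be treated as reachable while a lock is held on it. Newer JDKs deprecate finalize(), so treat this as an illustration of the rule rather than something to copy.

```java
import java.lang.ref.SoftReference;

final class LockKeepsAliveSketch {
    // Hypothetical class whose finalizer synchronizes on the object itself.
    static final class Tracked {
        @Override
        @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated in newer JDKs
        protected void finalize() {
            synchronized (this) {
                // cleanup would go here
            }
        }
    }

    static Object f() {
        Tracked o = new Tracked();
        SoftReference<Tracked> ref = new SoftReference<Tracked>(o);
        // While this lock is held, o must be considered reachable,
        // so ref may not be cleared before get() runs.
        synchronized (o) {
            return ref.get();
        }
    }
}
```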

Having said all of that, I will agree that this is a massive pain. It would be nice if keeping objects reachable-and-not-collected were a little easier. Doug Lea has proposed a method in his new Fences API called reachabilityFence. This method would keep an object reachable (in a garbage collection sense) until execution reached that point in the code. I'm on the fence about the Fences API, but I think that this particular method (which he alternatively called keepAlive) would be a useful tool.
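
For a sense of how such a method reads in use, here is a sketch that assumes Java 9 or later, where a method of this name exists as java.lang.ref.Reference.reachabilityFence. Whether that matches every detail of the Fences proposal described above is not something this sketch claims; the class name FenceSketch is made up.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;

final class FenceSketch {
    static Object f() {
        Object o = new Object();
        SoftReference<Object> ref = new SoftReference<Object>(o);
        try {
            return ref.get();
        } finally {
            // Assumes Java 9+: o stays reachable at least until this point,
            // which is after ref.get() has been evaluated.
            Reference.reachabilityFence(o);
        }
    }
}
```

The try/finally shape puts the fence after the last point where the object has to be alive, without changing what the method returns.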