
Garbage Collection, [Soft]References, Finalizers and the Memory Model (Part 2)

In which I ramble on at greater length about finalizers and multithreading, as well as applying that discussion to SoftReferences. Read Part 1 first. It explains why finalizers are more multithreaded than you think, and why they, like all multithreaded code, aren't necessarily going to do what you think they should do.

I mentioned in the last entry that a co-worker asked me a question about when objects are allowed to be collected. My questioner had been arguing with his colleagues about the following example:

Object f() {
    Object o = new Object();
    SoftReference<Object> ref = new SoftReference<Object>(o);
    return ref.get();
}

He wanted to know whether f() was guaranteed to return o, or whether the new Object could be collected before f() returned. The principle he was operating under was that because the object reference was still on the stack, the object would not be collected. Those of you who read the last entry will know better, of course. To put it in the words of the Java Language Specification (§12.6.1), "optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable."

In practice, what this means is that the VM can:
  1. Decide the variable o is no longer used after the last time it appears in the program,

  2. Clear the SoftReference immediately after it is constructed, and

  3. Return null.
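
Written out at the source level, those three steps look roughly like the sketch below. It only illustrates the effect: a real VM would do this internally, and the class and method names are made up for the example. The explicit `o = null` and `ref.clear()` stand in for what the optimizer and the collector are permitted to do on their own.

```java
import java.lang.ref.SoftReference;

class TransformedExample {
    // What f() is allowed to behave like after optimization (illustrative only).
    static Object fAsTransformed() {
        Object o = new Object();
        SoftReference<Object> ref = new SoftReference<Object>(o);
        o = null;          // 1. o is never read again, so it is treated as dead
        ref.clear();       // 2. stands in for the collector clearing the reference
        return ref.get();  // 3. get() on a cleared reference returns null
    }

    public static void main(String[] args) {
        System.out.println(fAsTransformed()); // prints "null"
    }
}
```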

Those of you who read my entry on how Hotspot decides to clear SoftReferences will know that this is extremely unlikely in Hotspot, where a SoftReference always survives for at least one collection. But in other VMs? I have no idea.

Anyway, the obvious question is: how do you prevent the collector from collecting this object? There are a few ways:

  1. The obvious answer is to return o instead of ref.get() (this was harder in the original code, which I've simplified to make my point).

  2. Another way is to make o reachable from a static field. That makes the object rather obviously reachable.

  3. A third way is to give the object a finalizer that synchronizes on the object, and then to synchronize on that object whenever you want it to stay alive.
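
The first two options can be sketched as follows. The soft cache and every name in it are hypothetical stand-ins for the original code, which isn't shown here, and the sketch ignores thread safety to keep the point visible.

```java
import java.lang.ref.SoftReference;

class KeepReachableSketch {
    // Hypothetical soft cache; not taken from the original code.
    private static SoftReference<Object> cache;

    // Option 1: hand back the strong local reference instead of ref.get().
    static Object getOrCreate() {
        Object o = (cache == null) ? null : cache.get();
        if (o == null) {
            o = new Object();
            cache = new SoftReference<Object>(o);
        }
        return o; // o is read here, so it cannot be treated as dead any earlier
    }

    // Option 2: a static field keeps the object strongly reachable.
    private static Object pinned;

    static Object fPinned() {
        Object o = new Object();
        pinned = o; // reachable from a static field from here on
        SoftReference<Object> ref = new SoftReference<Object>(o);
        return ref.get();
    }
}
```

Note that the static field in option 2 keeps the object alive until the field is overwritten or cleared, which may be longer than you want.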

That last one probably requires a few more words. We wrote that rule into the spec so that the Finalizer Guardian idiom would work. The Finalizer Guardian idiom is described in Josh Bloch's invaluable book Effective Java (seriously, if you don't have a copy, go get one).

Anyway, we put a nice little exegesis on the link between the Finalizer Guardian and this rule into the language spec, so just go ahead and read it there.
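
Applied to the example from the top of this post, the synchronization trick might look like the sketch below. The `Guarded` class is hypothetical, and its finalizer body is empty apart from the lock; the point is only that the finalizer can synchronize on the object, so the object has to be treated as reachable whenever that lock is held. On newer JDKs `finalize()` is deprecated, hence the suppressed warnings.

```java
import java.lang.ref.SoftReference;

class SyncKeepAliveSketch {
    // Hypothetical class whose finalizer synchronizes on the object itself.
    static class Guarded {
        @SuppressWarnings({"deprecation", "removal"}) // finalize() is deprecated on newer JDKs
        @Override
        protected void finalize() {
            synchronized (this) {
                // cleanup would go here
            }
        }
    }

    static Object f() {
        Guarded o = new Guarded();
        SoftReference<Object> ref = new SoftReference<Object>(o);
        synchronized (o) {
            // While this lock is held, o must be considered reachable.
            return ref.get();
        }
    }
}
```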

Having said all of that, I will agree that this is a massive pain. It would be nice if keeping objects reachable-and-not-collected were a little easier. Doug Lea has proposed a method in his new Fences API called reachabilityFence. This method would keep an object reachable (in a garbage collection sense) until execution reached that point in the code. I'm on the fence about the Fences API, but I think that this particular method (which he alternatively called keepAlive) would be a useful tool.
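
A method along these lines later shipped as `java.lang.ref.Reference.reachabilityFence` in Java 9. Assuming a Java 9 or newer runtime, the example from the top of this post could be written as in the sketch below; the class name is made up, and the try/finally shape follows the usage suggested in that method's Javadoc.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;

class ReachabilityFenceSketch {
    static Object f() {
        Object o = new Object();
        SoftReference<Object> ref = new SoftReference<Object>(o);
        try {
            return ref.get();
        } finally {
            // o stays strongly reachable at least until this call.
            Reference.reachabilityFence(o);
        }
    }
}
```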

Comments

Unknown said…
I didn't see anything in the JLS that prevents the compiler from optimizing away a finalizer guard that doesn't perform an explicit synchronization in its finalize method. Is that correct? If so, does Effective Java need a caveat? Since your previous post explained how most (all?) finalize() methods need to do some sort of synchronization in order to work properly, is this a non-issue?
Unknown said…
Maybe I am missing something fundamental here, but just so I am clear: is the example you gave specific to SoftReferences, or is it applicable to any class? For example:

Object f() {
    Object o = new Object();
    Container ref = new Container(o);
    return ref.returnObject();
}

Does the spec say that 'o' could potentially be garbage collected before this method returns? Or would this happen specifically for Soft/Weak references?
Jeremy Manson said…
@dhodge - correct, but in practice, this won't happen in existing VMs. I should mention it to Josh, though.

@Yousuf - If the container contains a reference to the object, it is considered reachable and therefore can't be garbage collected. If a SoftReference contains a reference to the object, that won't keep the object reachable.
Unknown said…
Great. Thanks.

Yousuf
