A little while ago, I got asked a question about when an object is allowed to be collected. It turns out that objects can be collected sooner than you think. In this entry, I'll talk a little about that.
When we were formulating the memory model, this question came up with finalizers. Finalizers run in separate threads (usually they run in a dedicated finalizer thread). As a result, we had to worry about memory model effects. The basic question we had to answer was, what writes are the finalizers guaranteed to see? (If that doesn't sound like an interesting question, you should either go read my blog entry on volatiles or admit to yourself that this is not a blog in which you have much interest).
Let's start with a mini-puzzler. A brief digression: I'm calling it a mini-puzzler because in general, for puzzlers, if you actually run them, you will get weird behavior. In this case, you probably won't see the weird behavior. But the weird behavior is perfectly legal Java behavior. That's the problem with multithreading and the memory model — you really never know what the results of a program will be from doing something as distasteful as actually running it.
Anyway, suppose you have a class that looks like this:
class FinalizableObject {
    int i;        // set in the constructor
    int j;        // set by the setter, below
    static int k; // set by direct access

    public FinalizableObject(int i) {
        this.i = i;
    }

    public void setJ(int j) {
        this.j = j;
    }

    public void finalize() {
        System.out.println(i + " " + j + " " + k);
    }
}
And then you use this class, thus:
void f() {
    FinalizableObject fo = new FinalizableObject(1);
    fo.setJ(2);
    FinalizableObject.k = 3;
}
Let's say that by some miracle the finalizer actually runs (Rule 1 of why you don't use finalizers: they are not guaranteed to run in a timely fashion, or, in fact, at all). What do you think the program is guaranteed to print?
Those of you who are used to reading these entries will realize immediately that unless they actually already know the answer, they have no idea. Let's try to reason it out, then.
First, we notice that the object reference fo is live on the stack when all three variables are set. So, the object shouldn't get garbage collected, right? The finalizer should print out 1 2 3, yes?
Would I have asked if that were the answer?
It turns out that the VM, as usual, is going to play some tricks here. In the words of the JLS, optimizing transformations of a program can be designed that reduce the number of objects that are reachable to be less than those which would naively be considered reachable. What this means is that the VM is going to make your object garbage sooner than you think.
The VM can do a few things to effect this (yes, this is the correct spelling of effect). First, it can notice that the object is never used after the call to setJ, and null out the reference to fo immediately after that. It's reasonably clear that if the finalizer ran immediately after that, you would see 1 2 0.
That's not the end of it, though. The VM can notice that:
- This thread isn't using the value written by that write to j, and
- There is no evidence that synchronization will make this write visible to another thread.
The VM can then eliminate that write altogether. Woosh! You get 1 0 0.
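To make those two outcomes concrete, here is a small single-threaded simulation. It is only a sketch: it does not use a real finalizer (you can't reliably provoke one), so a hypothetical report() method stands in for finalize() and is called at the point where the VM would be allowed to treat the object as unreachable. The class and method names are made up for illustration.

```java
// Single-threaded simulation of the two early-collection outcomes.
// report() is a stand-in for finalize(); all names here are hypothetical.
class EarlyCollectionDemo {
    static class Tracked {
        int i;        // set in the constructor
        int j;        // set by the setter
        static int k; // set by direct access

        Tracked(int i) { this.i = i; }
        void setJ(int j) { this.j = j; }
        String report() { return i + " " + j + " " + k; }
    }

    // Outcome 1: the reference is dropped right after setJ, so the
    // "finalizer" runs before k is assigned.
    static String referenceDroppedAfterSetJ() {
        Tracked.k = 0;               // reset so each run starts clean
        Tracked fo = new Tracked(1);
        fo.setJ(2);
        String seen = fo.report();   // the finalizer could run here
        Tracked.k = 3;
        return seen;
    }

    // Outcome 2: the write to j is eliminated as well, modeled here by
    // simply not calling setJ.
    static String writeToJEliminated() {
        Tracked.k = 0;
        Tracked fo = new Tracked(1);
        String seen = fo.report();   // the finalizer could run here
        Tracked.k = 3;
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(referenceDroppedAfterSetJ()); // prints 1 2 0
        System.out.println(writeToJEliminated());        // prints 1 0 0
    }
}
```

The simulation only shows which values the finalizer would read at each point. Whether a real VM ever performs these transformations on a given run is a separate matter.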
At this point, you are probably expecting me to say that you can also get 0 0 0, because the programmer isn't actually using the write to i, either. As a matter of fact, I'm not going to say that. It turns out that the end of an object's constructor happens-before the execution of its finalize method. In practice, what this means is that any writes that occur in the constructor must be finished and visible to any reads of the same variable in the finalizer, just as if those variables were volatile. (This paragraph originally read incorrectly.)
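If the "just as if those variables were volatile" comparison is unfamiliar, the sketch below shows the ordinary volatile version of the same guarantee. It is an analogy only, not a description of how any VM implements the constructor-to-finalizer edge. A plain field is written before a volatile flag, and a reader that sees the flag set is guaranteed to see the earlier write. The names data, ready and run are hypothetical.

```java
// Analogy only: a volatile write/read pair creates a happens-before edge,
// much as the end of a constructor happens-before finalize().
// All names are hypothetical.
class VolatilePublicationDemo {
    static int data;               // plain, non-volatile field
    static volatile boolean ready; // volatile flag that publishes data

    static int run() {
        data = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {       // volatile read; spin until published
                Thread.yield();
            }
            seen[0] = data;        // guaranteed to see the write below
        });
        reader.start();

        data = 42;                 // plain write...
        ready = true;              // ...made visible by the volatile write

        try {
            reader.join();         // join() makes seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```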
The immediate question is, how does the programmer avoid this insanity? The answer is: don't use finalization!
Okay, that's not enough of an answer. Sometimes you need to use finalization. There's a hint several paragraphs up. The finalizer takes place in a separate thread. It turns out that what you need to do is — exactly what you would do to make the code thread-safe. Let's do that, and look at the code again.
class FinalizableObject {
    static final Object lockObject = new Object();
    int i;        // set in the constructor
    int j;        // set by the setter, below
    static int k; // set by direct access

    public FinalizableObject(int i) {
        this.i = i;
    }

    public void setJ(int j) {
        this.j = j;
    }

    public void finalize() {
        synchronized (lockObject) {
            System.out.println(i + " " + j + " " + k);
        }
    }
}
And then you use this class, thus:
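void f() {
    // Hold the same lock the finalizer takes, so it cannot run until
    // all three writes are done.
    synchronized (FinalizableObject.lockObject) {
        FinalizableObject fo = new FinalizableObject(1);
        fo.setJ(2);
        FinalizableObject.k = 3;
    }
}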
The finalizer is now guaranteed not to execute until all of the fields are set. When that sucker runs, you will see 1 2 3.
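Since you can't count on a finalizer to run on demand, here is a sketch of the same locking pattern with an ordinary thread standing in for the finalizer thread. The reader is started while the writer already holds the lock, so it cannot enter its own synchronized block until all three writes are done. The class name and the run method are hypothetical.

```java
// Sketch: an ordinary thread plays the role of the finalizer thread.
// Both sides synchronize on the same lock, as in the post's fix.
// Class and method names are hypothetical.
class LockedHandoffDemo {
    static final Object lockObject = new Object();
    static int i;
    static int j;
    static int k;

    static String run() {
        final String[] seen = new String[1];

        Thread reader = new Thread(() -> {
            synchronized (lockObject) { // blocks until the writer releases
                seen[0] = i + " " + j + " " + k;
            }
        });

        synchronized (lockObject) {
            reader.start();             // reader cannot take the lock yet
            i = 1;
            j = 2;
            k = 3;
        }                               // release publishes all three writes

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 1 2 3
    }
}
```

Note that both sides take the lock. Locking only in the writer, or only in the reader, would not give the same guarantee.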
Oddly, I've been writing for almost an hour and I haven't gotten to my coworker's question yet. In the interests of brevity, I'll make this a series. More later.
Comments
And are finalizers actually EVER used? (as in, used in new code)
Also, people still use finalizers when they open native resources; the finalizer will do emergency cleanup of, say, file descriptors. Hopefully, most people know well enough to avoid them - this post is just leading up to a discussion of how this stuff works with SoftReferences.
Actually that is exactly what I expected you to say. And not due to the unused write to "i", but because finalizers are run in a separate thread, and since field "i" is not volatile, it is possible for other threads to see an old value of i (which is 0). Am I wrong, and is it impossible to get 0 0 0 as output?
Is the bold word right here? I'd guess it should mean constructor from the context.
Or should there really be an order within the finalizer?
Apart from this a very interesting read, again. Thanks. :-)
The best reason not to write finalizers (in my opinion) is that they are not guaranteed to be run.
The VM can notice that you aren't actually using that write to j. It can then just eliminate it. Woosh! You get 1 0 0.
I can't see that.
What's eliminated? The write to j or the reference to fo?
If the write to j, does that mean that any unsynchronized write could be eliminated if it is not used in that thread?
If fo, does that mean that object variables could be written even when the object that should be synchronized no longer exists?
Of course, the value of j could be 0, because the VM has no need to propagate the value from one thread to another.
To achieve this effect, the JVM might be injecting code that writes to a volatile variable as the last statement in the constructor. The finalizer process will have to make sure to read this injected variable before the finalizer is run on this object's instance. And this injection should only happen if the object has a finalize() method. right ?
Do you imply that, as soon as a write happens inside a synchronized block, it can't be optimized away?
Doesn't that imply in turn that, even without a sync block in the finalizer, the sync block in f() is enough to guarantee "1 2", since you said that "1 0" was the result of the write being optimized away?
Or is the finalizer's sync still necessary to guarantee the "flush" of previous syncs?
synchronized (new Object()) {
    x = 1;
}
The system can determine that the lock on the new Object() will never be acquired by another thread, and remove the lock acquisition and release entirely.
The point is that you need both ends of the happens-before relationship to guarantee visibility - the reader needs to use synchronization, and the writer needs to use synchronization. I've written a number of other blog entries on this subject.