Loop Invariant Code Motion and the Java Memory Model

I gave a talk on the Java memory model this afternoon, and there was a question from the audience which I didn't answer to my own satisfaction, so I am answering it again, for my vast readership of none.

At issue was the following bit of code:

class Animator implements Runnable {
 private volatile boolean stop = false;
 public void stop() { stop = true; }
 public void run() {
  while (!stop) {
   oneStep();
   try { Thread.sleep(100); } …;
  }
 }
 private void oneStep() { /*...*/ }
}

This is a pretty common idiom. One thread calls run(), and another thread (eventually) calls stop(), which sets the stop flag to true, and halts the loop. This is fairly intuitive.

What is not intuitive about this code is that if the variable is not declared volatile, then regardless of whether another thread calls stop(), this code is not guaranteed to terminate. A compiler (in this case, likely Java's JIT compiler) can:
  1. Look at the run() method,
  2. Notice that stop is never updated, and therefore
  3. Hoist the read of stop outside the loop.
The resulting code would look something like this:

public void run() {
 if (!stop) {
  while (true) {
   oneStep();
   try { Thread.sleep(100); } …;
  }
 }
}

... which would likely run forever. But only if stop is not declared volatile.

Now, the questioner asked if I knew of any compilers that were likely to do this. Pretty much every modern compiler does loop invariant code motion -- basically, if it decides that some program instruction will evaluate the same on every loop iteration (as the read of stop does), then it will arrange the code so that program instruction is only executed once, before the loop.

What I said was that I didn't know whether any particular compiler did this. What I didn't mention was that, in this case, the compiler has a little extra work to do. It has to determine that the actions in the oneStep() method don't affect the value of stop, and it has to decide to apply loop invariant code motion to the loop guard (and there is no reason for it not to).

You are actually pretty likely to run into this sort of thing in practice. Whether you will on this particular example is a question for debate.
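To make the idiom concrete, here is a self-contained, runnable variant of the class above. The names and the step counter are mine, not from the original snippet, but the structure is the same. Because stop is volatile, the JIT is not permitted to hoist the read out of the loop, and the write in stop() is guaranteed to become visible to the running thread:

```java
// A runnable sketch of the idiom from the post (class and method
// names beyond stop/run/oneStep are my additions).
class StoppableAnimator implements Runnable {
    private volatile boolean stop = false;
    private int steps = 0;

    public void stop() { stop = true; }

    public void run() {
        // The volatile read of stop cannot be hoisted out of the loop.
        while (!stop) {
            oneStep();
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                return;
            }
        }
    }

    private void oneStep() { steps++; }

    public static void main(String[] args) throws InterruptedException {
        StoppableAnimator animator = new StoppableAnimator();
        Thread t = new Thread(animator);
        t.start();
        Thread.sleep(50);
        animator.stop();   // volatile write: guaranteed to be seen
        t.join(5000);      // the loop exits, so join returns promptly
        System.out.println("terminated: " + !t.isAlive());
    }
}
```

Remove the volatile keyword and this program is no longer guaranteed to terminate, for exactly the hoisting reason described above.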

I feel a little better having gotten that off my chest.

Comments

Felipe Coury said…
Jeremy,

One question - consider the following (changed) snippet:

public void run() {
 while (true) {
  if (stop) {
   return;
  }
  oneStep();
  try { Thread.sleep(100); } …;
 }
}

I agree that making stop volatile even in this situation wouldn't hurt, but I'd like to ask anyway: would this work even if stop is not volatile, or could this code also eventually get optimized by the compiler?
Jeremy Manson said…
This is the same code, more or less, so the exact same argument applies. The stop variable is loop-invariant, so it can be hoisted out of the loop.

The take away message here is that if you are going to communicate values between threads, you have to use some explicit form of communication, whether it be in the form of a synchronized block, a volatile field modifier, or something else entirely.
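Any of the forms mentioned above establishes the needed communication. As a sketch (the class layout here is mine, not from the post), the same flag could be published with a synchronized accessor pair or with an AtomicBoolean, whose get and set have volatile semantics:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Two other standard ways to publish a stop flag between threads
// (this class is my illustration, not code from the post).
class StopFlags {
    // Option 1: guard the flag with the object's lock. The paired
    // synchronized blocks establish the happens-before relationship.
    private boolean stopLocked = false;
    synchronized void requestStopLocked() { stopLocked = true; }
    synchronized boolean stopRequestedLocked() { return stopLocked; }

    // Option 2: an AtomicBoolean. Its get() and set() behave like
    // volatile reads and writes.
    private final AtomicBoolean stopAtomic = new AtomicBoolean(false);
    void requestStopAtomic() { stopAtomic.set(true); }
    boolean stopRequestedAtomic() { return stopAtomic.get(); }
}
```

Either form prevents the compiler from treating the flag read as loop-invariant in the polling thread.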
pveentjer said…
Hi Jeremy,

Besides the reordering problem, the code is also subject to visibility problems (when a normal variable is used instead of a volatile one). The 'victim' thread need not see the most recently written value, and therefore could keep seeing the initial value, which effectively makes the loop infinite.

ps: I add this comment to make this post more complete, I know you understand the JMM.
Jeremy Manson said…
Peter -- this is absolutely true. The real problem here from the point of view of Java semantics is the lack of a happens-before relationship, which I am sure is defined in another blog entry somewhere.
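For readers who haven't seen it spelled out: a volatile write, followed in another thread by a volatile read of the same field that observes that write, creates exactly such a happens-before edge, and it also makes earlier plain writes visible. A minimal sketch (my own example, not from the comment):

```java
// A volatile write happens-before a volatile read of the same field
// that observes it. So once reader() sees ready == true, it is also
// guaranteed to see data == 42 (example is my illustration).
class HappensBefore {
    int data = 0;                    // plain (non-volatile) field
    volatile boolean ready = false;  // publication flag

    void writer() {
        data = 42;     // (1) plain write
        ready = true;  // (2) volatile write: publishes (1)
    }

    Integer reader() {
        if (ready) {       // (3) volatile read observing (2)
            return data;   // (4) guaranteed to be 42
        }
        return null;       // the flag is not visible yet
    }
}
```

Without the volatile modifier on ready, step (4) could legally see 0 even after step (3) saw true.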
Anonymous said…
Hey Jeremy,

JMM ensures that code within a synchronized block will not be reordered but is it true for code within a try block also?
pveentjer said…
If instructions can't cause exceptions like a simple assignment, I don't see the problem of moving these instructions from above to inside and vice versa, as long it doesn't violate the within-thread as-if-serial semantics of the code.
Jeremy Manson said…
Mostly what Peter said. If you have:

class Foo {
 Object o = new Object();
 void Thread1() {
  o = null;
 }
 void Thread2() {
  if (o == null) return;
  try {
   int x = o.hashCode();
  } catch (NullPointerException e) {
  }
 }
}

You still have to catch the NPE if Thread1 sets o to null after the if statement executes.
Anonymous said…
Thank you guys.
Anonymous said…
So what this means is

try {
 int a = 0;
 int b = 1;
} catch (Exception e) {
}

The compiler is free to reorder the above.

There are chances that one thread might see the above as
int a = 0; int b = 1;

whereas another might see it as
int b = 1; int a = 0;
pveentjer said…
That is correct.

ps: if the variables are local and not instance variables, other threads are not able to see the variables. So your example is not going to lead to reordering/visibility problems.
Anonymous said…
Thanks. Had this doubt lingering in my mind for sometime.
Anonymous said…
Well said.
Unknown said…
Perhaps you've answered this in some other blog, but I'm trying to figure out what barriers are involved with volatile loads/stores:

T0:
store(volatile_var, 1)
load(volatile_var)
.....
....
load(volatile_var)

T1:
store(volatile_var, 30)

Assuming the interleaving goes as follows: T1's store happens before T0's last load. I would assume that either the volatile variable is not stored in WB memory, or, if it is, something invalidates it before the load; otherwise it will get 1, no? The JSR cookbook seems to imply we would use a LoadLoad barrier before the last volatile load, which IIRC is lfence on x86, and that is a no-op for WB memory? So volatile is never placed in WB memory?
Jeremy Manson said…
@bank kus - LoadLoad barriers on x86 are a no-op because the memory consistency guarantees of the architecture make them unnecessary, not because that level of consistency is unavailable. Specifically, every two instructions on x86 behave as if there is an intervening LoadLoad barrier. The JSR-133 cookbook is right - basically, no explicit coherence operations are needed for volatile reads, and a StoreLoad is needed after volatile writes.
