Friday, May 11, 2007

Roach Motels and The Java Memory Model

I've been asked by a couple of people lately about what a compiler can do with code motion around synchronized blocks. The first person asked me about the following example, so let's run with it:

x = 1;
synchronized(o) {
z = z + 1;
}
y = 1;

I may have answered this already (at some point), but I feel free to repeat myself.

The JMM has what we call "roach motel" ordering. "Roach Motel" is Black Flag's name for their cockroach trap; in grad school, my apartment had lots and lots and lots of cockroaches, so I find it particularly salient (although I think it was probably Bill Pugh who came up with this association).

In the 1980s, the slogan for Black Flag Roach Motels was "Roaches check in, but they don't check out". The memory model is the same: accesses can be moved into synchronized blocks, but they can't be moved out. So, in this case, you can get:

x = 1;
synchronized(o) {
z = z + 1;
}
y = 1;

if we want to move the write to y early, this code transforms to:

x = 1;
synchronized(o) {
z = z + 1;
y = 1; // code motion
}

and transforms to:

x = 1;
synchronized(o) {
y = 1; // code motion
z = z + 1;
}

and can't be transformed any more, because the write to y has been moved into the
synchronized block, but can't move out.

Similarly, if we want to move the write to x down:

x = 1;
synchronized(o) {
z = z + 1;
}
y = 1;

transforms to:

synchronized(o) {
x = 1; // code motion
z = z + 1;
}
y = 1

transforms to:

synchronized(o) {
z = z + 1;
x = 1; // code motion
}
y = 1

.. but the write to x can't move any further.

At the end of these sorts of transformations, you can end up with something that looks like this:

synchronized(o){
y = 1;
z = z + 1;
x = 1;
}

... so the accesses to x, y and z can happen in any order, but once you move them into the synchronized block, you can't move them out again.

This is a tiny bit misleading, of course, because they can go out the direction they came in. But there is no difference between that and not having entered the synchronized block in the first place.

Now, if the compiler can remove the synchronized block altogether, then all the accesses can be moved about freely. This can happen, for example, if o is an object only accessed by a single thread -- the compiler will know that no other thread can lock on it, so it can feel free to remove the synchronization. That's a tale for another day, of course.

11 comments:

martti vh said...

Ok, this seems to be the right post to reply to about this. This Wikipedia article (see the last Java example) says that in Java 5 this is a working implementation of double-checked locking:

class Foo {
private Helper helper = null;
private boolean initialized = false;
public Helper getHelper() {
if (!initialized) {
synchronized(this) {
if (!initialized) {
helper = new Helper();
initialized = true;
}
}
}
return helper;
}

// other functions and members...
}

But can't the writes to 'helper' and 'initialized' be reordered in the synchronized block since 'initialized' is not volatile?

And how the hell can I keep the code formatting in my replies?! :)

Jeremy Manson said...

Gak! Thanks for pointing that out. I'll fix their wagons, but good. Grr..

(You can keep the formatting by using HTML. I tend to put   every place I want an indent)

Tushar Goel said...

Sorry to reply on old post.

But i think question asked by Merriti is right and i dint find the answer. Jeremy, What do you think about it?

Jeremy Manson said...

I don't understand the comment, Tushar. The Wikipedia article was fixed years ago. It still seems okay.

Tushar Goel said...

Sorry, i was confused. I dint check wikipedia page carefully. My bad. :(

Pushparaj said...

Does this apply even if x,y,z are declared volatile?
Another question: When memory is flushed from processor cache to main memory?
1. At each synchronized(object) statement entry and exit?
2. At each volatile read and write?
3. Does it flush full processor cache which includes data about all objects or just this object to which the volatile variable belongs or synchronized statement was applied

Jeremy Manson said...

@pushparaj

It doesn't apply broadly if x, y, and z are declared volatile.

All of your questions assume that there is some defined point where the memory has to be flushed. The semantics don't specify such a point. The trick is to understand the semantics.

The key thing to understand is the happens-before relationship. Reads generally have to return the last write that happened before them (with lots of caveats). Every action in a thread happens-before actions that happen later in the thread. Synchronization can also imply happens-before ordering. For example, a write to a volatile variable happens-before a read of the same volatile variable.

One way of implementing this is to flush everything to memory when there is a volatile write. Another way would be to figure out what you wrote since the last time you did a flush, and just flush that.

There are even ways in which there might be *no* flushes. If a volatile variable is confined to a single thread, it won't set up any happens-before relationships with other threads, so you just don't need the flushes at all.

That said, if you are looking for implementation strategies, then the JSR-133 cookbook document is useful.

Pushparaj said...

@Jermy

But it should flush all writes which are transitive in relation(happens-before) to write of volatile variable assuming some other thread reads this volatile variable.

For ex:
y is volatile but not x
x=10;
y=30;

another thread which reads y should see x as 10;

Jeremy Manson said...

Again, don't mix up happens-before with the notion of a flush. Consider this code:

Thread 1:
x = 10;
y = 30;

Thread 2:
r1 = y
// thread termination

The compiler could decide not even to execute that read. And then figure out that there are no reads after it for which you need to guarantee happens-before. And then decide that the write to y is thread-local, and not perform any flushing.

As long as the semantics are faithfully observed, there may be a mflush, there may not be a mflush.

That said, a volatile write followed by a volatile read of the same variable will produce a happens-before relationship, and provide the kinds of visibility guarantees you are asking about.

Pushparaj said...

Does sequential consistency is achieved by happens-before semantics alone or along some other constructs in JMM

Jeremy Manson said...

@pushparaj

Sequential consistency is only guaranteed for volatile field accesses. It is not guaranteed by happens-before semantics.