NB: I started writing this last August, and sent it to a friend of mine for thoughts. I just realized that he never got back to me on it, and I never posted it. I don't much feel like editing it now, so here it is in its original glory
Okay, I know. It is nearly impossible to read and understand the causality bits of the Java memory model. That's all that stuff about "Executions and Causality Requirements" on pages 568 – 570 of the Java Language Specification. Bill Pugh, Sarita Adve and I wrote it. Basically, in the last three years, there have been three kinds of reactions to it:
- ZOMGWTF!? (I'm a normal human being. Give me a break.)
- This is incomprehensible gibberish. (I'm a pretty smart cookie and I think that everything I can't understand in three seconds is worthless.)
- Okay, I think I get it, but I have some questions. (I'm a pretty smart cookie and I spent several days trying to understand it.)
I find that the last group usually pretty much understands it, with a few minor clarifications. They are also overwhelmingly graduate students, for what it's worth. They have found a few flaws in it, which is also nice.
(Wait, I hear you ask, nice? Alas, I must admit that I never thought the JMM would be bulletproof. It was no more likely to be bulletproof than version 1.0 of a major software release. The main goal was to create a memory model which described the problem space completely, was practical for a real programming language and was rigorous (so that it could be completely understood with sufficient effort). The fact that it was possible to find a few flaws in specific cases (rather than by saying "this is a complete mess", which used to be the case) means that people could understand it and reason about it, and that therefore the mission was accomplished. No one had ever done that for a real language before, and I'm proud of the work we did.)
That still leaves us with the original problem, which is that it is really, really hard to understand. What I'm going to do in this post is to try to explain the key intuition behind it very briefly. If it still doesn't make sense, then tell me about it, and I'll try to clean it up.
(The prerequisite to understanding what I am about to say is an understanding of the happens-before order. If you understand that, and you understand that the lack of that in your program can lead to weird effects, then the rest of the post may interest you. If you don't understand that, this is not the post for you.)
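(As a quick refresher, here is a minimal sketch -- my own illustration, not code from the spec -- of how a volatile field establishes happens-before between a writer thread and a reader thread:

class HappensBeforeRefresher {
    int data = 0;                   // plain field: no ordering on its own
    volatile boolean ready = false; // volatile: the source of the happens-before edge

    void writer() {     // runs in Thread 1
        data = 42;      // (w1) ordinary write
        ready = true;   // (w2) volatile write
    }

    void reader() {     // runs in Thread 2
        if (ready) {    // (r) volatile read that sees (w2)
            // (w1) happens-before (w2) by program order, and (w2) happens-before (r)
            // because ready is volatile, so (w1) is visible here:
            int r1 = data;  // guaranteed to see 42 on this path
        }
    }
}

If ready were not volatile, the two accesses to data would form a data race, and the read of data could see 0.)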
There are two basic questions that we need to address to understand the memory model:
Why is the happens-before relationship not sufficient?
The main guarantee we want to give programmers is something called "Sequential Consistency for Data Race Free programs" (SC for DRF, for short). What this means, in brief, is that if your program is free from data races, then you won't see any weird effects. If you are still reading, you know that a data race is two accesses to the same variable, at least one of which is a write, that are not ordered by happens-before. Now, consider this example:
Initially:
0: x == y == 0
Thread 1:
1: r1 = x;
if (r1 == 1)
2: y = 1;
Thread 2:
3: r2 = y;
if (r2 == 1)
4: x = 1;
If all of this program's statements were to execute, there would be happens-before relationships between 0 and 1, 1 and 2, 0 and 3, 3 and 4. But there would be no happens-before relationships between Thread 1 and Thread 2.
Now, an intuitive model, built entirely around happens-before, basically says that if you are trying to figure out the value that can be seen by a read, you can either look at the last write to a given variable via happens-before, or a write that is in a data race with the read. That would mean that the read at 3 could see the write at 2, and the read at 1 could see the write at 4, because those writes are in a data race with those reads.
We can now say that the read of x at 1 sees 1, and the read of y at 3 sees 1, and have that be legal under the happens-before model.
Do you see what happened there? The read of x at 1 saw the value 1, so Thread 1 wrote 1 to y at 2. The read of y at 3 saw that value, so Thread 2 wrote 1 to x at 4. And that write at 4 is what justified the read of x at 1 seeing the value 1 in the first place: the justification is circular, and the value appears out of thin air.
We can't allow this, because it violates the SC for DRF property. A program is data race free if, under all SC executions -- those are the executions without any weird reorderings -- there are no data races. There are no data races under any SC execution of this program -- the reads will both return 0, so the writes won't occur, so there is no data shared between the two threads. Therefore, this program is data race free, and it has to have only SC executions. The execution I just described -- where the writes occur -- is not a SC execution.
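(If it helps to see the example as real code, here is one way to transcribe it into actual Java -- my own transcription, with plain, non-volatile fields:

class CausalityExample1 {
    int x = 0, y = 0;   // action 0: the initial writes

    void thread1() {
        int r1 = x;     // action 1
        if (r1 == 1)
            y = 1;      // action 2
    }

    void thread2() {
        int r2 = y;     // action 3
        if (r2 == 1)
            x = 1;      // action 4
    }
}

Run thread1 and thread2 concurrently, and the memory model forbids r1 == r2 == 1 for exactly the reason just given: every SC execution has both reads returning 0, so the program is data race free, so only SC behaviors are allowed.)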
Since the happens-before model allows this example, it doesn't enforce the SC for DRF property. We therefore had to make some additional guarantees.
Um... okay. What are the additional guarantees?
Now that we have decided that there are actions that are not allowed to happen -- in the above example, the writes are not allowed to occur -- the trick is to figure out when an action is allowed to occur, and when it is not.
Now, if you are writing code, and want to figure out if an action can occur, what do you do? Well, you look at all of the actions that came before it. So let's look at the actions that happen-before the action you want to justify, and determine whether those actions justify it.
In the above example, we look at instruction 2, and then look at the actions that happen-before it, which are 1 and 0. Based on 1 and 0 alone, the read at 1 can only see the value 0, so the guard r1 == 1 is false; 1 and 0 therefore don't justify 2 happening, so it is not allowed to happen.
Let's try this on a different example. This is a good one:
Initially:
0: x == y == 0
Thread 1:
1: r1 = x;
2: y = 1;
Thread 2:
3: r2 = y;
4: x = r2;
The interesting result here is r1 == r2 == 1, which is possible in a real execution if Thread 1's actions are reordered by the compiler. Basically, the write to y occurs, then Thread 2 reads the 1 from y and writes it to x, and then Thread 1 reads that value for x.
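Concretely, here is a sketch (mine, not from the spec) of one transformation the compiler may perform, since the two statements in Thread 1 are independent:

Thread 1, after reordering:
2: y = 1;
1: r1 = x;

One SC interleaving of the reordered program then produces the result: 2 (y = 1), then 3 (r2 = y, reading 1), then 4 (x = r2, writing 1), then 1 (r1 = x, reading 1).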
The first action we want to justify is the one labeled "2". If you look at the actions that happen-before 2 (again, 1 and 0), they will always lead to 2 being executed: the write to y is unconditional, so it occurs no matter what value the read at 1 sees. So 2 is allowed to happen.
That's well and good for action 2, but the interesting result requires that the value 1 be read from y in instruction 3. This isn't justifiable based on what happens-before 3 -- if we are only looking at what happens-before 3, the only value that can be read from y is 0.
So we need some way of justifying the read of y in 3 returning the value 1. It can only do this if something wrote the value 1 to y. But wait -- we've already said that it was justifiable to write 1 to y! So if, in addition to being able to justify actions based on the actions that happen-before them, we also say that it is possible to justify actions based on actions you have already justified, then we have now allowed our result!
So there is the intuition behind the model. When you want to justify the fact that a read returns the value of a write, you can do so if:
a) That write happened before it, or
b) You have already justified the write.
The way that the model works is that you start with an execution that has all of the actions justified by property a), and you start justifying actions iteratively. So, in the second example above, you create successive executions that justify 0, then 2, then 3, then 4, then 1.
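Spelled out for that example, the successive justifications look roughly like this (an informal sketch; the spec's actual machinery is the iterative "committing" of actions):

- Justify 0: the initial writes always occur.
- Justify 2: y = 1 is unconditional, so it occurs in every execution (rule a).
- Justify 3: r2 = y may now read 1 from the already-justified write 2 (rule b).
- Justify 4: given that 3 read 1, x = r2 writes 1.
- Justify 1: r1 = x may now read 1 from the already-justified write 4 (rule b).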
This definition has the lovely property of being reasonably intuitive and also guaranteeing SC for DRF programs.
That's the most complicated bit of the model (perhaps excluding the stuff about observable actions and infinite executions, but they are much less important).
I hope that helps.
Comments
I'm currently working on the data caches for a Java multiprocessor. We intend to use a fairly simple consistency model, which basically implements lazy release consistency. Memory accesses are never reordered on the issuing side in our implementation and our caches are write-through, but caches are not held consistent with main memory unless a monitorenter or a read from a volatile variable occurs. While it isn't hard to see that this follows the rules of happens-before (thread and finalizer issues put aside), I have no idea how to justify that the causality requirements of the JMM aren't violated. Any idea how to check if an implementation is compliant with this part of the JMM spec?
A simple rule of thumb for the causality requirements is that they are basically going to be met if you don't perform speculative writes. It sounds as if you don't, so you should be fine. Without knowing all the details, it is harder to get more concrete.
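To illustrate what a speculative write would look like (my own example, not code from the question above):

Original code:
    if (p != null)
        x = 42;

Speculative version:
    tmp = x;
    x = 42;         // write made visible before the guard is evaluated
    if (p == null)
        x = tmp;    // "undo" the write if the guard turns out to be false

The speculative version can make the value 42 visible to another thread in an execution where the original program never writes it; that is exactly the kind of behavior the causality rules exist to forbid, so an implementation that never does this sort of thing is on safe ground.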
I would really appreciate it if you could give me a hint.
In order to commit a given set C_i - C_{i-1}, you need to justify it with a complete execution of the program in which all of the actions in C_i take place. The set of all actions in the ith execution is A_i.
So, A_i is the complete ith execution, A_n is the final execution, and C_i is the part of A_i that is also in A_n.
Does that make sense?
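As a concrete, informal illustration using the second example from the post (action labels 0 through 4), the commit sets could grow like this, with each step justified by some complete execution A_i:

C_1 = {0}
C_2 = {0, 2}
C_3 = {0, 2, 3}
C_4 = {0, 2, 3, 4}
C_5 = {0, 2, 3, 4, 1}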
The root cause is that Java gives semantics to programs with data races; C++ does not. This leads to a key difference in the definition of which writes a read is allowed to observe. For Java, Sec. 3 of [Manson et al., 05] says that for a read $r$ to be allowed to observe a write $w$, $r$ cannot happen-before $w$.
For C++, Sec. 6 of [Boehm&Adve08] says that for $w$ to be a visible side effect for $r$, $w$ must happen before $r$. That is what does the trick!
Sec. 4 of [Manson et al., 05] shows that the happens-before memory model does not entail sequential consistency for data-race-free programs by exhibiting an example; this example, however, relies crucially on reads seeing the effects of writes without the happens-before relationship ensuring any order.
Does it make sense to you?
[Manson et al., 05], Jeremy Manson, William Pugh, Sarita V. Adve, The Java memory model, POPL '05
[Boehm&Adve08], Hans-J. Boehm, Sarita V. Adve, Foundations of the C++ concurrency memory model, PLDI '08
1. What does the "execution order" represent as it is used in the Java Language specs?
To be more precise:
Trace 17.5 has two possible such execution orders listed on the following page. Based on those, one can rule out the idea that the execution order must be consistent with program order (also, why are the numbers with colons in front of the actions out of order - 1, 3, 2, 4?). It is also not the "real life" execution order, since the second one has reads that see later writes, and, literally speaking, a read cannot possibly see a write before that write takes place, right?
My humble intuition tells me that a valid execution order for Trace 17.5 to finish with r1=1,r2=2 is:
B = 1;
A = 2;
r2 = A;
r1 = B;
No reorderings or "read later writes" required.
2. What do you mean by "happens-before order allows a read to see a later write"? From which perspective is it "later"? It is not actually executed later (quote: "bearing in mind that if a read sees a write that occurs later in the execution, it represents the fact that the write is actually performed early"), nor is happens-before defined for such a read-write pair occurring in two separate threads.
Similarly, the notion of "later write" is an informal one. This is why it is in the "discussion" section, rather than the "spec" section.
So does causality in the context of the JMM mean achieving perfect SC using happens-before? Or am I wrong?
Please help me understand the term "causality" as it is used in the JLS.
Is there a proof of that for the general case?
I can see that it works for the examples in this post, but it isn't obvious to me how the causality algorithm guarantees SC for DRF in the general case.
The proof is in Section 9.2.1, "Correctly synchronized programs exhibit only sequentially consistent behaviors," of the paper.