Another question from a viewer.  We have this code:
(For the actual question as it was worded, the short answer is, "yes")
The question was one I get a lot. He asked how one thread can safely read the initial write of v to x. That is to say, if we have:
How can we change this code to guarantee that the read in Thread 2 will see the value 5 for x (assuming that ref is not null)?
Let's back up for a minute, and talk about why, in the code as written, the read in Thread 2 will not see the value 5 for x. Under the Java memory model, for one thread to be guaranteed to see an update made to a variable by another thread, there must be a happens-before relationship between the update and the read. This means that sometime after the write, there must be an action that performs what we call a "release", and sometime before the read, there must be an action that performs a corresponding "acquire".
The easiest way to get a release and a corresponding acquire is to use locking on the same variable. In this case, we would do something like this:
There is a release when the lock is released at the end of Thread 1, and an acquire when the lock is subsequently acquired at the beginning of Thread 2. This provides the guarantee that the updates that happen-before the release will be seen by the reads that happen-after the acquire.
For the really wonky -- what could happen? Well, we have
which is really:
There's no dependency between the second and third lines there, so we could easily have:
And another thread could read globalRef and see the reference to the object before x is assigned the value 5.
Next post -- why this explanation is bad, bad, bad! Any guesses?
class C {
  private int x;
  public C(int v) {
    this.x = v;
  }
  public synchronized int get() {
    return this.x;
  }
  public synchronized void set(int v) {
    this.x = v;
  }
}
(For the actual question as it was worded, the short answer is, "yes")
The question was one I get a lot. He asked how one thread can safely read the initial write of v to x. That is to say, if we have:
Thread 1:
globalRef = new C(5);
Thread 2:
C ref = globalRef;
if (ref != null) {
  int r1 = ref.x;
}
How can we change this code to guarantee that the read in Thread 2 will see the value 5 for x (assuming that ref is not null)?
Let's back up for a minute, and talk about why, in the code as written, the read in Thread 2 will not see the value 5 for x. Under the Java memory model, for one thread to be guaranteed to see an update made to a variable by another thread, there must be a happens-before relationship between the update and the read. This means that sometime after the write, there must be an action that performs what we call a "release", and sometime before the read, there must be an action that performs a corresponding "acquire".
The easiest way to get a release and a corresponding acquire is to use locking on the same variable. In this case, we would do something like this:
Thread 1:
synchronized (C.class) {
  globalRef = new C(5);
}
Thread 2:
synchronized (C.class) {
  C ref = globalRef;
  if (ref != null) {
    int r1 = ref.x;
  }
}
There is a release when the lock is released at the end of Thread 1, and an acquire when the lock is subsequently acquired at the beginning of Thread 2. This provides the guarantee that the updates that happen-before the release will be seen by the reads that happen-after the acquire.
For the really wonky -- what could happen? Well, we have
globalRef = new C(5);
which is really:
localRef = <allocate>;
localRef.x  = 5;
globalRef = localRef;
There's no dependency between the second and third lines there, so we could easily have:
localRef = <allocate>;
globalRef = localRef;
localRef.x  = 5;
And another thread could read globalRef and see the reference to the object before x is assigned the value 5.
Next post -- why this explanation is bad, bad, bad! Any guesses?
Comments
I can certainly see some potential follow-up topics. For example, in Thread 2, we are not protecting ref.x with a lock on ref; this is bad, because it is protected with a lock on this in the getter and setter methods. So, the obvious fix is something like:
synchronized (C.class) {
C ref = globalRef;
synchronized (ref) {
if (ref != null) {
int r1 = ref.x;
}
}
}
That's not great either, because then you are holding two locks, which can be a little deadlock-prone. Like I say, better solutions exist. More for another day, I think.
C ref;
synchronized (C.class) {
ref = globalRef;
}
if (ref != null) {
synchronized (ref) {
int r1 = ref.x;
}
}
correct?
But in the real world you would not do ref.x, but ref.get(). And if I understand correctly, even in that case you would still need the "synchronized (C.class)" in both Thread 1 and Thread 2 to ensure the write to x in the constructor is visible to Thread 2. Thus:
synchronized (C.class) {
C ref = globalRef;
}
if (ref != null) {
int r1 = ref.get();
}
should be safe. No?
What do you think?
Naveed: That would not work. Marking a field volatile doesn't prevent all reorderings. Specifically, you can move a write to a non-volatile variable to before a write to a volatile variable. In this case, that would allow a compiler to move the assignment to globalRef to before the assignment to x.
Having said that, marking the field final would work, and is actually the best solution.
[globalRef is volatile]
Thread 1:
// Expanding globalRef = new C(5);
localRef = [alloc] ;
localRef.x = 5;
globalRef = localRef;
Thread 2:
if (globalRef != null) {
int r1 = globalRef.x;
}
Assume that Thread 2 sees a non-null globalRef. This means that Thread 1 has performed a write on it. So there must be a happen-before edge from globalRef write in Thread 1 to globalRef read in Thread 2, and under new JMM, compilers are not allowed to bring non-volatile reads/writes inside this edge. I used this reasoning to say declaring globalRef as volatile can also work. Comments?
I agree that marking the field as final is a good solution; the only caveat being that we have to remove 'this.x = v' in 'set' method :-)
I think the pattern of setting "this.field" in the constructor and then providing synchronized access to the is common, yet folks don't take special measure to avoid this problem. Why isn't all the code in the world broken? Or is it?
If, on the other hand, you just randomly write it out to a field and expect other threads to read it correctly, then you are wrong.
Having said this, it is well-known among people who work with large-scale multi-processor and multicore designs that a large amount of the code out there is broken, and the errors simply don't show up until you scale up.
Assume I have a class
class Server {
Map data;
public Server() {
data = createMap();
new CustomThread(this).start();
}
}
I understand that if Server is to be subclass, then we can say that Server is improperly constructed because the subclass might add more initialize to its construction and some overridden methods which in combination might break the CustomThread (which invokes those methods).
However, if Server is declare as final so that it's not overridable, then can we consider it as being properly constructed (assuming CustomThead uses the passed 'this' for its own use and doesn't assign it to anywhere)?
Besides, since Thread#start() is one of the happens-before rules, I suppose the newly started Thread can also see up-to-date value of the data Map. Is this true?
Thanks a lot!
Buu
This is the part I don't understand. Maybe because I don't understand what the rules of safe construction are.
To be specific, if the Server class is not overridable, I don't see any difference between these 2 approaches, regardless of what the CustomThread actually does with the 'this' reference passed it its constructor.
1.
final class Server {
Server() {
data = createMap();
new CustomThread(this).start();
}
}
...
new Server();
2.
final class Server {
Server() {
data = createMap();
}
void start() {
new CustomThread(this).start();
}
}
...
new Server().start()
Thanks!
However, in this case, the guarantee is provided by the happens-before relationship.
Umm, did I miss something in my skimming of the comments, or has no one suggested the obvious solution, which is to call set from the constructor, i.e.
public C(int v) {
set(v);
}
?
You have now solved the problem, which is ensuring that all reads and writes (even ones done in the constructor) use the same lock.
I have some doubt regarding safe publication. This is an example from Java Concurrency in Practice Listing 3.8:
public class SafeListener {
private final EventListener listener;
private SafeListener() {
listener = new EventListener() {
public void onEvent(Event e) {
doSomething(e);
}
};
}
public static SafeListener newInstance(EventSource source) {
SafeListener safe = new SafeListener();
source.registerListener(safe.listener);
return safe;
}
}
Here I understand that the object is properly constructed as "this" doens't escape during construction because we don't publish the "listener" field in the constructor.
Question 1:
But why is this constructor made private? I mean even if its public the object is still properly constructed and there's no risk of unsafe publication.
Question 2:
What is the guarantee in the factory method that the "safe" is published only after the constructor returns? I mean why
"SafeListener safe = new SafeListener();" happens before "source.registerListener(safe.listener);" in the factory method.
Q2: Final fields are not guaranteed to be properly constructed until the constructor has finished (regardless of when they are set). Once the constructor is finished, you can safely publish the reference to the object that was constructed and assume that it was safe. I don't have JCiP in front of me, so I don't know why he is publishing safe.listener instead of safe (which seems safer). You might ask Brian.
Like you said "Once the constructor is finished, you can safely publish the reference to the object that was constructed and assume that it was safe." What I wanted to know is that is the constructor in "SafeListener safe=new SafeListener()" finished before "source.registerListener(safe.listener);" so that "safe" is properly published?
Okay consider this as you have given in an example earlier,
//Expanding safe=new SafeListener()
localRef=[alloc]
localRef.listener=....
safe=localRef;
Here if the compiler reorders the last two statements like:
safe=localRef;
localRef.listener=....
and before the initialization of second statement is completed, it registers safe in "source.registerListener(safe);"(as you said that its safer to publish "safe") then isn't "safe" improperly published?