Wednesday, June 17, 2009

Volatile Arrays in Java

I get asked a lot about how the volatile keyword interacts with arrays, so it is probably worth a blog post on the subject.

Those of you who have read my posts on volatile (Volatile Fields and Synchronization, Volatile Does Not Mean Atomic and, most importantly, What Volatile Means in Java) will have a pretty good idea of what volatile means, but it is probably worth it to provide a reminder.

Basically, if you write to a volatile field, and then you have a later read that sees that write, then the actions that happened before that write are guaranteed to be ordered before and visible to the actions that happen after the read. In practice, what this means is that the compiler and the processor can't do any sneaky reordering to move actions that come before the write to after it, or actions that come after the write to before it. See my post on What Volatile Means in Java for more detail.
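A minimal sketch of this guarantee (the class and field names here are illustrative, not from any particular API):

```java
// If reader() observes ready == true, the JMM guarantees it also
// observes data == 42, because the write to data is ordered before
// the volatile write to ready.
class Publisher {
    int data;                // a plain, non-volatile field
    volatile boolean ready;  // the volatile field

    void writer() {
        data = 42;           // ordered before the volatile write below
        ready = true;        // volatile write
    }

    int reader() {
        if (ready) {         // volatile read; if it sees the write above...
            return data;     // ...this read is guaranteed to return 42
        }
        return -1;           // volatile write not yet observed
    }
}
```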

With that out of the way, let's go through some examples of what you can do with volatile arrays:
volatile int [] arr = new int[SIZE];

arr = arr;
int x = arr[0];
arr[0] = 1;
The first lesson to learn, which will guide us here, is that arr is a volatile reference to an array, not a reference to an array of volatile elements! As a result, the write to arr[0] is not a volatile write. If you write to arr[0], you will get none of the ordering or visibility guarantees that you want from a volatile write.

What examples are there of a volatile write in the code above? Well, both of the writes to arr — the self-assignment and the write of new int[SIZE] — are volatile writes, because they are writing to arr, not one of its elements.

That explains where the volatile writes are. Where are the volatile reads in our example? It turns out that each of the lines after the declaration contains a volatile read:
arr = arr;
This one is easy. The volatile read is on the right hand side of the assignment statement.
int x = arr[0];
This one is slightly more subtle. The volatile read is not the read of the array element. The right hand side of that assignment is a two step process. First, you read the array reference, then you read the 0th element of that array. The volatile read is the read of the array reference, not the read of the 0th element.
arr[0] = 1;
The previous example should give you a hint of where the volatile read is on this line. As in that example, the left-hand side is a two step process. First, you read the array reference, then you assign to the 0th element of that array. As odd as it seems, the read of the array reference is a volatile read.

The astute reader will notice that there is no actual way to get volatile write semantics by writing to an element of an array referenced by a volatile field. The easiest way to get volatile array semantics is to use the Atomic[Reference/Long/Integer]Array classes in java.util.concurrent.atomic, which provide volatile semantics for reads and writes of individual elements.
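For example, with an AtomicIntegerArray, both the element write and the element read get the volatile semantics that a plain volatile array lacks (a sketch; the class name and size are mine):

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

class AtomicArrayDemo {
    static int demo() {
        AtomicIntegerArray arr = new AtomicIntegerArray(16); // size is illustrative
        arr.set(0, 1);       // a true volatile write of element 0
        return arr.get(0);   // a true volatile read of element 0
    }
}
```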

(Why not Float/Short/Double Array? With APIs, you never ask "why not", you ask "why". Meanwhile, you have 32- and 64-bit bit patterns, so the Float.floatToIntBits and Float.intBitsToFloat family of functions are your friends.)
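That bit-pattern trick might be sketched like this — a hypothetical wrapper (class and method names are mine) that stores floats in an AtomicIntegerArray:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical wrapper: floats stored as int bit patterns, so that
// element reads and writes get volatile semantics.
class AtomicFloatArraySketch {
    private final AtomicIntegerArray bits;

    AtomicFloatArraySketch(int size) {
        bits = new AtomicIntegerArray(size);
    }

    void set(int i, float value) {
        bits.set(i, Float.floatToIntBits(value)); // volatile write
    }

    float get(int i) {
        return Float.intBitsToFloat(bits.get(i)); // volatile read
    }
}
```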

These classes are somewhat problematic, though. If nothing else, you are endlessly boxing and unboxing values, which may make access expensive. Ugh — I really do know better than this, really! As a result, there is more to this story.

You may have noticed that I did provide a way of getting a volatile write above with just arrays: by writing out a self-reference. I have been asked if that technique can be leveraged to provide volatile access to array elements. Here's what that would look like:
// I wouldn't do this if I were you.
volatile int [] arr = new int[SIZE];

arr[0] = 1;
arr = arr;
This definitely does provide the volatile write. However, what good does it do you? The virtue of a volatile write is that a corresponding read can detect that it happened, and do something based on the new value. For example, you can use a volatile flag to force one thread to loop indefinitely until another one sets the flag. In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable.
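The flag idiom mentioned above looks something like this (a sketch; the names are illustrative):

```java
class SpinFlag {
    volatile boolean done = false;

    void await() {
        while (!done) { }   // without volatile, this loop might never exit
    }

    void finish() {
        done = true;        // volatile write, guaranteed visible to await()
    }
}
```

Here the reader can detect the write, because the flag's value actually changes; that is exactly what the self-assignment arr = arr lacks.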

You can use sun.misc.Unsafe to emulate the behavior of a volatile array. But not everyone is working on a Sun VM, and Sun is trying to discourage the adoption of sun.misc.Unsafe, so I'm not going to put the code here.

Don't despair too much, though — Doug Lea and the clever folks involved in JSR-166 are working on better solutions for Java 7. More news as events warrant.

42 comments:

serega said...

>> you are endlessly boxing and
>> unboxing values, which may make
>> access expensive.
AtomicLongArray and AtomicIntegerArray store elements in a plain array of primitives, so there is no boxing and unboxing going on there.

Jeremy Manson said...

@serega - Jeremy == dumb

David Karnok said...

I read once in Doug's papers about creating a simple class containing a volatile field with the desired type. You pre-initialize an array with its instances and now you can access the inner volatile fields just through one indirection.

Jeremy Manson said...

@Karnok - Yes, but that is probably less efficient than AtomicBlahArray.

Anonymous said...

Actually, the story goes deeper than that! The self-assignment arr = arr doesn't guarantee a volatile write. The VM is perfectly within its rights to remove this line as an optimisation, since it doesn't do anything from a program-semantics standpoint. When executed, the volatile write occurs, but the VM may decide to remove the instruction.

Jeremy Manson said...

@anonymous - Nope. A VM is free to remove the actual store operation, but it isn't free to remove the memory coherence operation. So you still get the volatile happens-before semantics.

When we were writing the semantics, we were very careful to distinguish a happens-before relationship's being established by seeing a particular value written, and a happens-before relationship's being established by seeing a particular write. They are always established by seeing a particular write. You can conceptualize this by assigning each write a unique global id (one early version of the semantics did this, but we realized it was overkill).

Sanjay said...

Jeremy, let's say I am using an array of a fixed number of objects to be concurrently accessed. Threads can read/store an element by providing an index into the array. If we have an additional volatile field like a counter, this field could be incremented at the end of a write operation, just after the exclusive lock is released (writer operations are serialized through the lock). Reader threads will read this counter before accessing the array elements. This will guarantee that stored object references are visible to the reader thread. The only problem is that there could be a race while accessing the member variables of the stored object, since reads and writes can happen concurrently (reads do not use the lock). This could be addressed depending on the requirements. A read can detect whether a concurrent write is happening (by using the counter as a version number) and retry. Another alternative is to make all references/fields within the stored object graph volatile. Do you think this solution is faster than a volatile array, especially if reads vastly outnumber writes (since reader threads are not using locks, and hence no volatile write happens during a reader thread's execution)?

Sanjay said...

A small correction to my earlier post: [this field could be incremented at the end of a write operation, just after the exclusive lock is released]. Actually, the field should be incremented before the lock is released.

Jeremy Manson said...

@sanjay - how does the reader know which value the counter should have?

Also, you can't use a volatile field for a counter, because the increments aren't atomic. You have to use an AtomicInteger.

Finally, if reads vastly outnumber writes, it might be worth your while to try the CopyOnWriteArrayList strategy - basically, you have a volatile reference to an immutable array. If you see the array, you see updated versions of all of the contents. If you need to mutate, you copy the entire contents of the array. I've used this with a great deal of success in a number of situations.
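That strategy might be sketched like this (class and method names are mine; writers are serialized with synchronized, readers pay only one volatile read):

```java
import java.util.Arrays;

// Copy-on-write: a volatile reference to an array that is never
// mutated in place. Readers always see a fully published array.
class CopyOnWriteIntArray {
    private volatile int[] data = new int[0];

    int get(int i) {
        return data[i];      // one volatile read, then a plain element read
    }

    synchronized void set(int i, int value) {
        int[] copy = Arrays.copyOf(data, Math.max(data.length, i + 1));
        copy[i] = value;
        data = copy;         // volatile write publishes the new array
    }
}
```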

Sanjay said...

[Finally, if reads...]
OK. Also, it gives a consistent view at any given time. My only concern is that the whole array of references is recycled for a single write.

[Also, you can't use a volatile..] What if I increment the counter after the exclusive lock (used by writes) is acquired? (Agreed that I was not clear on when the increment would happen.)

[how does the reader know ..]
I missed out some details. This is in more detail..
Say that when a writer acquires the lock, it increments the counter by one. When it leaves (just before the lock is released), it increments it by one again. Assuming the counter is initialized to 0, an odd number means a writer is updating and an even number means no update is going on.
A reader thread will first check the counter; if it is an odd number, it can busy-wait until the counter is even.

Once a reader thread has access (an even counter), after reading the data and making a copy, it will recheck the counter. If the value has changed, the reader will retry (even if the new value is even, since a writer might have come in after the reader read the counter, updated the state, and left before the reader was done reading).
This approach may only be worthwhile if the data objects are managed by a framework (some in-memory database) that could internally use a similar technique.

Jeremy Manson said...

@sanjay - First of all, you would be swamped by the overhead of acquiring and releasing all of those locks, as well as the spin loop.

More importantly, there is nothing preventing the counter from being incremented immediately after you check the value of the counter, but before you actually check the read.

To short-circuit this discussion a little, you are sort of gradually moving towards a pattern AtomicReferenceArrays support:

AtomicReferenceArray<Object> array = ...

Object o, p;
do {
  o = array.get(i);
  // Do some stuff with o, produce
  // Object p as a result
} while (!array.compareAndSet(i, o, p));

This will update array[i] from o to p, but only when there haven't been any intervening writes to array[i].

Unknown said...

About the article:
When you do:
arr[0] = 1; arr = arr;, the volatile write cannot be detected, but it makes the write to arr[0] detectable to anybody reading arr, so reading arr[0] (in a loop in another thread, say) will work, obviously. So, what am I missing here?

@Manson,sanjay: about previous discussion:
1st, this scheme is used within the Linux kernel (include/linux/seqlock.h). The counter is updated within the write spinlock.
"More importantly, there is nothing preventing the counter from being incremented immediately after you check the value of the counter, but before you actually check the read."
It's not clear what you refer to. If you mean "before you perform the read", that will cause the reader to retry the read.
In this case, you have to retry the read one more time, which is safe. If reads outnumber writes, this will work. You wait to start the read until no write is in progress, and after reading the data (and copying it somewhere else) you recheck that the sequence counter didn't change.

A bigger problem is that you need that the first write to the counter (the one that makes it odd) is visible to readers before the other writes in the writer critical section, and this implies another "acquire" operation (including a volatile read); on the read side, you need also to have barriers between the counter reads and the data reads.

Jeremy Manson said...

@paolo - the crucial point here is that there is no way to know whether you are reading the value of the array element before or after the volatile write has occurred. If the volatile write hasn't occurred, you still *might* see the correct value of the array element, but you wouldn't get the protection the volatile affords you.

This may not seem like a big problem, but once you add enough threads / processors / race conditions, it certainly can be. Remember that something that happens one time in a billion happens twice a second per core on modern processors.

I think I misinterpreted what Sanjay said; I was imagining a case like this:

writer:
shared_lock()
r1 = counter;
counter = r1 + 1;
// write
r2 = counter;
counter = r2 + 1;
shared_unlock()

reader:
do {
  r3 = counter;
} while (r3 % 2 == 0);
// read

This doesn't really make any sense, as you can plainly see. The seqlock is something that does make sense, and was, in fact, something we designed for (See the discussion of optimistic readers here).

Unknown said...

About volatile arrays: I now do understand the point, and it _is_ quite subtle.

@Sanjay, JManson:
About seqlocks: the read loop there is correct (even if it's missing in the mail you mention), what is missing is just the check after the read.

Anyway, to sum up, making that code work in Java is quite subtle (shortly: don't do that). I'm referring here to:
"Read can detect if a concurrent write is happening ( by using the counter as a version number ) and retry."

Here there's an alternative suggestion that may or may not fit your usage case:
http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0570.html

I've read the mail, and part of the thread (skimming some sections), and I see mentions of the same problems... but I wouldn't say "we designed for this", since this solution can't be written cleanly (I read your sentence as "we discussed this usage when writing the model"). In fact, the thread goes on with rebuttals of the idiom:
http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0566.html

A question: what is the current JMM, is it "Bill's model" or is it CRF?

Unknown said...

Another (unrelated to this post) question, about JMM papers: this page links to a paper on your old home page, which is gone (http://www.cs.umd.edu/users/jmanson/java/journal.pdf). Is that paper available somewhere? Could you ask the author to fix the page?

Jeremy Manson said...

@Paolo - Yes, I stated myself badly. Thank you for clarifying.

The current model is neither CRF nor Bill's then-current model, it is something Bill, Sarita Adve and I worked up later. It is still the case that the non-volatile writes can be moved around, though.

Oleg said...

use AtomicIntegerArray or AtomicLongArray :)

The AtomicIntegerArray class implements an int array whose individual elements can be accessed with volatile semantics, via the class's get() and set() methods. Calling arr.set(x, y) from one thread will then guarantee that another thread calling arr.get(x) will read the value y (until another value is written to position x).

http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicIntegerArray.html http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicLongArray.html http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/package-summary.html

Jeremy Manson said...

@uthark - the post does make mention of those classes.

Anonymous said...

volatile int[] array = new int[] {0};
public synchronized void set() {
  array[0] = 1;
}
public int get() {
  return array[0];
}

Is it possible for the "get" thread to read 0 after the "set" thread has finished? The unlock stores/writes the 1 into main memory (http://java.sun.com/docs/books/jvms/second_edition/html/Threads.doc.html#22253), but the volatile read of the array reference does not necessarily read/load the [0] int into the "get" thread's working memory. Or does it? Does a volatile read provide a "memory barrier" (http://www.javamex.com/tutorials/synchronization_volatile_java_5.shtml)?

Anonymous said...

Don't bother about my last question. I see now, it's more about memory action reordering, than synchronizing actual copies of memory. There are only reads on the array reference itself and "volatile" does not ensure the order of multiple reads without writes.

Would it work with an additional "array = array;" in set()?

Is there another way to define a memory barrier without locking? probably something like Thread.MemoryBarrier() in .NET.

Jeremy Manson said...

@bruno - array = array would provide a memory barrier, but you couldn't guarantee that the reader thread's code would run after that assignment happened. Usually, if you use a volatile as a flag, you have some latch-like behavior:

Thread 1:
data = o;
done = true;

Thread 2:
if (done)
  // use data

But you can't check the value of array, because it will be the same regardless of whether you have done the assignment or not.

Annotating a field "volatile" provides a memory barrier without locking, as does using most of the utilities in java.util.concurrent.atomic.

Robert DiFalco said...

Don't know if you are still reading this, but I am assuming that the semantics for object fields are just like the semantics for an array?

In other words, if I have a class Person with a field #name, and #name is not volatile, then having Person references volatile will not guarantee visibility of writes to the name field. Is that right?

Jeremy Manson said...

@Robert - that's right. To get publication guarantees for #name, you need to write a reference to the Person object.

Robert DiFalco said...

It's beginning to sound like you should always use volatile, except for immutable classes or local variables.

Is there an extra cost with synchronized blocks if the field being accessed is volatile?

Jeremy Manson said...

@Robert - There is a cost associated with using volatile fields. It prevents common compiler optimizations, and requires hardware-level memory coherence operations (x86 only requires memory coherence operations for writes, so you see a lot of volatile-based data structures optimized for reads).

Unknown said...

Hi Jeremy,

First off, good article, it was an informative read.

I do have one comment though concerning writing a self reference to the array as a method of getting a volatile write.

I think that there is a use for such a construct in a situation where an array is being used as a cache. From my understanding of the Java memory model, there is no guarantee that thread A will ever see non-volatile or non-synchronized writes to a field by thread B. So if you want a non-synchronized cache that is backed by an array, you must write the self reference to ensure that updates are propagated (even if their timeliness is not critical).

Now in this situation it may be clearer to synchronize the read and writes, but there will probably be better performance using the volatile semantics.

Thoughts?

Jeremy Manson said...

That is true, although in practice, you will probably see most writes at some point.

Anonymous said...

Hi, you say "In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable."
But what if a new value was written? It would actually be detected.
The behaviour is the same using a volatile field.

What about this pattern (definitely not a best practice)?

volatile int [] arr = ...;

// T1 (writer)
arr[0] = 1;
arr = arr;

// T2 (reader)
int value = arr[0];

T2 executes a volatile read (arr), so the happens-before relationship guarantees that the element of the array could be considered "volatile".

Where am I wrong? Thanks for your time!

Jeremy Manson said...

@anonymous: If the reader sees the updated value, it has no idea whether the volatile write might have happened (i.e., it can read the value 1 after arr[0] = 1, but before arr = arr). This isn't a big problem with reading a scalar, but it would be a big problem if you were relying on the happens-before relationship for something.

~SS~ said...

Please tell me: is a volatile object array possible with AtomicReferenceArray or not?

Jeremy Manson said...

@~SS~ - This is answered in the post.

Unknown said...

As per the explanation here, thread1 should not be able to read the updated value from the boolean[] array, but that's not happening here, i.e., thread1 reads the updated value and stops.

Am I missing something here?

class ThreadHolder {
  boolean[] flag = new boolean[]{true};

  class Thread1 implements Runnable {
    public void run() {
      int x = 0;
      System.out.println("Thread t1 starts");
      while (flag[0]) {
        x++; // do some work
      }
      System.out.println("Thread t1 done");
    }
  }

  class Thread2 implements Runnable {
    public void run() {
      System.out.println("Thread t2 starts");
      flag[0] = false;
      System.out.println("Thread t2 done");
    }
  }
}

Jeremy Manson said...

@Sachin: Yes. The absence of a guarantee does not mean the guarantee of absence.

To put it another way, just because you aren't guaranteed to see the updated value, that doesn't mean you won't. It just means there's no guarantee. You may or may not see it.

damluar said...

Hello Jeremy,

thanks for your post!

So if we use arr = arr, we are guaranteed to see the value that was written, but we might not get the happens-before (and see stale fields of the written object). Is it correct?

I'm going through "Concurrency in Practice" and I tried to implement Game of Life using multiple threads and CyclicBarrier. Since each thread modifies its own part of the array I don't want to lock the whole array. But then when we print each generation (using barrier action) we might see stale values. So your idea of arr = arr might be useful here.

Regards,
Konstantin.

Jeremy Manson said...

No, volatile is volatile. The question is really: how do you know that you've executed the arr = arr statement and can rely on the volatile guarantee?

More to the point, just use AtomicReferenceArray. Not sure why I didn't mention it here.

Durante said...

Nice post, thanks for sharing...

Unknown said...

Hi Jeremy,
I understand that Thread2 sees the change of the variable ready. But why does it also see the change of the variable answer, although it isn't volatile?

Jeremy Manson said...

@anonymous Because all writes of normal variables that happen before a write of a volatile are deemed also to have happened before a read of that value from that volatile, even if it is in another thread. The write and subsequent read create an ordering between the two. Those are the rules, and Java implementations have to find ways to follow them.

Anonymous said...

Writer thread:
arr[0] = 1;
arr = arr;

Reader thread:
int one = arr[0];

Let's assume that the read always happens after arr = arr; has been executed.
Then arr = arr; happens-before the read of the arr reference in the reader thread. Therefore arr[0] = 1; is also visible to the reader thread, and one == 1. Am I right in my reasoning?

Jeremy Manson said...

@anonymous: sure, but how can you tell whether the arr = arr line has executed? arr will have the same value either way.

Anonymous said...

volatile int[] array = new int[] {0};
public synchronized void set() {
  array[0] = 1;
}
public int get() {
  return array[0];
}

Is it possible for the "get" thread to read 0 after the "set" thread has finished?
I am trying to understand whether a happens-before relationship means just a flush to main memory plus reordering restrictions, or something more complicated. The happens-before relationship is described in the documentation, but I was not able to find any information there about main-memory reads and flushes. If it is impossible to read 0, then is it correct to think that everything before an unlock is flushed to main memory and everything after a lock is read from main memory, and similarly with volatile?

Jeremy Manson said...

Common mistake - there is no volatile write in set(). There's a volatile read of the array variable, followed by a normal write of 1 to element 0. So, there is no happens-before relationship between the set and the get.

If you want arrays whose elements can be read and written atomically, use AtomicXArrays from java.util.concurrent.atomic.

HB relationships are often implemented with memory barriers (the compiler also avoids reordering across those barriers). The spec describes the behavior the JVM needs to observe; it doesn't dictate implementation choices. Precisely how it is implemented depends on the architecture, the compiler, and various other factors. For example, if the JVM can detect that the variable is never read by another thread, then it can avoid memory barrier operations entirely.

If you are interested in how processors might implement this, see https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html. That doc is about C++, but volatiles are basically C++ atomics with seq cst semantics.