Wednesday, June 17, 2009

Volatile Arrays in Java

I get asked a lot about how the volatile keyword interacts with arrays, so it is probably worth a blog post on the subject.

Those of you who have read my posts on volatile (Volatile Fields and Synchronization, Volatile Does Not Mean Atomic and, most importantly, What Volatile Means in Java) will have a pretty good idea of what volatile means, but it is probably worth it to provide a reminder.

Basically, if you write to a volatile field, and then you have a later read that sees that write, then the actions that happened before that write are guaranteed to be ordered before and visible to the actions that happen after the read. In practice, what this means is that the compiler and the processor can't do any sneaky reordering to move actions that come before the write to after it, or actions that come after the write to before it. See my post on What Volatile Means in Java for more detail.

With that out of the way, let's go through some examples of what you can do with volatile arrays:
volatile int [] arr = new int[SIZE];

arr = arr;
int x = arr[0];
arr[0] = 1;
The first lesson to learn, which will guide us here, is that arr is a volatile reference to an array, not a reference to an array of volatile elements! As a result, the write to arr[0] is not a volatile write. If you write to arr[0], you will get none of the ordering or visibility guarantees that you want from a volatile write.
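As a sketch of this pitfall (the class and method names are mine, not from the post), here is what the two accesses look like from each side:

```java
// Illustrative only: arr is a volatile *reference*; its elements are not volatile.
class Broken {
    volatile int[] arr = new int[10];

    void writer() {
        // Plain (non-volatile) write to the element; only the read of
        // the reference arr is volatile here.
        arr[0] = 1;
    }

    int reader() {
        // Volatile read of the reference arr, then a plain read of element 0.
        return arr[0];
    }
}
```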

What examples are there of a volatile write in the code above? Well, both of the writes to arr — the self-assignment and the write of new int[SIZE] — are volatile writes, because they are writing to arr, not one of its elements.

That explains where the volatile writes are. Where are the volatile reads in our example? It turns out that each of the lines after the declaration contains a volatile read:
arr = arr;
This one is easy. The volatile read is on the right hand side of the assignment statement.
int x = arr[0];
This one is slightly more subtle. The volatile read is not the read of the array element. The right hand side of that assignment is a two step process. First, you read the array reference, then you read the 0th element of that array. The volatile read is the read of the array reference, not the read of the 0th element.
arr[0] = 1;
The previous example should give you a hint of where the volatile read is on this line. As in that example, the left-hand side is a two step process. First, you read the array reference, then you assign to the 0th element of that array. As odd as it seems, the read of the array reference is a volatile read.

The astute reader will notice that there is no actual way to get volatile write semantics by writing to an element of an array referenced by a volatile field. The easiest way to get volatile array semantics is to use the Atomic[Reference/Long/Integer]Array classes in java.util.concurrent.atomic, which provide volatile semantics for reads and writes of individual elements.
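As a minimal sketch of that alternative (the array size and index are arbitrary), each element of an AtomicIntegerArray gets volatile read/write semantics:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Each set() is a volatile write of that element, and each get() a
// volatile read, so element writes are published to other threads.
class VolatileElements {
    static final AtomicIntegerArray arr = new AtomicIntegerArray(10);

    static void writer() {
        arr.set(0, 1);      // volatile write of element 0
    }

    static int reader() {
        return arr.get(0);  // volatile read of element 0
    }
}
```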

(Why no AtomicFloatArray, AtomicShortArray, or AtomicDoubleArray? With APIs, you never ask "why not", you ask "why". Meanwhile, you already have 32- and 64-bit bit patterns, so the Float.floatToIntBits and Float.intBitsToFloat family of functions are your friends.)
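A sketch of that bit-pattern trick: a hypothetical AtomicFloatArray (this class does not exist in the JDK; the name is mine) built on AtomicIntegerArray and the Float conversion functions:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical wrapper: stores each float as its int bit pattern, so
// reads and writes of individual elements get volatile semantics.
class AtomicFloatArray {
    private final AtomicIntegerArray bits;

    AtomicFloatArray(int size) {
        bits = new AtomicIntegerArray(size);
    }

    void set(int i, float value) {
        bits.set(i, Float.floatToIntBits(value));  // volatile write
    }

    float get(int i) {
        return Float.intBitsToFloat(bits.get(i));  // volatile read
    }
}
```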

These classes are somewhat problematic, though. If nothing else, you are endlessly boxing and unboxing values, which may make access expensive. Ugh — I really do know better than this, really! As a result, there is more to this story.

You may have noticed that I did provide a way of getting a volatile write above with just arrays: by writing out a self-reference. I have been asked if that technique can be leveraged to provide volatile access to array elements. Here's what that would look like:
// I wouldn't do this if I were you.
volatile int [] arr = new int[SIZE];

arr[0] = 1;
arr = arr;
This definitely does provide the volatile write. However, what good does it do you? The virtue of a volatile write is that a corresponding read can detect that it happened, and do something based on the new value. For example, you can use a volatile flag to force one thread to loop indefinitely until another one sets the flag. In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable.

You can use sun.misc.Unsafe to emulate the behavior of a volatile array. But not everyone is working on a Sun VM, and Sun is trying to discourage the adoption of sun.misc.Unsafe, so I'm not going to put the code here.

Don't despair too much, though — Doug Lea and the clever folks involved in JSR-166 are working on better solutions for Java 7. More news as events warrant.

29 comments:

serega said...

>> you are endlessly boxing and
>> unboxing values, which may make
>> access expensive.
AtomicLongArray and AtomicIntegerArray store elements in a plain array of primitives, so there is no boxing and unboxing going on there.

Jeremy Manson said...

@serega - Jeremy == dumb

Karnok Dávid said...

I read once in Doug's papers about creating a simple class containing a volatile field with the desired type. You pre-initialize an array with its instances and now you can access the inner volatile fields just through one indirection.

Jeremy Manson said...

@Karnok - Yes, but that is probably less efficient than AtomicBlahArray.

Anonymous said...

Actually, the story goes deeper than that! The self-assignment arr = arr doesn't guarantee a volatile write. The VM is perfectly within its rights to remove this line as an optimisation, since it doesn't do anything from a program-semantics standpoint. When executed, the volatile write occurs, but the VM may decide to remove the instruction.

Jeremy Manson said...

@anonymous - Nope. A VM is free to remove the actual store operation, but it isn't free to remove the memory coherence operation. So you still get the volatile happens-before semantics.

When we were writing the semantics, we were very careful to distinguish a happens-before relationship's being established by seeing a particular value written, and a happens-before relationship's being established by seeing a particular write. They are always established by seeing a particular write. You can conceptualize this by assigning each write a unique global id (one early version of the semantics did this, but we realized it was overkill).

Sanjay said...

Jeremy, let's say I am using an array for a fixed number of objects to be accessed concurrently. Threads can read/store an element by providing an index into the array. If we have an additional volatile field, like a counter, this field could be incremented at the end of a write operation just after the exclusive lock is released (write operations are serialized through the lock). Reader threads will read this counter before accessing the array elements. This guarantees that the stored object references are visible to the reading thread. The only problem is that there could be a race while accessing the member variables of a stored object, since reads and writes can happen concurrently (reads do not use the lock). This could be addressed depending on the requirements: a read can detect that a concurrent write is happening (by using the counter as a version number) and retry. Another alternative is to make all references/fields within the stored object graph volatile. Do you think this solution is faster than a volatile array, especially if reads vastly outnumber writes (since reader threads are not using locks, no volatile write happens during a reader thread's execution)?

Sanjay said...

Small correction to my earlier post: [this field could be incremented at the end of a write operation just after the exclusive lock is released]. Actually, the field should be incremented before the lock is released.

Jeremy Manson said...

@sanjay - how does the reader know which value the counter should have?

Also, you can't use a volatile field for a counter, because the increments aren't atomic. You have to use an AtomicInteger.

Finally, if reads vastly outnumber writes, it might be worth your while to try the CopyOnWriteArrayList strategy - basically, you have a volatile reference to an immutable array. If you see the array, you see updated versions of all of the contents. If you need to mutate, you copy the entire contents of the array. I've used this with a great deal of success in a number of situations.
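A minimal sketch of that strategy (the class name is mine): a volatile reference to an array that is treated as immutable once published, with writers copying on every mutation:

```java
import java.util.Arrays;

// Readers do one volatile read of the reference and then read the
// contents freely; writers copy the array, mutate the copy, and
// publish it with a single volatile write.
class CopyOnWriteIntArray {
    private volatile int[] snapshot;

    CopyOnWriteIntArray(int size) {
        snapshot = new int[size];
    }

    int get(int i) {
        return snapshot[i];  // volatile read of the reference
    }

    synchronized void set(int i, int value) {
        int[] copy = Arrays.copyOf(snapshot, snapshot.length);
        copy[i] = value;
        snapshot = copy;     // volatile write publishes the new array
    }
}
```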

Sanjay said...

[Finally, if reads...]
Ok. Also, it gives a consistent view at any given time. My only concern was that the whole array of references is recycled for a single write.

[Also, you can't use a volatile..] What if I increment the counter after the exclusive lock (used by writes) is acquired? (Agreed that I was not clear on when the increment would happen.)

[how does the reader know ..]
I missed out some details. Here it is in more detail.
Say when a writer acquires the lock, it increments the counter by one. When it leaves (just before the lock is released), it increments the counter by one again. Assuming the counter is initialized to 0, an odd number means that a writer is updating, and an even number means that no update is going on.
A reader thread will first check the counter; if it is an odd number, it can just busy-wait until the counter is even.

Once a reader thread has access (even counter) and has read the data and made a copy, it will recheck the counter. If the value has changed, the reader will retry (even if the new value is even, since a writer thread might have arrived after the reader read the counter, updated the state, and left before the reader finished reading).
This approach may only be worthwhile if the data objects are managed by a framework (some in-memory database) which could internally use a similar technique.
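A sketch of the shape of this version-counter scheme (the class name is mine; as the rest of the thread discusses, making this airtight under the Java memory model is subtle, so treat it as an illustration of the protocol, not production code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Odd version = write in progress; readers retry if the version
// changed between the start and the end of their read.
class VersionedBox {
    private final AtomicInteger version = new AtomicInteger(0);
    private int data;

    synchronized void write(int value) {
        version.incrementAndGet();  // now odd: update in progress
        data = value;
        version.incrementAndGet();  // even again: update complete
    }

    int read() {
        while (true) {
            int v = version.get();
            if ((v & 1) != 0) {
                continue;           // writer active; busy-wait
            }
            int snapshot = data;
            if (version.get() == v) {
                return snapshot;    // version unchanged: consistent read
            }
        }
    }
}
```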

Jeremy Manson said...

@sanjay - First of all, you would be swamped by the overhead of acquiring and releasing all of those locks, as well as the spin loop.

More importantly, there is nothing preventing the counter from being incremented immediately after you check the value of the counter, but before you actually check the read.

To short-circuit this discussion a little, you are sort of gradually moving towards a pattern AtomicReferenceArrays support:

AtomicReferenceArray<Object> array = ...;

Object o, p;
do {
  o = array.get(i);
  // Do some stuff with o, produce
  // Object p as a result
} while (!array.compareAndSet(i, o, p));

This will update array[i] from o to p, but only when there haven't been any intervening writes to array[i].

Paolo Giarrusso said...

About the article:
When you do:
arr[0] = 1; arr = arr;, the volatile write cannot itself be detected, but it makes the write to arr[0] detectable to anybody reading arr, so reading arr[0] (in a loop in another thread, say) will work, obviously. So, what am I missing here?

@Manson,sanjay: about previous discussion:
1st, this scheme is used within the Linux kernel (include/linux/seqlock.h). The counter is updated within the write spinlock.
"More importantly, there is nothing preventing the counter from being incremented immediately after you check the value of the counter, but before you actually check the read."
It's not clear what you refer to. If you mean "before you perform the read", that will cause the reader to retry the read.
In this case, you have to retry the read one more time, which is safe. If reads outnumber writes, this will work. You wait to start the read until no write is in progress, and after reading the data (and copying it somewhere else) you recheck that the sequence counter didn't change.

A bigger problem is that you need the first write to the counter (the one that makes it odd) to be visible to readers before the other writes in the writer's critical section, and this implies another "acquire" operation (including a volatile read); on the read side, you also need barriers between the counter reads and the data reads.

Jeremy Manson said...

@paolo - the crucial point here is that there is no way to know whether you are reading the value of the array element before or after the volatile write has occurred. If the volatile write hasn't occurred, you still *might* see the correct value of the array element, but you wouldn't get the protection the volatile affords you.

This may not seem like a big problem, but once you add enough threads / processors / race conditions, it certainly can be. Remember that something that happens one time in a billion happens twice a second per core on modern processors.

I think I misinterpreted what Sanjay said; I was imagining a case like this:

writer:
shared_lock()
r1 = counter;
counter = r1 + 1;
// write
r2 = counter;
counter = r2 + 1;
shared_unlock()

reader:
do {
  r3 = counter;
} while (r3 % 2 != 0);
// read

This doesn't really make any sense, as you can plainly see. The seqlock is something that does make sense, and was, in fact, something we designed for (See the discussion of optimistic readers here).

Paolo Giarrusso said...

About volatile arrays: I now do understand the point, and it _is_ quite subtle.

@Sanjay, JManson:
About seqlocks: the read loop there is correct (even if it's missing in the mail you mention), what is missing is just the check after the read.

Anyway, to sum up, making that code work in Java is quite subtle (in short: don't do that). I'm referring here to:
"Read can detect if a concurrent write is happening ( by using the counter as a version number ) and retry."

Here there's an alternative suggestion that may or may not fit your usage case:
http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0570.html

I've read the mail, and part of the thread (skimming some sections), and I see mentions of the same problems... but I wouldn't say "we designed for this", since this solution can't be written cleanly (I read your sentence as "we discussed this usage when writing the model"). In fact, the thread goes on with rebuttals of the idiom:
http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0566.html

A question: what is the current JMM, is it "Bill's model" or is it CRF?

Paolo Giarrusso said...

Another (unrelated to this post) question, about JMM papers: this page links to a paper on your old home page, which is gone (http://www.cs.umd.edu/users/jmanson/java/journal.pdf). Is that paper available somewhere? Could you ask the author to fix the page?

Jeremy Manson said...

@Paolo - Yes, I expressed myself badly. Thank you for clarifying.

The current model is neither CRF nor Bill's then-current model, it is something Bill, Sarita Adve and I worked up later. It is still the case that the non-volatile writes can be moved around, though.

uthark said...

use AtomicIntegerArray or AtomicLongArray :)

The AtomicIntegerArray class implements an int array whose individual elements can be accessed with volatile semantics, via the class's get() and set() methods. Calling arr.set(x, y) from one thread will then guarantee that another thread calling arr.get(x) will read the value y (until another value is written to position x).

http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicIntegerArray.html
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicLongArray.html
http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/package-summary.html

Jeremy Manson said...

@uthark - the post does make mention of those classes.

bruno said...

volatile int[] array = new int[] {0};

public synchronized void set() {
  array[0] = 1;
}

public int get() {
  return array[0];
}

Is it possible for the "get" thread to read 0 after the "set" thread has finished? The unlock stores/writes the 1 into main memory (http://java.sun.com/docs/books/jvms/second_edition/html/Threads.doc.html#22253), but the volatile read of the array reference does not necessarily read/load the [0] int into the "get" thread's working memory. Or does it? Does a volatile read provide a "memory barrier" (http://www.javamex.com/tutorials/synchronization_volatile_java_5.shtml)?

bruno said...

Never mind my last question. I see now; it's more about the reordering of memory actions than about synchronizing actual copies of memory. There are only reads of the array reference itself, and "volatile" does not ensure the ordering of multiple reads without writes.

Would it work with an additional "array = array;" in set()?

Is there another way to define a memory barrier without locking? probably something like Thread.MemoryBarrier() in .NET.

Jeremy Manson said...

@bruno - array = array would provide a memory barrier, but you couldn't guarantee that the reader thread's code would occur after that assignment happened. Usually, if you use a volatile as flags, you have some latch-like behavior:

Thread 1:
data = o;
done = true;

Thread 2:
if (done)
// use data

But you can't check the value of array, because it will be the same regardless of whether you have done the assignment or not.

Annotating a field "volatile" provides a memory barrier without locking, as does using most of the utilities in java.util.concurrent.atomic.

Robert DiFalco said...

Don't know if you are still reading this, but I am assuming that the semantics for object fields are just like the semantics for an array?

In other words, if I have a class Person with a field #name, and #name is not volatile, then making a Person reference volatile will not guarantee visibility of writes to the name field. Is that right?

Jeremy Manson said...

@Robert - that's right. To get publication guarantees for #name, you need to write a reference to the Person object.
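A sketch of that publication idiom (class and field names are illustrative): name is a plain field, but writing the Person reference to a volatile field publishes it:

```java
// The write to p.name happens-before the volatile write to person,
// so a reader that sees the new Person also sees its name.
class Person {
    String name;  // deliberately not volatile
}

class Publisher {
    volatile Person person;

    void publish(String name) {
        Person p = new Person();
        p.name = name;  // plain write, ordered before the volatile write
        person = p;     // volatile write publishes the object
    }
}
```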

Robert DiFalco said...

It's beginning to sound like you should never NOT use volatile except for immutable classes or local variables.

Is there an extra cost with synchronized blocks if the field being accessed is volatile?

Jeremy Manson said...

@Robert - There is a cost associated with using volatile fields. It prevents common compiler optimizations, and requires hardware-level memory coherence operations (x86 only requires memory coherence operations for writes, so you see a lot of volatile-based data structures optimized for reads).

Trevor said...

Hi Jeremy,

First off, good article, it was an informative read.

I do have one comment though concerning writing a self reference to the array as a method of getting a volatile write.

I think that there is a use for such a construct in a situation where an array is being used as a cache. From my understanding of the Java memory model, there is no guarantee that thread A will ever see non-volatile or non-synchronized writes to a field by thread B. So if you want a non-synchronized cache that is backed by an array, you must write the self reference to ensure that updates are propagated (even if their timeliness is not critical).

Now in this situation it may be clearer to synchronize the read and writes, but there will probably be better performance using the volatile semantics.

Thoughts?

Jeremy Manson said...

That is true, although in practice, you will probably see most writes at some point.

Anonymous said...

Hi, you say "In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable."
But what if a new value was written? It would be actually detected.
The behaviour is the same using a volatile field.

What about this pattern (definitely not a best practice)?

volatile int [] arr = ...;

// T1 (writer)
arr[0] = 1;
arr = arr;

// T2 (reader)
int value = arr[0];

T2 executes a volatile read (arr), so the happens-before relationship guarantees that the element of the array could be considered "volatile".

Where am I wrong? Thanks for your time!

Jeremy Manson said...

@anonymous: If the reader sees the updated value, it has no idea whether the volatile write might have happened (i.e., it can read the value 1 after arr[0] = 1, but before arr = arr). This isn't a big problem with reading a scalar, but it would be a big problem if you were relying on the happens-before relationship for something.