Wednesday, June 17, 2009

Volatile Arrays in Java

I get asked a lot about how the volatile keyword interacts with arrays, so it is probably worth a blog post on the subject.

Those of you who have read my posts on volatile (Volatile Fields and Synchronization, Volatile Does Not Mean Atomic and, most importantly, What Volatile Means in Java) will have a pretty good idea of what volatile means, but it is probably worth it to provide a reminder.

Basically, if you write to a volatile field, and then you have a later read that sees that write, then the actions that happened before that write are guaranteed to be ordered before and visible to the actions that happen after the read. In practice, what this means is that the compiler and the processor can't do any sneaky reordering to move actions that come before the write to after it, or actions that come after the write to before it. See my post on What Volatile Means in Java for more detail.

With that out of the way, let's go through some examples of what you can do with volatile arrays:
volatile int [] arr = new int[SIZE];

arr = arr;
int x = arr[0];
arr[0] = 1;
The first lesson to learn, which will guide us here, is that arr is a volatile reference to an array, not a reference to an array of volatile elements! As a result, the write to arr[0] is not a volatile write. If you write to arr[0], you will get none of the ordering or visibility guarantees that you want from a volatile write.

What examples are there of a volatile write in the code above? Well, both of the writes to arr — the self-assignment and the write of new int[SIZE] — are volatile writes, because they are writing to arr, not one of its elements.

That explains where the volatile writes are. Where are the volatile reads in our example? It turns out that each of the lines after the declaration contains a volatile read:
arr = arr
This one is easy. The volatile read is on the right hand side of the assignment statement.
int x = arr[0];
This one is slightly more subtle. The volatile read is not the read of the array element. The right hand side of that assignment is a two step process. First, you read the array reference, then you read the 0th element of that array. The volatile read is the read of the array reference, not the read of the 0th element.
arr[0] = 1;
The previous example should give you a hint of where the volatile read is on this line. As in that example, the left-hand side is a two step process. First, you read the array reference, then you assign to the 0th element of that array. As odd as it seems, the read of the array reference is a volatile read.

The astute reader will notice that there is no actual way to get volatile write semantics by writing to an element of an array referenced by a volatile field. The easiest way to get volatile array semantics is to use the Atomic[Reference/Long/Integer]Array classes in java.util.concurrent.atomic, which provide volatile semantics for reads and writes of individual elements.

(Why not Float/Short/Double Array? With APIs, you never ask "why not", you ask "why". Meanwhile, you have 32- and 64-bit bit patterns, so the Float.floatToIntBits and Float.intBitsToFloat family of functions are your friends.)

These classes are somewhat problematic, though. If nothing else, you are endlessly boxing and unboxing values, which may make access expensive. Ugh — I really do know better than this, really! As a result, there is more to this story.

You may have noticed that I did provide a way of getting a volatile write above with just arrays: by writing out a self-reference. I have been asked if that technique can be leveraged to provide volatile access to array elements. Here's what that would look like:
// I wouldn't do this if I were you.
volatile int [] arr = new int[SIZE];

arr[0] = 1;
arr = arr;
This definitely does provide the volatile write. However, what good does it do you? The virtue of a volatile write is that a corresponding read can detect that it happened, and do something based on the new value. For example, you can use a volatile flag to force one thread to loop indefinitely until another one sets the flag. In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable.

You can use sun.misc.Unsafe to emulate the behavior of a volatile array. But not everyone is working on a Sun VM. And they are trying to discourage the adoption of sun.misc.Unsafe, so I'm not going to put the code here.

Don't despair too much, though — Doug Lea and the clever folks involved in JSR-166 are working on better solutions for Java 7. More news as events warrant.

42 comments:

  1. >> you are endlessly boxing and
    >> unboxing values, which may make
    >> access expensive.
    AtomicLongArray and AtomicIntegerArray store elements in a plain array of primitives, so there is no boxing and unboxing going on there.

    ReplyDelete
  2. I read once in Doug's papers about creating a simple class containing a volatile field with the desired type. You pre-initialize an array with its instances and now you can access the inner volatile fields just through one indirection.

    ReplyDelete
  3. @Karnok - Yes, but that is probably less efficient than AtomicBlahArray.

    ReplyDelete
  4. Actually, the story goes deeper than that! The self assignment: arr=arr doesn't guarantee a volatile write. The VM is perfectly within rights to remove this line as an optimisation, since it doesn't do anything from a program semantics stand point. When executed, the volatile write occurs, but the VM may decide to remove the instruction.

    ReplyDelete
  5. @anonymous - Nope. A VM is free to remove the actual store operation, but it isn't free to remove the memory coherence operation. So you still get the volatile happens-before semantics.

    When we were writing the semantics, we were very careful to distinguish a happens-before relationship's being established by seeing a particular value written, and a happens-before relationship's being established by seeing a particular write. They are always established by seeing a particular write. You can conceptualize this by assigning each write a unique global id (one early version of the semantics did this, but we realized it was overkill).

    ReplyDelete
  6. Jeremy, lets say I am using an array for fixed number objects to be concurrently accessed. Threads can read/store an element by providing index to the array. If we have an additional voaltile field like a counter, this field could be incremented at the end of a write operation just after the execlusive lock is released ( writer operations are serialized through the lock ). Read threads will read this counter before accessing the array elements. This will gurantee that stored objects references are visible to the read thread. Only problem is that there could be race while accessing the member variables of the stored object since read and write can happen concurrently ( read does not use lock ). This could be addressed depending on the requirement. Read can detect if a concurrent write is happening ( by using the counter as a version number ) and retry. Another alternative is to have all referneces / fields within the stored objects graph are volatile. Do you think this solution is faster then volatile array especially if reads vastly outnumbers writes ( since read thread are not using locks and hence no volatile write happends doing read threads execution.

    ReplyDelete
  7. small correction to my earlier post, [this field could be incremented at the end of a write operation just after the execlusive lock is released]. Actually the field should be incremented before the lock is released.

    ReplyDelete
  8. @sanjay - how does the reader know which value the counter should have?

    Also, you can't use a volatile field for a counter, because the increments aren't atomic. You have to use an AtomicInteger.

    Finally, if reads vastly outnumber writes, it might be worth your while to try the CopyOnWriteArrayList strategy - basically, you have a volatile reference to an immutable array. If you see the array, you see updated versions of all of the contents. If you need to mutate, you copy the entire contents of the array. I've used this with a great deal of success in a number of situations.

    ReplyDelete
  9. [Finally, if reads...]
    Ok. Also it gives consistent view at any given time. My only concern was that the whole Array of ref, is recycled for single write.

    [Also, you can't use a volatile..]What if I increment the counter after the execusive lock ( used by writes ) is acquired. ( agreed that I was not clear on when the increment will happen ).

    [how does the reader know ..]
    I missed out some details. This is in more detail..
    Say when the a writer acquires a lock, it will increment counter by One. When it leaves ( just before the lock is released ), increments by one again. Assuming that the counter is initialized to 0, odd number means that a writer is updating and even number means that no update is going on.
    A reader thread,will first check the counter, if it is an odd number, it could just do busy wait till counter is even.

    Once a reader thread has access ( even number counter ), after reading the data and making a copy, it will recheck the counter. If the value has changed, then ( even if it is even number since a write thread might have come - after reader reads the counter, update the state and leave before reader is done reading ) reader will retry.
    This approach may be worth only if data objects are managed by a framework ( some in-memory-database ) which could internally use similar technique.

    ReplyDelete
  10. @sanjay - First of all, you would be swamped by the overhead of acquiring and releasing all of those locks, as well as the spin loop.

    More importantly, there is nothing preventing the counter from being incremented immediately after you check the value of the counter, but before you actually check the read.

    To short-circuit this discussion a little, you are sort of gradually moving towards a pattern AtomicReferenceArrays support:

    AtomicReferenceArray<Object> array = ...

    do {
      Object o = array.get(i);
      // Do some stuff with o, produce
      // Object p as a result
    } while (!array.compareAndSwap(i, o, p));

    This will update array[i] from o to p, but only when there haven't been any intervening writes to array[i].

    ReplyDelete
  11. About the article:
    When you do:
    arr[0] = 1; arr = arr;, the volatile write cannot be detected, but makes the write to arr[0] detectable to anybody reading arr, so reading arr[0] (in a loop in another thread, say) will work, obviously. So, what am I missing here?

    @Manson,sanjay: about previous discussion:
    1st, this scheme is used within the Linux kernel (include/linux/seqlock.h). The counter is updated within the write spinlock.
    "More importantly, there is nothing preventing the counter from being incremented immediately after you check the value of the counter, but before you actually check the read."
    It's not clear what you refer to. If you mean "before you perform the read", that will cause the reader to retry the read.
    In this case, you have to retry the read one more time, which is safe. If reads outnumber writes, this will work. You wait to start the read until no write is in progress, and after reading the data (and copying it somewhere else) you recheck that the sequence counter didn't change.

    A bigger problem is that you need that the first write to the counter (the one that makes it odd) is visible to readers before the other writes in the writer critical section, and this implies another "acquire" operation (including a volatile read); on the read side, you need also to have barriers between the counter reads and the data reads.

    ReplyDelete
  12. @paolo - the crucial point here is that there is no way to know whether you are reading the value of the array element before or after the volatile write has occurred. If the volatile write hasn't occurred, you still *might* see the correct value of the array element, but you wouldn't get the protection the volatile affords you.

    This may not seem like a big problem, but once you add enough threads / processors / race conditions, it certainly can be. Remember that something that happens one time in a billion happens twice a second per core on modern processors.

    I think I misinterpreted what Sanjay said; I was imagining a case like this:

    writer:
    shared_lock()
    r1 = counter;
    counter = r1 + 1;
    // write
    r2 = counter;
    counter = r2 + 1;
    shared_unlock()

    reader:
    do {
      r3 = counter;
    } while (r3 % 2 == 0);
    // read

    This doesn't really make any sense, as you can plainly see. The seqlock is something that does make sense, and was, in fact, something we designed for (See the discussion of optimistic readers here).

    ReplyDelete
  13. About volatile arrays: I now do understand the point, and it _is_ quite subtle.

    @Sanjay, JManson:
    About seqlocks: the read loop there is correct (even if it's missing in the mail you mention), what is missing is just the check after the read.

    Anyway, to sum up, making that code work in Java is quite subtle (shortly: don't do that). I'm referring here to:
    "Read can detect if a concurrent write is happening ( by using the counter as a version number ) and retry."

    Here there's an alternative suggestion that may or may not fit your usage case:
    http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0570.html

    I've read the mail, and part of the (skimming some sections) and I see mentions of the same problems... but I wouldn't say "we designed for this" since this solution can't be written cleanly (I read your sentence as "we discussed this usage when writing the model"). In fact, the thread goes on with rebuttals of the idiom:
    http://www.cs.umd.edu/~pugh/java/memoryModel/archive/0566.html

    A question: what is the current JMM, is it "Bill's model" or is it CRF?

    ReplyDelete
  14. Another (unrelated to this post) question, about JMM papers: this page links to a paper on your old home page, which is gone (http://www.cs.umd.edu/users/jmanson/java/journal.pdf). Is that paper available somewhere? Could you ask the author to fix the page?

    ReplyDelete
  15. @Paolo - Yes, I stated myself badly. Thank you for clarifying.

    The current model is neither CRF nor Bill's then-current model, it is something Bill, Sarita Adve and I worked up later. It is still the case that the non-volatile writes can be moved around, though.

    ReplyDelete
  16. use AtomicIntegerArray or AtomicLongArray :)

    The AtomicIntegerArray class implements an int array whose individual fields can be accessed with volatile semantics, via the class's get() and set() methods. Calling arr.set(x, y) from one thread will then guarantee that another thread calling arr.get(x) will read the value y (until another value is read to position x).

    http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicIntegerArray.html http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/AtomicLongArray.html http://java.sun.com/j2se/1.5.0/docs/api/java/util/concurrent/atomic/package-summary.html

    ReplyDelete
  17. @uthark - the post does make mention of those classes.

    ReplyDelete
  18. volatile int[] array = new int[] {0};
    public synchronized void set() {
      array[0] = 1;
    }
    public int get() {
    return array[0];
    }

    Is it possible for the "get" thread to read 0 after the "set" thread has finished? The unlock stores/writes the 1 into main memory (http://java.sun.com/docs/books/jvms/second_edition/html/Threads.doc.html#22253), but the volatile read of the array reference does not necessarily read/load the [0] int into the "get" thread's working memory. Or does it? Does a volatile read provide a "memory barrier" (http://www.javamex.com/tutorials/synchronization_volatile_java_5.shtml)?

    ReplyDelete
  19. Don't bother about my last question. I see now, it's more about memory action reordering, than synchronizing actual copies of memory. There are only reads on the array reference itself and "volatile" does not ensure the order of multiple reads without writes.

    Would it work with an additional "array = array;" in set()?

    Is there another way to define a memory barrier without locking? probably something like Thread.MemoryBarrier() in .NET.

    ReplyDelete
  20. @bruno - array = array would provide a memory barrier, but you couldn't guarantee that the reader thread's code would occur after that assignment happened. Usually, if you use a volatile as flags, you have some latch-like behavior:

    Thread 1:
    data = o;
    done = true;

    Thread 2:
    if (done)
    // use data

    But you can't check the value of array, because it will be the same regardless of whether you have done the assignment or not.

    Annotating a field "volatile" provides a memory barrier without locking, as does using most of the utilities in java.util.concurrent.atomic.

    ReplyDelete
  21. Don't know if you are still reading this but I am assuming that the semantics for object fields are just like the semantics for an array?

    In other words, if I have a class Persom with a field #name that if #name is not volatile then having Person instances volatile will not garauntee visibility of writes to the name field. Is that right?

    ReplyDelete
  22. @Robert - that's right. To get publication guarantees for #name, you need to write a reference to the Person object.

    ReplyDelete
  23. It's beginning to sound like you should never NOT use volatile except for immutable classes or local variables.

    Is there an extra cost with synchronized blocks if the field being accessed is volatile?

    ReplyDelete
  24. @Robert - There is a cost associated with using volatile fields. It prevents common compiler optimizations, and requires hardware-level memory coherence operations (x86 only requires memory coherence operations for writes, so you see a lot of volatile-based data structures optimized for reads).

    ReplyDelete
  25. Hi Jeremy,

    First off, good article, it was an informative read.

    I do have one comment though concerning writing a self reference to the array as a method of getting a volatile write.

    I think that there is a use for such a construct in a situation where an array is being used as a cache. From my understanding of the Java memory model, there is no guarantee that thread A will ever see non-volatile or non-synchronized writes to a field by thread B. So if you want a non-synchronized cache that is backed by an array, you must write the self reference to ensure that updates are propagated (even if their timeliness is not critical).

    Now in this situation it may be clearer to synchronize the read and writes, but there will probably be better performance using the volatile semantics.

    Thoughts?

    ReplyDelete
  26. That is true, although in practice, you will probably see most writes at some point.

    ReplyDelete
  27. Hi, you say "In this case, you can't actually detect that another thread performed the write, because it is writing the same value to the variable."
    But what if a new value was written? It would be actually detected.
    The behaviour is the same using a volatile field.

    What about this pattern (definitely not a best practice)?

    volatile int [] arr = ...;

    // T1 (writer)
    arr[0] = 1;
    arr = arr;

    // T2 (reader)
    int value = arr[0];

    T2 executes a volatile read (arr), so the happens-before relationship guarantees that the element of the array could be considered "volatile".

    Where am I wrong? Thanks for your time!

    ReplyDelete
  28. @anonymous: If the reader sees the updated value, it has no idea whether the volatile write might have happened (i.e., it can read the value 1 after arr[0] = 1, but before arr = arr). This isn't a big problem with reading a scalar, but it would be a big problem if you were relying on the happens-before relationship for something.

    ReplyDelete
  29. Please tell me; is it volatile object array is possible with AtomicReferenceArray or not?

    ReplyDelete
  30. @~SS~ - This is answered in the post.

    ReplyDelete
  31. As per the explanation here thread1 should not be able to read the updated value from boolean[] array but that's not happening here i.e. thread1 reads the updated value and stops.

    Am I missing something here?

    class ThreadHolder {
    boolean[] flag = new boolean[]{true};

    class Thread1 implements Runnable {
    public void run() {
    int x = 0;
    System.out.println("Thread t1 starts");
    while (flag[0]) {
    x++; // do some work
    }
    System.out.println("Thread t1 done");
    }
    }

    class Thread2 implements Runnable {
    public void run() {
    System.out.println("Thread t2 starts");
    flag[0]=false;
    System.out.println("Thread t2 done");
    }
    }
    }

    ReplyDelete
  32. @Sachin: Yes. The absence of a guarantee does not mean the guarantee of absence.

    To put it another way, just because you aren't guaranteed to see the updated value, that doesn't mean you won't. It just means you there's no guarantee. You may or may not see it.

    ReplyDelete
  33. Hello Jeremy,

    thanks for your post!

    So if we use arr = arr, we are guaranteed to see the value that was written, but we might not get the happens-before (and see stale fields of the written object). Is it correct?

    I'm going through "Concurrency in Practice" and I tried to implement Game of Life using multiple threads and CyclicBarrier. Since each thread modifies its own part of the array I don't want to lock the whole array. But then when we print each generation (using barrier action) we might see stale values. So your idea of arr = arr might be useful here.

    Regards,
    Konstantin.

    ReplyDelete
  34. No, volatile is volatile. The question is really: how do you know that you've executed the arr = arr statement and can rely on the volatile guarantee?

    More to the point, just use AtomicReferenceArray. Not sure why I didn't mention it here.

    ReplyDelete
  35. Nice post, thanks for sharing...

    ReplyDelete
  36. Hi Jeremy,
    I understand that Thread2 sees the change of the variable ready. But why does it also see the change of variable answer, although it isnt volatile.

    ReplyDelete
  37. @anonymous Because all writes of normal variables that happen before a write of a volatile are deemed also to have happened before a read of that value from that volatile, even if it is in another thread. The write and subsequent read create an ordering between the two. Those are the rules, and Java implementations have to find ways to follow them.

    ReplyDelete
  38. Writer thread:
    arr[0] = 1;
    arr = arr;

    Reader thread:
    int one = arr[0];

    Let’s assume that a read always goes after arr = arr; has been executed.
    arr = arr; happens before (happens before relationship) arr (reading arr reference) from the reader thread. Therefore arr[0] = 1; is also visible from the reader thread and one == 1. Am I right with my reasoning?

    ReplyDelete
  39. @anonymous: sure, but how can you tell whether the arr = arr line has executed? arr will have the same value either way.

    ReplyDelete
  40. volatile int[] array = new int[] {0};
    public synchronized void set() {
    array[0] = 1;
    }
    public int get() {
    return array[0];
    }

    Is it possible for the "get" thread to read 0 after the "set" thread has finished?
    I am trying to understand whether happens before relationship means just a flush to the main memory and reordering restrictions or it is something more complicated. Happens before relationship is described in the documentation but I was not able to find any information about the main memory reads and flushes in the documentation. If it is impossible to read 0 then is it correct to think that everything before unlock flushes to the main memory and everything after lock is read from the main memory and in a similar fashion with volatile.

    ReplyDelete
  41. Common mistake - there is no volatile write in set(). There's a volatile read of the array variable, followed by a normal write of 1 to element 0. So, there is no happens-before relationship between the set and the get.

    If you want arrays whose elements can be read and written atomically, use AtomicXArrays from java.util.concurrent.atomic.

    HB relationships are often implemented with memory barriers (the compiler also avoids reordering across those barriers). The spec describes the behavior the JVM needs to observe, it doesn't dictate implementation choice. Precisely how it is implemented depends on architecture, compiler, and various other factors. For example, if the JVM can detect that the variable is never read by another thread, then it can avoid memory barrier operations entirely.

    If you are interested in how processors might implement this, see https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html. That doc is about C++, but volatiles are basically C++ atomics with seq cst semantics.



    ReplyDelete