Skip to main content

A Note on the Thread (Un)safety of Format Classes

I recently got a note on another blog post asking this question:

I have a general question on the thread safety and this is not directly related with your blog. I would appreciate if you could post it on your blog. I have a class that has only one static method that accepts two string parameters and does some operation locally (as shown below). Do I need to synchronize this method?

public class A {
  public static boolean getDate(String str, String str2){
    SimpleDateFormat formatter =
      new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
    boolean isBefore = false;
    Date date1;
    Date date2;
    try {
      date1 = formatter.parse(str);
      date2 = formatter.parse(str);
      isBefore = date1.after(date2);
    } catch (Exception e) {
      e.getMessage();
    }
    return isBefore;
  }
}


Since it had nothing to do with the blog post in question, I'll answer it here.

The questioner is clearly asking about the well-documented lack of thread safety of the Format classes. This is a long-standing bug at Sun. If you create a Format object (or a MessageFormat, NumberFormat, DecimalFormat, ChoiceFormat, DateFormat or SimpleDateFormat object), it cannot be shared among threads. The above code does not share its SimpleDateFormat object among threads, so it is safe.

If the formatter field had been a static field of the class, the method would not be thread-safe. However, this way, you have to create a (relatively) expensive SimpleDateFormat object every time you invoke the method. There are many ways around this. One answer (if you want to avoid locking) is to use a ThreadLocal:

private static final ThreadLocal formatters =
new ThreadLocal() {
@Override public SimpleDateFormat initialValue() {
return new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS");
}
};
public static boolean getDate(String str, String str2) {
SimpleDateFormat formatter = formatters.get();
...
The even better answer is to use the thread-safe Joda time libraries, instead. I hate to recommend using a non-standard library when there is an equivalent standard one, but this Format behavior is pretty broken.

Comments

CARFIELD said…
for just this example, the formatter created everytime this method executed. Are we really need to put it to threadlocal??

Of course, making formatter global and put in threadlocal run a lot faster.
afsina said…
That just bit my ass yesterday. i was making piece of the code working with multi threading, and i started to get sporadic "array Index out of bound exceptions" from Date formatter class. What a piece of turd IBM injected to the JDK.

Silly me, i thought i would be ok just making the calls synchronized. Now i switched to Joda time, problem solved.
i so wish that Jsr-310 can make it to JDk7 but my hopes are not so high.
How do we know that a class is not Thread safe? Is immutability the only test or are there any other key factors to look for (I am assuming that the class documentation is not mentioning anything about thread safety).
Jeremy Manson said…
@Sumanth - in this case, you read the JavaDoc carefully. Ideally, that would be enough. If the JavaDoc doesn't make it clear, you probably have to go with the usual suspects: testing, reading the code, asking someone else who might know, and so on.

Sorry I can't be more helpful. In an ideal world, everyone would document API thread-safety.
sander hautvast said…
jeremy, your example, though correct is incomlpete. Check out http://www.javaspecialists.eu/archive/Issue172.html . But I agree, joda-time is way simpler
Anonymous said…
I think the real issue is that Sun did not clearly explain that the Format classes were based on the Factory Pattern, so ARE mutable, thus NOT thread-safe!

I've been using ThreadLocal since I read THE Java Currency book, so get used to it!

A Professional Java Developer
Unknown said…
Will threadlocal code still prove to be better performing as compared to the original posted code if one thread invokes new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSS") only once in the lifetime. Also how to deal with threadlocal if instead of "yyyy-MM-dd HH:mm:ss.SSS" you want to pass it as a variable.
Jeremy Manson said…
@IIT: If you are only using SimpleDateFormat in a single thread, then constructing it in that thread is fine.

As for initializing with a variable, I can think of several solutions, but it really depends on your code.
Anonymous said…
Wouldn't it be much easier just to wrap SimpleDateFormat in a threadsafe enclosure:

new DateFormat() {
private SimpleDateFormat _df =..;
public synchronized String format(...) {
return _df.format(..);
}
..
}

If the cost of building/rebuilding is too high.. then maybe the cost of synchronizing isn't.

Of course you'll lose concurrency..
Jeremy Manson said…
@Anonymous - yes, that would be easier, but as you say, it is less concurrent. You really don't want to have a lock hidden away in your program that a lot of threads will be banging away on, if you can avoid it.
monn said…
For this particular case, isn't easier just to return str.compareTo(str2) < 0?
Jeremy Manson said…
@monn - That's not generally a great idea. What if someone else changed the date format slightly, so that that didn't work, but didn't notice the fact that you had used compareTo() to compare them?

When you have a typesafe way of doing something, relying on clever tricks isn't always the best way to go.
Unknown said…
This comment has been removed by the author.

Popular posts from this blog

Double Checked Locking

I still get a lot of questions about whether double-checked locking works in Java, and I should probably post something to clear it up. And I'll plug Josh Bloch's new book, too. Double Checked Locking is this idiom: // Broken -- Do Not Use! class Foo {   private Helper helper = null;   public Helper getHelper() {     if (helper == null) {       synchronized(this) {         if (helper == null) {           helper = new Helper();         }       }     }   return helper; } The point of this code is to avoid synchronization when the object has already been constructed. This code doesn't work in Java. The basic principle is that compiler transformations (this includes the JIT, which is the optimizer that the JVM uses) can change the code around so that the code in the Helper constructor occurs after the write to the helper variable. If it does this, then after the constructing thread writes to helper, but before it actually finishes constructing the object,

What Volatile Means in Java

Today, I'm going to talk about what volatile means in Java. I've sort-of covered this in other posts, such as my posting on the ++ operator , my post on double-checked locking and the like, but I've never really addressed it directly. First, you have to understand a little something about the Java memory model. I've struggled a bit over the years to explain it briefly and well. As of today, the best way I can think of to describe it is if you imagine it this way: Each thread in Java takes place in a separate memory space (this is clearly untrue, so bear with me on this one). You need to use special mechanisms to guarantee that communication happens between these threads, as you would on a message passing system. Memory writes that happen in one thread can "leak through" and be seen by another thread, but this is by no means guaranteed. Without explicit communication, you can't guarantee which writes get seen by other threads, or even the order in whic

Date-Race-Ful Lazy Initialization for Performance

I was asked a question about benign data races in Java this week, so I thought I would take the opportunity to discuss one of the (only) approved patterns for benign races. So, at the risk of encouraging bad behavior (don't use data races in your code!), I will discuss the canonical example of "benign races for performance improvement". Also, I'll put in another plug for Josh Bloch's new revision of Effective Java (lgt amazon) , which I continue to recommend. As a reminder, basically, a data race is when you have one (or more) writes, and potentially some reads; they are all to the same memory location; they can happen at the same time; and that there is nothing in the program to prevent it. This is different from a race condition , which is when you just don't know the order in which two actions are going to occur. I've put more discussion of what a data race actually is at the bottom of this post. A lot of people think that it is okay to have a data