Monday, August 18, 2008

Top 3 Reasons Why Constructors are Worthless

In my last post, I made an offhand reference to the fact that object constructors are worthless in Java. I was asked why this is, so I thought I'd fill in the details.

This topic is a little better understood by the developer community than some of the concurrency tidbits that I usually discuss. People who use dependency injection toolkits like Guice and Spring's inversion of control tend to have very strong feelings on the advanced suckitude of constructors. I know quite a few developers who claim that use of Guice changed the way they code (for the better).

ETA: I've had a few people tell me just to use SPI. The point of this post is not that DI is wonderful, it is that constructors are awful. SPI doesn't stop constructors from being awful. In fact, it is yet another horrible solution, what with all of those Foo and FooImpl classes. If you like SPI, go ahead and use it. Furthermore, neither DI nor SPI addresses two of the three issues I mentioned.

So, without further ado:
  1. They are the one kind of instance method that can't be overridden. This is the major pain point for most people using constructors. For testing purposes, programmers like to create mock objects, which duplicate the functionality of the objects you are not testing. They are usually created because the objects they represent aren't easy to construct properly in a unit test, either because they are non-deterministic, or require interaction with a client, or are simply too complex to create easily.

    One of the major difficulties of creating mock instances in Java is that the way we create objects is to call the constructor:
    GiantNetworkedDatabaseConnector connector = 
    new GiantNetworkedDatabaseConnector(); // connects to the giant networked database.

    You don't really want to connect to the giant networked database when you are unit testing your code. The typical approach is to abstract the constructor call into a factory method:
    GiantNetworkedDatabaseConnector connector = 
    GiantNetworkedDatabaseConnector.newConnector();
    The newConnector code returns an object. But the method is a static one, which means that you can't replace that at runtime, either. You actually need a static field to hold the factory that makes the connector:
    GiantNetworkedDatabaseConnector connector = 
    GiantNetworkedDatabaseConnector.getFactory().newConnector();
    and then you need to set the factory in some other part of the program:
    GiantNetworkedDatabaseConnector.setFactory(mockFactory);
    and then you need to hope and pray that there aren't multiple threads trying to inject their own factories.

    Getting around this is, of course, the stock-in-trade of the dependency injection frameworks mentioned above. Guice and Spring tend to do what they do very well. It is just completely awful that they have to do it.

    In short: the constructor is just like a static method. You can't override it properly. You can't test it properly. Constructors bring pain.

  2. All immutable state must be set up front. I spent the last few blog entries describing how and why immutable state is good. But all immutable state must be set in the constructor. The degree to which this is terrible doesn't necessarily occur to people who are used to programming in Java. You just have to stop, take a step back, and realize that having separate StringBuilder and String classes is violently counterintuitive.

    Why do these two classes exist? Because there are actually two parts to String creation:

    1. Generate the String, and

    2. Publish it.

    The StringBuilder class exists solely because there is no way to have a clean break between step 1 and step 2. You need a way to say "I will be constructing this object for a while, and then I will be done." For those familiar with Ruby, you are probably thinking of the "freeze" operation at this point, which makes any mutable object into an immutable one. That is closer to what I am imagining, but the hard part of that is enforcing the fact that you call it.

    This problem isn't limited to Strings, of course. A minute's hard thought will make you think of a dozen other possible uses. Code bases tend to have lots of other Builder classes; I probably write a couple a month.

    In short: immutable objects can't be constructed properly when you can't just shove all of the built state into the constructor. Constructors bring pain.

  3. They make custom serialization and deserialization more painful. I'm not sure that I can do this topic justice in a single blog post. For a good overview of why serialization is awful, go purchase and read Effective Java.

    A brief precis: when you do deserialization, you need an instance of the object so that you can deserialize into it. The way this works is that the no-argument constructor of the nearest non-serializable superclass is invoked. (ETA: I phrased this incorrectly at first, so I've edited it.) This means that a) that superclass needs an accessible no-arg constructor if a subclass might ever get serialized (which is hard to predict), and b) that constructor might get called.

    That's just for the default mechanism, of course. For anything that requires custom deserialization (i.e., anything non-trivial), you have to use the magic readObject method, which is not a constructor. That method, of course, doesn't have the semantics of constructors, so can't set final fields (as discussed in my post on deserialization and reflection).

    Serialization is, of course, in general, completely awful. It is no surprise that people turned to XML. At Google, we use a more compact system called protocol buffers, which are still rather hacktackular, but definitely better than serialization. Serialization is awful, and constructors make it worse. Constructors bring pain.
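
To make the default mechanism in point 3 concrete, here is a minimal sketch. The classes Base, Derived, and SerializationDemo are hypothetical and exist only for illustration. It round-trips an object through ObjectOutputStream and ObjectInputStream and counts constructor calls: deserialization runs the no-arg constructor of the non-serializable superclass, while the serializable class's own constructor is skipped.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes, for illustration only.
class Base {                                   // deliberately NOT Serializable
    static int baseCtorCalls = 0;
    Base() { baseCtorCalls++; }                // runs again during deserialization
}

class Derived extends Base implements Serializable {
    private static final long serialVersionUID = 1L;
    static int derivedCtorCalls = 0;
    final String name;                         // restored by the default mechanism

    Derived(String name) {                     // NOT run during deserialization
        derivedCtorCalls++;
        this.name = name;
    }
}

class SerializationDemo {
    /** Serializes d to a byte array and reads it back as a new object. */
    static Derived roundTrip(Derived d) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(d);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Derived) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

After a round trip, Base.baseCtorCalls has gone up by one and Derived.derivedCtorCalls has not changed, even though a second Derived instance now exists.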

At one level removed, a big part of the problem is that object construction and initialization are tied together. Imagine if you could construct an object with all of its fields set to null or zero, and then call as many initializer methods as you wanted:
  1. Client code could create the object and could override the initializer methods, thereby neatly avoiding the problem of not being able to override the initializer. Even better, you could specify the type you wanted created at runtime:
    GiantNetworkedDatabaseConnector buildConnector(
    Class<? extends GiantNetworkedDatabaseConnector> c) {
    // This is impossible with standard Java bytecode.
    return new c.init();
    }

    // MockGiantNetworkedDatabaseConnector overrides the init() method
    GiantNetworkedDatabaseConnector c =
    buildConnector(MockGiantNetworkedDatabaseConnector.class);
  2. You could call initializer methods, and have them set immutable state (possibly using the notion of freeze to make an object immutable):
    String s = new String.append("foo").append("bar");
    assert s.equals("foobar");
    freeze s;

  3. You wouldn't have to worry about serialization calling the no-argument constructor, because the no-argument constructor wouldn't do anything. readObject could have initializer semantics, and be able to set final fields.
The point isn't really the solution: this solution is broken because the object doesn't always maintain its invariants (until the init() methods are done being called). You can continue to brainstorm to fix this, and come up with as many straw man approaches as you want, but the fact remains that Java programmers are still going to be stuck with constructors.
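
For comparison, this is roughly what being stuck with constructors looks like for point 2 today: a Builder that accumulates state and then hands all of it to a constructor in one go. It is only a sketch; Route and its methods are made-up names, not taken from any real library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class, for illustration only.
final class Route {
    private final String name;
    private final List<String> stops;

    // All final state still has to arrive through the constructor.
    private Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy, so later changes to the builder cannot leak in.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String name() { return name; }
    List<String> stops() { return stops; }

    /** Step 1 ("generate") lives here, in a separate mutable class. */
    static final class Builder {
        private String name = "";
        private final List<String> stops = new ArrayList<>();

        Builder name(String n) { this.name = n; return this; }
        Builder addStop(String s) { stops.add(s); return this; }

        /** Step 2 ("publish"): copy everything into the immutable object. */
        Route build() { return new Route(name, stops); }
    }
}
```

Usage would look like `new Route.Builder().name("commute").addStop("home").addStop("office").build()`. The second class exists only to bridge the gap between building and publishing, which is the complaint in point 2.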

There isn't much of a moral to this story, other than that constructors bring pain.

34 comments:

Neil said...

There's no need to use a dependency injection framework to get around the static method on factory.

Dependency injection is simply good design. There's no reason why the database connection factory method can't be an instance method on an interface and an instance of that object can just be "passed in from the top" for the places where its needed.
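
A minimal sketch of this "passed in from the top" approach, without any framework. Connector, ConnectorFactory, and ReportService are hypothetical names chosen for illustration:

```java
// Hypothetical interfaces and classes, for illustration only.
interface Connector {
    String query(String sql);
}

interface ConnectorFactory {
    // An instance method, so a test can supply its own implementation.
    Connector newConnector();
}

class ReportService {
    private final ConnectorFactory connectors;

    // The factory is handed in by whoever builds the service.
    ReportService(ConnectorFactory connectors) {
        this.connectors = connectors;
    }

    String userCountReport() {
        Connector c = connectors.newConnector();
        return "users: " + c.query("SELECT COUNT(*) FROM users");
    }
}
```

A test can then write `new ReportService(() -> sql -> "42")` and never touch a real database. The cost is that every caller up the chain has to carry the factory along.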

Torsten Curdt said...

Sorry, totally disagree. Constructors and finals - love 'em! This all comes down to how you design the class. Opening a database connection in a constructor. Oh my! What you want to do to make such complex classes testable is to pass in an environment abstraction.

Casper Bang said...

Sigh, another blog post screaming "DI". If you have something to mock, just use a SPI approach (you write to an interface right?) supported directly in the language and avoid pulling in dependencies and polluting the code with unnecessary layers of indirection. Let the class path decide, it composes beautifully with the use of Ant or Maven.

Now what we really need are object initializers, a way to encapsulate instantiation and initialization without having to pollute setters with "return this". C# supports this beautifully and it composes very well.

tackline said...

Serialisable classes do not need a no-args constructor, and if they have one it is not called. What is necessary and is called is the most derived, non-Serializable superclass no-args constructor. But yeah, the serialisation mechanism could do with some work.

Jeremy Manson said...

Sigh, another blog post screaming "DI". If you have something to mock, just use a SPI approach (you write to an interface right?) supported directly in the language and avoid pulling in dependencies and polluting the code with unnecessary layers of indirection.

So, it turns out that the point of this post was not to push DI, it was to point out that constructors are a pain in the rear. DI has its problems, too. For example, the code becomes hideous.

I'm not a big fan of SPI, because you end up with a million Foo interfaces and a million FooImpl classes. It becomes very difficult to navigate the code, and people tend to give up on the FooImpl classes after a while. DI has some of this problem, too.

As I said at the bottom of the post, you can come up with as many straw man fixes as you like. Mostly, the fixes have problems.

Now what we really need are object initializers, a way to encapsulate instantiation and initialization without having to pollute setters with "return this". C# supports this beautifully and it composes very well.

Not having looked at C# much since it came out, how does this differ from my point about having initializer methods?

Jeremy Manson said...

Serialisable classes do not need a no-args constructor, and if they have one it is not called. What is necessary and is called is the most derived, non-Serializable superclass no-args constructor. But yeah, the serialisation mechanism could do with some work.

Sorry, yeah, it was late when I wrote this; much of it came out badly. Fixed.

Jeremy Manson said...

Sorry, totally disagree. Constructors and finals - love 'em!

In spite of the fact that you need Builder classes everywhere if you want real immutability for final fields? Or do you disagree that we actually need to do this?

Jeremy Manson said...


I'm not a big fan of SPI, because you end up with a million Foo interfaces and a million FooImpl classes. It becomes very difficult to navigate the code, and people tend to give up on the FooImpl classes after a while. DI has some of this problem, too.


Another issue is that DI works even if the code you are testing isn't written specifically to support it. SPI only works if the code supports SPI.

In either case, it doesn't invalidate the point that constructors bring pain -- it merely illustrates that there are multiple ways to get around that.

Casper Bang said...

"Not having looked at C# much since it came out, how does this differ from my point about having initializer methods?"

Because method chaining and having to return "this" explicitly in your API requires foresight which we often just don't have. I.e. it won't help us much in dealing with the existing 17.000 classes in the JRE.

Since C# supports the notion of properties in the language rather than as a is/get/set naming pattern, there's syntactical support for initialization when you instantiate:

Credentials o = new Credentials{
Username="michael",
Password="opensesame"};

It does not suffer from the readability and usability issue we have today with constructor overloading.

Jeremy Manson said...


Because method chaining and having to return "this" explicitly in your API requires foresight which we often just don't have. I.e. it won't help us much in dealing with the existing 17.000 classes in the JRE.


Ah. You could, as some have proposed, change void methods to return "this" by default. Your way is probably better.

Brian said...

Interestingly, what you propose about decoupled allocation and initialization is exactly how Objective-C gets around all this nastiness. When you create a new object of type Foo, you say:

[[Foo alloc] init];

Since the init method is just an instance method of your class, it can be overridden like any other method, and -- best of all -- its return value is explicit and polymorphic, so you can return any object you like (even, say, a cached one), even an instance of a subclass of Foo. It's similar to the "new Foo().init()" pattern you used in your post, but it's better integrated into the language. I really wish more languages did things this way.

Tim said...

freeze[...] That is closer to what I am imagining, but the hard part of that is enforcing the fact that you call it.

Enforcing that is not even the main problem. The main problem is that with freeze() your object has two very different states. Some operations are only possible in the first state, before the freeze(). Setting properties, for example. Many other operations are only possible after the initialization (when all properties have been set).

And then you have two alternatives:
1. The class implementor must check in every method that the object has the right state, and throw an exception otherwise. The class user must think about the object's state every time she writes code that invokes the method. This is basically the duck typing strategy (ok, for real duck typing, don't throw an exception, but do something undefined instead and hope that the unit tests catch the bug).

2. You add the notion of object states to the language. Each method implementation must declare for which states it is defined. And all references (in method arguments etc) must also declare the state(s) that the object may have. That's the statically typed approach to solve that problem.

These are basically the two options that you have.
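
Option 1 can be sketched in plain Java as follows. FreezableList is a hypothetical class; Java has no built-in freeze, so the state check is done by hand in every method, and this version makes no attempt at thread safety.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, for illustration only. Not thread-safe.
final class FreezableList {
    private final List<String> items = new ArrayList<>();
    private boolean frozen = false;

    /** Only legal in the first state, before freeze(). */
    void add(String item) {
        if (frozen) {
            throw new IllegalStateException("already frozen");
        }
        items.add(item);
    }

    /** Moves the object from the "building" state to the "published" state. */
    void freeze() {
        frozen = true;
    }

    /** Only legal in the second state, after freeze(). */
    String get(int index) {
        if (!frozen) {
            throw new IllegalStateException("not yet frozen");
        }
        return items.get(index);
    }
}
```

Every method starts with the same kind of guard, and nothing stops a caller from forgetting freeze() until the exception shows up at runtime.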

Jeremy Manson said...


Enforcing that is not even the main problem. The main problem is that with freeze() your object has two very different states. Some operations are only possible in the first state, before the freeze(). Setting properties, for example. Many other operations are only possible after the initialization (when all properties have been set).


That's right, and a very good point. This is one of those things that went through my mind when I was writing the post, but I neglected to mention.

tinou said...

so what's the consensus, are constructors worthless?

Jeremy Manson said...

so what's the consensus, are constructors worthless?

Ha! Well, now that you put it that way, I suppose that they can't be. Let's face it, we don't have a lot of other options.

Otherwise, I'm pretty sure that they have substantial problems. :)

Dimitris Andreou said...

>They are the one kind of instance method that can't be overridden.

Actually, constructors are static methods. People often confuse them with instance methods due to the particular syntax (i.e. absence of explicit "static" keyword in declaration).

jcobbers said...

jeremy, you sound like Misko, another googler (http://misko.hevery.com/). Have you two met? I think you'd have much to agree upon.

Here was a recent post of his (also on google developer blog) http://misko.hevery.com/2008/07/08/how-to-think-about-the-new-operator/

Peter Lawrey said...

On point 3, I don't find constructors a problem for custom serialization because I ignore them and use Unsafe.allocateInstance().
Still this adds to your point that constructors aren't useful for deserialization.

Chris Kentfield said...

This is an interesting discussion. I rarely see code that uses an instance initializer in an anonymous class, but you could do something like this to initialize your new object:

Customer john = new Customer() {{
setFirstName("John");
setLastName("Smith");
}};

Unfortunately you can't modify finals there, but you could use a protected setter to keep your property read only.

Michael Galpin said...

We have a large UI component library where each component has a data model. For each data model we have an interface and we always insist on the interface providing a sample implementation for testing. Something like

interface WidgetModel{
String getName();
Integer getAge();
static WidgetModel SAMPLE = new WidgetModel() {
public String getName() { return "Mike";}
public Integer getAge() { return 34;}
};
}

Obviously SAMPLE is just a naming convention and cannot be enforced at compile time. We have thought of using Annotations to enforce things via apt...

So I don't totally agree with your post, as point #1 can be mitigated and #2 and #3 aren't really problems with constructors. Immutability is generally restricted by the lack of expressiveness in Java (take a look at Scala for a good comparison) and serialization is a utility hack that opens up all kinds of issues.

However, it would be really nice to enforce (and maybe restrict) a constructor signature through an interface. Like in the interface above, it would be nice to make name required but age optional by stating that any implementation of the interface must include name as part of its signature.

Steven Shaw said...

There's definitely something wrong with ctors. They cannot be worthless though because how else would you create your objects?

Your suggestion of init methods can be done effectively now with no-arg ctors and init/setter methods. Of course, you don't get your final fields but you can't always have everything :).

I've thought about the whole ctors + serialisation/deserialisation thing before without coming up with anything satisfactory (even after much Googling). We need the services here of a Programming Languages Professor... The closest I came up with was http://gbracha.blogspot.com/2007/06/constructors-considered-harmful.html

I would add to your list that ctors cannot be inherited. If you subclass a class with many "convenience" ctors (with various parameters), you have to provide implementations for each of the ctors in your subclass which delegate to super. It's nasty.

Because of these factors I tend to favor setter injection with no-arg constructors, setter methods and non-final fields. I can see where the ctor-injection nazis are coming from but I don't feel it's the most pragmatic solution.

Misko said...

Hi Jeremy,

I very much liked reading what you had to say about constructors, and I totally agree. I have written something similar, about how important it is to think about "new": http://misko.hevery.com/2008/07/08/how-to-think-about-the-new-operator/

Misko

Jeremy Manson said...

@dimitris Actually, they are neither static nor instance, in the eyes of the JVM specification. They are "special". Isn't that nice?

@peter There's a difference between "not a problem" and "I use a non-standard hack to work around this".

@michael: You don't see this solution as a problem? It is not lovely to override all of the methods of a given class whenever you want to mock it. If nothing else, you need a separate Foo and FooImpl class for everything you might want to mock. I find that makes navigating code in an IDE horrible.

I am aware that immutability in Java and serialization are not beautiful, but that doesn't mean that they don't exist and shouldn't be used. In fact, immutability is pretty much necessary. You can't really say that constructors aren't a problem because there are problems with other parts of the language.

@steven You can't really use no-arg constructors to solve these problems, because you can't replace the call to new Foo(). You need a factory method to provide the new Foo(), and the factory method needs to be on an instance which can be mocked out. And then you need to hold your breath and jump through a few hoops.

Gilad does have some good thoughts on this. I think the real solution is not to use Java.

@misko Had I seen your blog first, I would definitely have pointed people there!

Anonymous said...

very few things in programming are absolute

that being said:
'properties' are evil
C# is evil

Anonymous said...

Interesting post. I agree constructors suck, but finding the right way to do constructors is insanely hard.

We've struggled with this problem while designing Fan and have yet to come up with a solution.

- Brian

pveentjer said...

Personally I prefer to use constructors as well (although default arguments and named arguments would be a welcome improvement).

With constructors you can use final fields; besides preventing JMM issues, this also helps me to understand a class better and to prevent stupid mistakes.

And constructors help me to maintain class invariants. I look at a lot of enterprise Java/Spring code and I see a lot of classes that don't have constructors but use setter injection. Often it doesn't really matter (for example injecting a EmployeeDao in an EmployeeService) because it is 'clear' that the EmployeeDao is never going to be updated.

But when classes get more complex and less standard, it is hard to figure out if a method only should be used for construction purposes or can be used after construction. This gets worse because a lot of 'agile' developers don't like to write Javadoc or usable Javadoc.

Since there are no better language constructs to solve these issues, constructors are a reasonable solution.

ps:
Nice also has dropped constructors from the language, this was one of the things I really disliked about the language..

Robert Konigsberg said...

Something I just realized doing someone's code review:

One thing you can unambiguously guarantee with 'new' that you can't with factories is that the instance is not null.

Jeremy Manson said...


Since there are no better language constructs to solve these issues, constructors are a reasonable solution.

Yes, the truth of this post is not that we can move away from constructors to something else tomorrow. It is that they aren't the ideal solution.

Robin Bygrave said...

Found this 'proposal' for effectively final fields ...

http://negev.wordpress.com/java-memory-brief/

The way I see it there are cases where you can't make some fields final and you want them to be... (for me it was calculating a field's values when circular relationships exist)... For this it would have been good to have support for effectively final fields.

Jeremy Manson said...

@Robin - I'm starting to think that safe publication of fields that are not final would be a useful thing to have.

Anonymous said...

Q - Is it possible for Guice to create a thread safe singleton which injects its members without constructors?

Anonymous said...

A - Yes, Guice under the hood uses DCL (double checked locking) and the volatile reference guarantees ordering.

Illya said...

Sorry, but I have to disagree with you on point (1), mocking. Have you looked into the [Easy]PowerMock framework (http://code.google.com/p/powermock/)? It pretty much removes any boundaries when it comes to mock object creation, and allows you to mock any methods (including static and final ones) and variables, including the choice of constructor during mock creation. I highly recommend checking it out, since it seems to void your point on constructors and mocking.
I have no opinion on (2), since I am used to programming in Java and am unaware of a problem ;).
I have to agree with (3): serialization and constructors are a really difficult point. However, I tend to think that serialization frameworks can be improved to make proper use of constructors during object creation instead of skipping them altogether.

Teve said...

"Top Three Reasons Why Constructors are Worthless"

It's an interesting observation. I'll explain this in a roundabout way.

I'll begin with the assumption that there is an element of the human condition that makes us seek some ubiquitous notion of truth. From my observations it appears that a limpet like addiction for this "truth" often serves to undermine us - both as individuals and as a society.

I tend to call this blind obsession for some arbitrary definition of truth "fact obsession".

I've observed this "fact obsession" in many places - from explanations of physics and economics where the blurring of "model" and "reality" seems commonplace, to belief systems such as the delightfully documented "cargo cult".

I assume this fact obsession - a need for a belief in "truth" - seems to result in people making assumptions presented as facts in areas they may be unaware, based upon almost unconscious extrapolations and simplifications.

Which brings me back to the observation on the statement "Top Three Reasons Why Constructors are Worthless" - on which I wouldn't normally comment, but am in this instance, as the blog post seems to support, rather than qualify, this statement.

The factual presentation of constructors being worthless without limit to domain or purpose seems to suggest that they are without merit.

Is this really the case?

Could it be considered that for some purposes that constructors are of some benefit, and there may be some instances where the author has observed where alternative strategies may be more appropriate?