In my last post, I made an offhand reference to the fact that object constructors are worthless in Java. I was asked why this is, so I thought I'd fill in the details.
This topic is a little better understood by the developer community than some of the concurrency tidbits that I usually discuss. People who use dependency injection toolkits like Guice and Spring's inversion of control tend to have very strong feelings on the advanced suckitude of constructors. I know quite a few developers who claim that use of Guice changed the way they code (for the better).
ETA: I've had a few people tell me just to use SPI. The point of this post is not that DI is wonderful, it is that constructors are awful. SPI doesn't stop constructors from being awful. In fact, it is yet another horrible solution, what with all of those Foo and FooImpl classes. If you like SPI, go ahead and use it. Furthermore, neither DI nor SPI addresses two of the three issues I mentioned.
So, without further ado:
- They are the one kind of instance method that can't be overridden. This is the major pain point for most people using constructors. For testing purposes, programmers like to create mock objects, which duplicate the functionality of the objects you are not testing. They are usually created because the objects they represent aren't easy to construct properly in a unit test, either because they are non-deterministic, or require interaction with a client, or are simply too complex to create easily.
One of the major difficulties of creating mock instances in Java is that the way we create objects is to call the constructor:
GiantNetworkedDatabaseConnector connector =
  new GiantNetworkedDatabaseConnector(); // connects to the giant networked database.
You don't really want to connect to the giant networked database when you are unit testing your code. The typical approach is to abstract the method into a factory:
GiantNetworkedDatabaseConnector connector =
  GiantNetworkedDatabaseConnector.newConnector();
The newConnector code returns an object. But the method is a static one, which means that you can't replace that at runtime, either. You actually need a static field to hold the factory that makes the connector:
GiantNetworkedDatabaseConnector connector =
  GiantNetworkedDatabaseConnector.getFactory().newConnector();
and then you need to set the factory in some other part of the program:
GiantNetworkedDatabaseConnector.setFactory(mockFactory);
and then you need to hope and pray that there aren't multiple threads trying to inject their own factories.
Getting around this is, of course, the stock-in-trade of the dependency injection frameworks mentioned above. Guice and Spring tend to do what they do very well. It is just completely awful that they have to do it.
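For illustration, here is a minimal hand-written sketch of the kind of wiring those frameworks automate: the factory becomes an ordinary object that is passed in, so a test can hand over its own. The names `Connector`, `ReportService`, and `query` are hypothetical stand-ins invented for this example; only `java.util.function.Supplier` is a real library type.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for GiantNetworkedDatabaseConnector.
interface Connector {
    String query(String sql);
}

// The class under test never calls `new` on a connector itself;
// it asks an injected factory for one.
class ReportService {
    private final Supplier<Connector> connectorFactory;

    ReportService(Supplier<Connector> connectorFactory) {
        this.connectorFactory = connectorFactory;
    }

    String report() {
        Connector connector = connectorFactory.get();
        return "report: " + connector.query("SELECT 1");
    }
}

class InjectionSketch {
    public static void main(String[] args) {
        // In a test, the factory hands back a mock: no network, no static state.
        ReportService service = new ReportService(() -> sql -> "mock result");
        System.out.println(service.report()); // prints "report: mock result"
    }
}
```

Note that this only moves the problem: something, somewhere, still has to call the real constructor and pass the real factory in.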
In short: the constructor is just like a static method. You can't override it properly. You can't test it properly. Constructors bring pain.
- All immutable state must be set up front. I spent the last few blog entries describing how and why immutable state is good. But all immutable state must be set in the constructor. The degree to which this is terrible doesn't necessarily occur to people who are used to programming in Java. You just have to stop, take a step back, and realize that having separate StringBuilder and String classes is violently counterintuitive.
Why do these two classes exist? Because there are actually two parts to String creation:
  - Generate the String, and
  - Publish it.
This problem isn't limited to Strings, of course. A minute's hard thought will make you think of a dozen other possible uses. Code bases tend to have lots of other Builder classes; I probably write a couple a month.
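As a rough sketch of what such a Builder tends to look like, here is an immutable class paired with a mutable twin that accumulates state and then pushes all of it through the constructor at once. `Route`, its fields, and its method names are hypothetical, made up purely for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical immutable class: every field is final and set in the constructor.
final class Route {
    private final String name;
    private final List<String> stops;

    private Route(String name, List<String> stops) {
        this.name = name;
        // Defensive copy, so later changes to the builder can't leak in.
        this.stops = Collections.unmodifiableList(new ArrayList<>(stops));
    }

    String name() { return name; }
    List<String> stops() { return stops; }

    // The mutable twin, playing the role StringBuilder plays for String.
    static final class Builder {
        private String name = "";
        private final List<String> stops = new ArrayList<>();

        Builder name(String name) { this.name = name; return this; }
        Builder addStop(String stop) { stops.add(stop); return this; }

        // "Publish": the accumulated state goes through the constructor in one shot.
        Route build() { return new Route(name, stops); }
    }
}

class BuilderSketch {
    public static void main(String[] args) {
        Route route = new Route.Builder()
                .name("A-line").addStop("North").addStop("South").build();
        System.out.println(route.name() + " " + route.stops()); // prints "A-line [North, South]"
    }
}
```

Every field and every setter is written twice, once per class, which is exactly the duplication being complained about.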
In short: immutable objects can't be constructed properly when you can't just shove all of the built state into the constructor. Constructors bring pain.
- They make custom serialization and deserialization more painful. I'm not sure that I can do this topic justice in a single blog post. For a good overview of why serialization is awful, go purchase and read Effective Java.
A brief precis: when you do deserialization, you need an instance of the object so that you can deserialize into it. The way this works is that the default constructor (the one with no arguments) of the first non-serializable superclass is invoked. (ETA: I phrased this incorrectly at first, so I've edited it.) This means that a) you need a visible no-arg constructor on that class if something might ever get serialized (which is hard to predict), and b) that constructor might get called.
That's just for the default mechanism, of course. For anything that requires custom deserialization (i.e., anything non-trivial), you have to use the magic readObject method, which is not a constructor. That method, of course, doesn't have the semantics of a constructor, so it can't set final fields (as discussed in my post on deserialization and reflection).
Serialization is, of course, in general, completely awful. It is no surprise that people turned to XML. At Google, we use a more compact system called protocol buffers, which are still rather hacktackular, but definitely better than serialization. Serialization is awful, and constructors make it worse. Constructors bring pain.
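The constructor behavior described in the serialization point can be observed with a small round trip through the default mechanism. This is only a sketch: `Base`, `Point`, and the call counters are hypothetical names invented for the demonstration, and it covers default serialization only, not a custom readObject.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable: its no-arg constructor runs again during deserialization.
class Base {
    static int baseConstructorCalls = 0;
    Base() { baseConstructorCalls++; }
}

// Serializable: its own constructor is skipped during deserialization.
class Point extends Base implements Serializable {
    private static final long serialVersionUID = 1L;
    static int pointConstructorCalls = 0;
    final int x;
    Point(int x) { this.x = x; pointConstructorCalls++; }
}

class SerializationSketch {
    // Serializes to a byte array and reads the object straight back.
    static Point roundTrip(Point p) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(p);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Point) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Point copy = roundTrip(new Point(7));
        System.out.println(copy.x);                      // prints 7
        System.out.println(Base.baseConstructorCalls);   // prints 2 (once for new, once for deserialization)
        System.out.println(Point.pointConstructorCalls); // prints 1 (deserialization never called it)
    }
}
```

The final field `x` does get restored here, but by the serialization machinery behind the scenes rather than by any code in `Point`.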
At one level removed, a big part of the problem is that object construction and initialization are tied together. Imagine if you could construct an object with all of its fields set to null or zero, and then call as many initializer methods as you wanted:
- Client code could create the object and could override the initializer methods, thereby neatly avoiding the problem of not being able to override the initializer. Even better, you could specify the type you wanted created at runtime:
GiantNetworkedDatabaseConnector buildConnector(
    Class<? extends GiantNetworkedDatabaseConnector> c) {
  // This is impossible with standard Java bytecode.
  return new c.init();
}
// MockGiantNetworkedDatabaseConnector overrides the init() method
GiantNetworkedDatabaseConnector c =
  buildConnector(MockGiantNetworkedDatabaseConnector.class);
- You could call initializer methods, and have them set immutable state (possibly using the notion of freeze to make an object immutable):
String s = new String.append("foo").append("bar");
assert s.equals("foobar");
freeze s;
- You wouldn't have to worry about serialization calling the no-argument constructor, because the no-argument constructor wouldn't do anything. readObject could have initializer semantics, and be able to set final fields.
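For comparison, the nearest approximation available in today's Java is probably something like the sketch below: a do-nothing no-arg constructor, an overridable init() method, and reflection to pick the type at runtime. `DatabaseConnector`, `MockDatabaseConnector`, and `target` are hypothetical names for this example. Note what is lost: the field can no longer be final.

```java
// Hypothetical base type with a do-nothing constructor and an overridable initializer.
class DatabaseConnector {
    protected String target; // cannot be final: it is set after construction

    DatabaseConnector() { }

    DatabaseConnector init() {
        target = "giant networked database";
        return this;
    }

    String target() { return target; }
}

class MockDatabaseConnector extends DatabaseConnector {
    @Override
    DatabaseConnector init() {
        target = "in-memory mock";
        return this;
    }
}

class InitializerSketch {
    // Picks the concrete type at runtime, then runs its (overridable) initializer.
    static <T extends DatabaseConnector> DatabaseConnector buildConnector(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance().init();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable no-arg constructor on " + type, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildConnector(MockDatabaseConnector.class).target()); // prints "in-memory mock"
    }
}
```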
There isn't much of a moral to this story, other than that constructors bring pain.
Comments
Dependency injection is simply good design. There's no reason why the database connection factory method can't be an instance method on an interface, with an instance of that object just "passed in from the top" to the places where it's needed.
Now what we really need are object initializers, a way to encapsulate instantiation and initialization without having to pollute setters with "return this". C# supports this beautifully and it composes very well.
So, it turns out that the point of this post was not to push DI, it was to point out that constructors are a pain in the rear. DI has its problems, too. For example, the code becomes hideous.
I'm not a big fan of SPI, because you end up with a million Foo interfaces and a million FooImpl classes. It becomes very difficult to navigate the code, and people tend to give up on the FooImpl classes after a while. DI has some of this problem, too.
As I said at the bottom of the post, you can come up with as many straw man fixes as you like. Mostly, the fixes have problems.
"Now what we really need are object initializers, a way to encapsulate instantiation and initialization without having to pollute setters with "return this". C# supports this beautifully and it composes very well."
Not having looked at C# much since it came out, how does this differ from my point about having initializer methods?
Sorry, yeah, it was late when I wrote this; much of it came out badly. Fixed.
In spite of the fact that you need Builder classes everywhere if you want real immutability for final fields? Or do you disagree that we actually need to do this?
"I'm not a big fan of SPI, because you end up with a million Foo interfaces and a million FooImpl classes. It becomes very difficult to navigate the code, and people tend to give up on the FooImpl classes after a while. DI has some of this problem, too."
Another issue is that DI works even if the code you are testing isn't written specifically to support it. SPI only works if the code supports SPI.
In either case, it doesn't invalidate the point that constructors bring pain -- it merely illustrates that there are multiple ways to get around that.
Because method chaining and having to return "this" explicitly in your API requires foresight which we often just don't have. I.e. it won't help us much in dealing with the existing 17.000 classes in the JRE.
Since C# supports the notion of properties in the language rather than as a is/get/set naming pattern, there's syntactical support for initialization when you instantiate:
Credentials o = new Credentials{
Username="michael",
Password="opensesame"};
It does not suffer from the readability and usability issue we have today with constructor overloading.
"Because method chaining and having to return "this" explicitly in your API requires foresight which we often just don't have. I.e. it won't help us much in dealing with the existing 17.000 classes in the JRE."
Ah. You could, as some have proposed, change void methods to return "this" by default. Your way is probably better.
[[Foo alloc] init];
Since the init method is just an instance method of your class, it can be overridden like any other method, and -- best of all -- its return value is explicit and polymorphic, so you can return any object you like (even, say, a cached one), even an instance of a subclass of Foo. It's similar to the "new Foo().init()" pattern you used in your post, but it's better integrated into the language. I really wish more languages did things this way.
Enforcing that is not even the main problem. The main problem is that with freeze() your object has two very different states. Some operations are only possible in the first state, before the freeze(). Setting properties, for example. Many other operations are only possible after the initialization (when all properties have been set).
And then you have two alternatives:
1. The class implementor must check in every method that the object has the right state, and throw an exception otherwise. The class user must think about the object's state every time she writes code that invokes the method. This is basically the duck typing strategy (ok, for real duck typing, don't throw an exception, but do something undefined instead and hope that the unit tests catch the bug).
2. You add the notion of object states to the language. Each method implementation must declare for which states it is defined. And all references (in method arguments etc) must also declare the state(s) that the object may have. That's the statically typed approach to solve that problem.
These are basically the two options that you have.
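A minimal sketch of the first option, with the state check done at runtime in every method, might look like this. `FreezableText` and its methods are hypothetical names for this example, and the plain boolean flag is an assumption of the sketch: it provides none of the memory-model guarantees that real final fields get.

```java
// Hypothetical "freezable" object: mutable until freeze(), read-only afterwards.
class FreezableText {
    private final StringBuilder content = new StringBuilder();
    private boolean frozen = false;

    FreezableText append(String s) {
        // Initializer-style methods are only legal before the freeze.
        if (frozen) throw new IllegalStateException("already frozen");
        content.append(s);
        return this;
    }

    FreezableText freeze() {
        frozen = true;
        return this;
    }

    String value() {
        // Readers are only legal after the freeze.
        if (!frozen) throw new IllegalStateException("not frozen yet");
        return content.toString();
    }
}

class FreezeSketch {
    public static void main(String[] args) {
        FreezableText text = new FreezableText().append("foo").append("bar").freeze();
        System.out.println(text.value()); // prints "foobar"
        try {
            text.append("baz");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage()); // prints "rejected: already frozen"
        }
    }
}
```

The second option has no direct equivalent in plain Java; the usual stand-in is two separate types, which is the Builder arrangement again.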
"Enforcing that is not even the main problem. The main problem is that with freeze() your object has two very different states. Some operations are only possible in the first state, before the freeze(). Setting properties, for example. Many other operations are only possible after the initialization (when all properties have been set)."
That's right, and a very good point. This is one of those things that went through my mind when I was writing the post, but I neglected to mention.
Ha! Well, now that you put it that way, I suppose that they can't be. Let's face it, we don't have a lot of other options.
Otherwise, I'm pretty sure that they have substantial problems. :)
Actually, constructors are static methods. People often confuse them with instance methods due to the particular syntax (i.e. absence of explicit "static" keyword in declaration).
Here was a recent post of his (also on google developer blog) http://misko.hevery.com/2008/07/08/how-to-think-about-the-new-operator/
Still this adds to your point that constructors aren't useful for deserialization.
Customer john = new Customer() {{
setFirstName("John");
setLastName("Smith");
}};
Unfortunately you can't modify finals there, but you could use a protected setter to keep your property read only.
interface WidgetModel{
String getName();
Integer getAge();
static WidgetModel SAMPLE = new WidgetModel() {
public String getName() { return "Mike";}
public Integer getAge() { return 34;}
};
}
Obviously SAMPLE is just a naming convention and cannot be enforced at compile time. We have thought of using Annotations to enforce things via apt...
So I don't totally agree with your post, as point #1 can be mitigated and #2 and #3 aren't really problems with constructors. Immutability is generally restricted by the lack of expressiveness in Java (take a look at Scala for a good comparison) and serialization is a utility hack that opens up all kinds of issues.
However, it would be really nice to enforce (and maybe restrict) a constructor signature through an interface. Like in the interface above, it would be nice to make name required but age optional by stating that any implementation of the interface must include name as part of its signature.
Your suggestion of init methods can be done effectively now with no-arg ctors and init/setter methods. Of course, you don't get your final fields but you can't always have everything :).
I've thought about the whole ctors + serialisation/deserialisation thing before without coming up with anything satisfactory (even after much Googling). We need the services here of a Programming Languages Professor... The closest I came up with was http://gbracha.blogspot.com/2007/06/constructors-considered-harmful.html
I would add to your list that ctors cannot be inherited. If you subclass a class with many "convenience" ctors (with various parameters), you have to provide implementations for each of the ctors in your subclass which delegate to super. It's nasty.
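A small sketch of that delegation boilerplate, using hypothetical `Widget` and `FancyWidget` classes invented for the example:

```java
class Widget {
    final String name;
    final int size;

    Widget(String name, int size) { this.name = name; this.size = size; }
    Widget(String name) { this(name, 1); }  // convenience constructor
    Widget() { this("unnamed"); }           // convenience constructor
}

class FancyWidget extends Widget {
    // None of Widget's constructors are inherited; each one is re-declared
    // here only to delegate to super.
    FancyWidget(String name, int size) { super(name, size); }
    FancyWidget(String name) { super(name); }
    FancyWidget() { super(); }
}

class InheritanceSketch {
    public static void main(String[] args) {
        FancyWidget w = new FancyWidget("knob");
        System.out.println(w.name + " " + w.size); // prints "knob 1"
    }
}
```

Delete any one of the three `FancyWidget` constructors and the matching `new FancyWidget(...)` call stops compiling, even though `Widget` still supports it.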
Because of these factors I tend to favor setter injection with no-arg constructors, setter methods and non-final fields. I can see where the ctor-injection nazis are coming from but I don't feel it's the most pragmatic solution.
Very much like reading what you had to say about the constructors and I totally agree. I have written something similar, how important is to think about the "new" http://misko.hevery.com/2008/07/08/how-to-think-about-the-new-operator/
Misko
@peter There's a difference between "not a problem" and "I use a non-standard hack to work around this".
@michael: You don't see this solution as a problem? It is not lovely to override all of the methods of a given class whenever you want to mock it. If nothing else, you need a separate Foo and FooImpl class for everything you might want to mock. I find that makes navigating code in an IDE horrible.
I am aware that immutability in Java and serialization are not beautiful, but that doesn't mean that they don't exist and shouldn't be used. In fact, immutability is pretty much necessary. You can't really say that constructors aren't a problem because there are problems with other parts of the language.
@steven You can't really use no-arg constructors to solve these problems, because you can't replace the call to new Foo(). You need a factory method to provide the new Foo(), and the factory method needs to be on an instance which can be mocked out. And then you need to hold your breath and jump through a few hoops.
Gilad does have some good thoughts on this. I think the real solution is not to use Java.
@misko Had I seen your blog first, I would definitely have pointed people there!
that being said:
'properties' are evil
C# is evil
We've struggled with this problem while designing Fan and have yet to come up with a solution.
- Brian
With constructors you can use final fields; besides preventing JMM issues, this also helps me to understand a class better and to prevent stupid mistakes.
And constructors help me to maintain class invariants. I look at a lot of enterprise Java/Spring code and I see a lot of classes that don't have constructors but use setter injection. Often it doesn't really matter (for example injecting an EmployeeDao in an EmployeeService) because it is 'clear' that the EmployeeDao is never going to be updated.
But when classes get more complex and less standard, it is hard to figure out whether a method should only be used for construction purposes or can be used after construction. This gets worse because a lot of 'agile' developers don't like to write Javadoc or usable Javadoc.
Since there are no better language constructs to solve these issues, constructors are a reasonable solution.
ps:
Nice also has dropped constructors from the language, this was one of the things I really disliked about the language..
One thing you can unambiguously guarantee with 'new' that you can't with factories is that the instance is not null.
"Since there are no better language constructs to solve these issues, constructors are a reasonable solution."
Yes, the truth of this post is not that we can move away from constructors to something else tomorrow. It is that they aren't the ideal solution.
http://negev.wordpress.com/java-memory-brief/
The way I see it there are cases where you can't make some fields final and you want them to be... (for me it was calculating a fields values when circular relationships exist)... For this it would have been good to have support for effectively final fields.
I have no opinion on (2), since I am used to programming in Java and am unaware of a problem ;).
I have to agree with (3): serialization and constructors is a really difficult point. However, I tend to think that serialization frameworks can be improved to make proper use of constructors during object creation instead of skipping them altogether.
It's an interesting observation. I'll explain this in a roundabout way.
I'll begin with the assumption that there is an element of the human condition that makes us seek some ubiquitous notion of truth. From my observations it appears that a limpet-like addiction to this "truth" often serves to undermine us - both as individuals and as a society.
I tend to call this blind obsession for some arbitrary definition of truth "fact obsession".
I've observed this "fact obsession" in many places - from explanations of physics and economics where the blurring of "model" and "reality" seems commonplace, to belief systems such as the delightfully documented "cargo cult".
I assume this fact obsession - a need for a belief in "truth" - seems to result in people making assumptions presented as facts in areas they may be unaware, based upon almost unconscious extrapolations and simplifications.
Which brings me back to the observation on the statement "Top Three Reasons Why Constructors are Worthless" - of which I wouldn't normally comment, but am in this instance as the blog post seems to support, rather than qualify this statement.
The factual presentation of constructors being worthless without limit to domain or purpose seems to suggest that they are without merit.
Is this really the case?
Could it be considered that for some purposes that constructors are of some benefit, and there may be some instances where the author has observed where alternative strategies may be more appropriate?