Thursday, May 15, 2008

Would Type Inference help Java?

My former colleague Lars Westergren recently posted a blog entry (here) about type inference, asking whether type inference would actually be good for Java, and whether it would provide any benefits beyond just "less typing".

In short: no. Type inference would probably not do much more than save you some typing. But how much typing it would save you could definitely vary depending on the kind of type inference you added. The one version I would probably prefer is just a very simple hack to avoid writing out the generic type arguments. One simple way of doing that would be to allow an equals sign inside of the angle brackets. In that case you could do this:
List<=>      l  = new ArrayList<String>();
List<String> l2 = new ArrayList<=>();
Of course, you can do it on more complicated expressions:
List<Set<Map<Class<?>, List<String>>>> l = new ArrayList<=>();
This would save us some real pain in the definition of genericized types, and it wouldn't strip away much stuff you need for readability. In the above examples it would just strip away one duplication, and you don't need that duplication to read it correctly. The one case where it might be a little bit harder to read would be if you defined a variable and assigned it somewhere else. In that case the definition would need to carry the type information, so the instantiation would use the <=> syntax. I think that would be an acceptable price to reduce the verbosity of Java generics.
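For comparison, Java 7 later shipped something very close to the second form as the "diamond" `<>`, which infers the constructor's type arguments from the declared type. A minimal sketch, assuming Java 7 or later; the class and method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of any library.
class DiamondDemo {
    // The element type is written once, on the declaration;
    // the empty <> on the constructor call is filled in by the compiler.
    static List<String> names() {
        List<String> l = new ArrayList<>();
        l.add("Ola");
        return l;
    }

    // The deeply nested type from the post, spelled out only once.
    static List<Set<Map<Class<?>, List<String>>>> nested() {
        List<Set<Map<Class<?>, List<String>>>> l = new ArrayList<>();
        return l;
    }
}
```

Note that the diamond only covers the right-hand side of the assignment; it has no counterpart to the `List<=> l = ...` form.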

Another kind of type inference that would be somewhat useful is the kind added to C#, which is only local to a scope. That means there will be no type inference for member variables, method parameters or return values. Of course, that's the crux of Lars's question, since this kind of type inference potentially removes ALL type information in the current text, since you can do:
var x = someValue.DoSomething();
At this point there is no easy way for you to know what the type of x actually is. Reading it like this, it looks a bit frightening if you're used to Java type tags, but in fact this is not what you would see. In most cases you have a small method, maybe 5-15 lines of code, where x is being used in some way or another. In many cases you will see methods called on x, or x used as an argument to method calls. Both of these usages give you clues about what it might be, but in fact you don't always need to know what type it is. You just need to know what you can do with it. And that's exactly what Java interfaces represent. So for example, do you know what class you get back from Collections.synchronizedMap()? No, and you shouldn't need to know. What you do know is that it's something that implements Map, and the documentation says that it is synchronized, but that is it. The only thing you know about it is that you can use it as a map.
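To make the synchronizedMap point concrete, here is a small sketch. The class name InterfaceDemo and the map contents are made up for illustration; only Collections.synchronizedMap and the Map interface come from the JDK:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of any library.
class InterfaceDemo {
    static Map<String, Integer> counts() {
        // The concrete class behind this reference is an implementation
        // detail of Collections; the code only relies on the Map interface.
        Map<String, Integer> m = Collections.synchronizedMap(new HashMap<String, Integer>());
        m.put("x", 1);
        return m;
    }
}
```

Everything the caller does with the result goes through Map, so the code keeps working whatever wrapper class the JDK chooses to return.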

So in practice, the kind of type inference C# adds is actually quite useful, clean, and doesn't cause too much trouble - especially if you have one of those fancy IDEs that do method completion... =)
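For reference, Java itself gained this kind of local-only inference in Java 10 with the var keyword. A rough sketch of what it looks like, assuming Java 10 or later; VarDemo and the values in it are purely illustrative:

```java
import java.util.HashMap;

// Hypothetical demo class, not part of any library.
class VarDemo {
    static int demo() {
        // The inferred static type is HashMap<String, Integer>:
        // the concrete class on the right, not the Map interface.
        var ages = new HashMap<String, Integer>();
        ages.put("a", 30);

        // clone() is public on HashMap but is not declared on Map, so this
        // line compiles only because the inferred type is the concrete class.
        var copy = ages.clone(); // static type of copy: Object

        return copy.equals(ages) ? ages.size() : -1;
    }
}
```

As in C#, var here is limited to local variables; it cannot be used for fields, method parameters or return types.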

From another angle, there are some things that type inference could possibly do, but that you will never see in Java. For example, say that you assign a variable to something, and later you assign that variable to some other value. If these two values are of distinct types that don't overlap in the inheritance chain, you will usually get an error. But if you have an advanced type system, it will do unification for you. The basic versions will just find the most specific common supertype (the disjunction), but you can also imagine the compiler injecting a new type into your program that is the union of the two types in use. This will provide something similar to duck typing while still retaining some static type safety. If your type system allows multiple inheritance, the synthetic union type might even be a subclass of both the types in question.
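Java does perform a narrow version of the "find a common supertype" step when it infers the type argument of a generic method, though not for a reassigned variable. A small sketch of that limited case; UnifyDemo, choose and pickNumber are hypothetical names:

```java
// Hypothetical demo class, not part of any library.
class UnifyDemo {
    // A single type variable T has to fit both arguments,
    // so the compiler must find one type that covers them.
    static <T> T choose(boolean first, T a, T b) {
        return first ? a : b;
    }

    static Number pickNumber(boolean first) {
        // Integer and Double are unrelated to each other but share the
        // supertype Number, so the call type-checks against Number.
        return choose(first, Integer.valueOf(1), Double.valueOf(2.5));
    }
}
```

This is only the "basic version" from the paragraph above: the compiler picks an existing supertype rather than synthesizing a union type.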

So yeah. The long answer is that you can actually do some funky stuff with type inference that doesn't immediately translate to less typing. Although less typing and better abstractions are what programming languages are all about, right? Otherwise assembler provides everything we want.

18 comments:

Unknown said...

Maybe we have a different view of how type inference would work, because I think it would be a welcome addition. More in a bit.

First I want to point out one stupid thing about Java Generics and constructors:

// [] == <> due to blogger comment restrictions...

Map[String,Employee] empMap = new HashMap[String,Employee]();

This is quite possibly the most useless waste of typing that you can possibly do in Java 5. The generic parameters to the right of the assignment are absolutely, 100% useless, are immediately erased and do NOTHING.

When there's a constructor to the right of the assignment, it's just as meaningful to simply type:

Map[String,Employee] empMap = new HashMap();

There's no difference at all, save one. The compiler will toss a warning out, and most IDEs will figure this out in advance and highlight this as an unsafe assignment (which it is absolutely not). I gather it's because most IDEs and the compiler don't/won't/can't distinguish between:

Map[String,Employee] empMap = new HashMap();

//...VS...

Map[String,Employee] empMap = someUntypedMapLookup.getEmployeeMap();
//where getEmployeeMap() returns an untyped Map.

In the latter case we might quickly think it's fair for the compiler to highlight and show the warning. But here's the really stupid thing though: The compiler/IDE will warn you that it's an unsafe operation, and casting it here removes the warning. BUT!!! (you know what's coming).... No error is ever thrown here if you are casting an incompatible Map at this point. So why warn me? It's absolutely ridiculous.

Map getMapFullOfCatsAndDogs() {...}

Map[String,Employee] empMap = (Map[String,Employee]) getMapFullOfCatsAndDogs();

Works just fine until you actually attempt to use the map. And even then you could simply downcast it and use it inappropriately (which is actually appropriate to the contained type). The only time Java generics really provide meaningful information in Java is if both sides are typed:

Map[String,Pets] getMapFullOfCatsAndDogs() {...}
Map[String,Employee] empMap = (Map[String,Employee]) getMapFullOfCatsAndDogs();

...OR...

Map[String,Employee] empMap = new HashMap[String,Pets]();

That last case makes me want to down a bottle of vicodin, because the only reason there's a problem at all is that generics exist. And there's actually no consequence or problem here, save that someone decided to try to do the right thing in Java by typing as much as possible (probably to satisfy the IDE warning), and failed dramatically. It's a self-imposed problem!

Anyway, in both cases this does provide meaningful information, but is actually a compiler ERROR, not the same warning as before. I really think the earlier warning and any extra typing related to it is a waste of characters in source code. Currently in Java, there's nothing wrong with:

Map[String,Employee] empMap = new HashMap();

Unless Sun has eliminating type erasure among their future plans.
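A rough sketch of the erasure behaviour described above. The class ErasureDemo and its helper methods are hypothetical, written only to illustrate the point:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not part of any library.
class ErasureDemo {
    // Returns a raw (untyped) map that holds a non-String value.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static Map rawMapOfNumbers() {
        Map m = new HashMap();
        m.put("answer", Integer.valueOf(42));
        return m;
    }

    // Type arguments are erased, so both maps share one runtime class.
    static boolean sameRuntimeClass() {
        return new HashMap<String, String>().getClass()
                == new HashMap<Integer, Integer>().getClass();
    }

    // Returns true if the wrong cast is only detected when a value is read.
    @SuppressWarnings("unchecked")
    static boolean failsOnlyOnUse() {
        // Unchecked cast: a compiler warning, but no error at runtime here.
        Map<String, String> typed = (Map<String, String>) rawMapOfNumbers();
        try {
            // The compiler-inserted cast to String fails at this point.
            String s = typed.get("answer");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The cast itself succeeds silently; the ClassCastException only shows up once a value is pulled out and treated as a String.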

That said, I really think type inference would help, and I'd be more than happy to reverse that typing to:

var empMap = new HashMap[String,Employee]();

There's no reason it needs to be any more complicated than that! None! The compiler should just copy the type on the right side of the assignment, thus:

var empMap = new HashMap[String,Employee]();

//...would be equivalent to typing....

HashMap[String,Employee] empMap = new HashMap[String,Employee]();

Yeah, sure, we're no longer coding against the interface. But pretty soon Java and Java developers are going to have to stop all of the abstraction arm waving that basically is only useful for bragging at parties. (Do Java developers even have parties anymore? You know, where they actually talk about Java and not JavaFX?)

Type inference is only useful in local method variables, and thus there need not be any risk of leaking implementation details into the class definition, which is where abstraction actually matters (and where Ruby is weak and Groovy's optional typing is a welcome feature). The var keyword would of course not be allowed on member fields or method signatures (!).

There are about 10 things I think could be added to the Java compiler without any consequences, only improvements, to the code quality of a typical Java program. I'll save the rest for some other day. Perhaps it warrants a blog post of my own -- I wish I was as committed a blogger as you, Ola!

Cheers,
Clinton

Lars Westergren said...

I am constantly amazed at how quickly you type, Ola. Three big essays in a short while, just hours after I posted mine.

I really am going to have to polish my posts more if you are going to link to them for all the world to see.
;)

Anyway, thanks for the great answer.

Kjetil said...

Instant type inference:

Map[String, Employee] = Maps.new();

Kjetil said...

Ok, I really meant

Map[String, Employee] map = Maps.new();
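Since new is a reserved word, a method literally named Maps.new() would not compile, but the idea works with any other name. A minimal sketch, assuming a hypothetical utility class Maps with a static method newHashMap; the compiler infers K and V from the assignment target:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical utility class; the names Maps and newHashMap are assumptions.
class Maps {
    // K and V never appear in the arguments, so they are inferred
    // from the type of the variable the result is assigned to.
    static <K, V> Map<K, V> newHashMap() {
        return new HashMap<K, V>();
    }
}

// Hypothetical demo class showing the call site.
class MapsDemo {
    static Map<String, Integer> build() {
        Map<String, Integer> map = Maps.newHashMap(); // no type arguments repeated
        map.put("one", 1);
        return map;
    }
}
```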

Felix Leipold said...

Type inference for local variables would really help. Not only in writing code but especially in reading it. To quote from the Linux kernel coding style:


Functions should be short and sweet, and do just one thing. They should fit on one or
two screenfuls of text (the ISO/ANSI screen size is 80×24, as we all know), and do one
thing and do that well.


The local variable declarations make this almost impossible to achieve, especially when using generics and useful variable names.

mernen said...

Regarding the question, as you said, it depends on how much you're willing to infer. In a realistic situation, not much, definitely: Java is intentionally verbose so you (purportedly) know the type of variables right in the source code. So, people shouldn't expect "var" in Java anytime soon, and I kind of understand this vision.

Now, Ola, for your suggestion... why the equals sign at all? I'm asking because there are some ideas floating on the web to use just:

Map<String,Employee> empMap = new HashMap<>();

I like that, and I don't see a need for any special character in there. It's explicit enough to avoid ambiguities (and thus please Java's mindset), introduces no regressions, and saves enough dull typing to be a welcome feature.

Anonymous said...

@mernen (and Ola I suppose)

RE: Map[String,Employee] empMap = new HashMap[]();

This doesn't do anything. Nothing at all. It has no impact whatsoever. It's exactly the same as:

Map[String,Employee] empMap = new HashMap();

or

Map[String,Employee] empMap = new HashMap[String,Employee]();

It's just tossed. To screw this up you'd literally have to type:

Map[String,Employee] empMap = new HashMap[Employee,String](); //reversed

Which is only a risk if you type it at all... and is about the same class of error as:

String s = new Employee();

So just don't type it!

Clinton

mernen said...

Clinton, point is, this <> syntax is not even valid nowadays, so no code would be broken with that kind of inference, and it would still work, should the JVM eventually support true generics (something that's also proposed for Java 7).

Anonymous said...

mernen, I said the very same thing in an earlier comment. I understand the reasoning. I'm saying it's the wrong direction to take Java IMHO... if it's not already too far gone.

There's no reason it needs to be any more than simply:

var map = new HashMap[String,Person]();

For local variables there's no need for any abstraction beyond that. And if they support true generics in the future, that doesn't change the effectiveness here. It all still works great. We don't need more angle brackets.

Clinton

Anonymous said...

@Clinton

I've read in some post about improved type inference (unfortunately I don't remember the URL; gotta look through my bookmarks) that the problem with a declaration like

var map = new HashMap[String,Person]();

is that you are forced to use class types instead of interface types, i.e. the variable on the LHS is of type HashMap in this case. The compiler can't possibly know the type because it could as well be Serializable or Cloneable.

Anyway, I agree that Java desperately needs an improved type system for reducing the verbosity of parameterized type declarations. Perhaps even a bit more, i.e. type inference on method/constructor arguments or the receiver of a method call. I'd prefer that kinda change to almost all of the other proposals for Java 7 out there.

Cheers, Martin

Unknown said...

@martin...

I thought I was quite clear about this before, but reading back, I guess I need to be more clear:

*It doesn't matter.*

Use the concrete class, forget the interface. Oh my! Did he just say that?! Yeah, I did.

We Java developers have to start getting over this obsession with abstraction. It is the root cause of a great deal of pain on the Java platform and it's where everything from Microsoft .NET to Ruby is killing Java.

I do use interfaces where they make sense. For example, on exposed class definition elements: return values, parameters of methods, and generally for member fields, even though they're not exposed.

But are you kidding me? What is the risk of using a concrete HashMap type as local variable in a method?

* How big are your methods?
* How often do you use anything other than a HashMap anyway?
* What is the cost of having to switch from a HashMap to a TreeMap?

Repeat those questions for ArrayList and HashSet.

Can you even name a public method on any concrete collection class that is not part of the corresponding interface? (no cheating)

Come on. It's time the Java Platform graduates from University and enters the practical world.

Cheers,
Clinton

Charles Oliver Nutter said...

clinton: I'm not sure what you mean about Ruby being weak...weak at what exactly? Weak because there's no way to specify types?

Unknown said...

Hi Charles,

>> Weak because there's no
>> way to specify types?

Yes. While I know it's entirely possible to create a custom type system for Ruby, it would be nice if it was somehow made part of the core. Groovy's optional typing is really nice.

While I don't miss typing my local variables inside methods, I think that it is important to allow for typing when defining a class. After all, we're defining a *type*, but in Ruby that means a class name and a bunch of untyped attributes, method params and return values. That's not much of a type definition, it's little more than a fancy hashtable. :-)

This doesn't help those of us interested in writing, say, an ORM or similar mapping framework. The lack of a single boolean type is rough too... I'll never understand that one.

So I'm starting to see little frameworks popping up that allow:

class Foo
properties {
integer :id
string :name
boolean :active
decimal :average,
{:access => :read_only}
complex :bar, Bar
}
end

This would define the accessor/mutator for each property and also create metadata that can be introspected to allow for more intelligent mappings between Ruby and databases, XML, web services, or whatever.

The problem compounds when you consider the possibility of immutable classes whereby a mapping framework would inject initialization values into the constructor (with corresponding read-only accessors). This is not impossible, but certainly less fun if the mapping framework needs to know the types of the constructor arguments (not uncommon).

So yeah, I think in this regard, Ruby would do well to support a very simple optional typing system.

Really, how hard would it be? What would the consequences be?

Cheers,
Clinton

Anonymous said...

Charles,

>>
I thought I was quite clear about this before, but reading back, I guess I need to be more clear:
>>

Doh! Apologies. I missed that paragraph from your first post... I really have to read more carefully. Anyway, you don't have to resort to sarcasm. Just point me to my mistake and I will quite happily say sorry. I was just reciting an article I've read and wanted to add something to the discussion...

Apart from that, I really don't understand people's constant and increasing annoyance with Java. Ok, I admit I would be quite happy if the current type system improved in such a way as to be able to omit the superfluous second occurrence of type arguments in the generic constructor invocation. But that's about it. (Ok, some additional minor improvements would be OK as well).

Why do you want to change the language to something that it has never been designed to be? Just because all the hip new languages can do things Java can't? Just because Scala has built-in super type inference that allows it to be statically typed while looking like a dynamic language at the same time? Now Java should have it, too? Sorry, I can't relate to that kinda thinking. I think Java's a great language. It serves me well and with my IDE's help I'm able to code pretty fast, too. I like that Java programs must declare the type of a variable. It makes my code clearer and it makes reading other people's code easier. That's my firm opinion.

That does not mean that I think Java's the only way to go. As it happens, I incorporated Groovy in my current project for some tasks (e.g. XML reading/writing) for which it is better suited than Java, because of its concise syntax and new ideas. I became quite fond of Groovy and I will definitely use it for future projects. I have not looked into JRuby yet, but that's one of the next things I'm gonna do.

>>
I do use interfaces where they make sense. For example, on exposed class definition elements: return values, parameters of methods, and generally for member fields, even though they're not exposed.
>>

YOU do use interfaces where they make sense. But that doesn't mean any other developer will do the same if they got the choice...

>>
But are you kidding me? What is the risk of using a concrete HashMap type as local variable in a method?
>>

I'm not kidding. I just don't like it. Anyway, there's no risk I can think of right now. You're right that changing the type of Map is quite uncommon. And if it happens you can just search & replace. But again, my point is: Why on earth does Java need that "feature"? Your syntax would even introduce a new keyword...

>>
Can you even name a public method on any concrete collection class that is not part of the corresponding interface? (no cheating)
>>

No, I can't. Are there any at all?

>>
Come on. It's time the Java Platform graduates from University and enters the practical world.
>>

Then use a different language. There are more than enough to choose from...

Cheers, Martin

Unknown said...

>> Just point me
>> to my mistake and I will quite
>> happily say sorry.

I wasn't being flippant there. I really did think I was clear, but I really wasn't. The statement I made was:

"Yeah, sure, we're no longer coding against the interface. But pretty soon Java and Java developers are going to have to stop all of the abstraction arm waving that basically is only useful for bragging at parties."

I see now that it wasn't clear enough.

>> Then use a different language.
>> There are more than enough to
>> choose from...

Oh trust me, I do when it's up to me, which is unfortunately rare.

So I will reserve the right to express my dissent about the language which I've invested 10 years of my career into. :-)

Clinton

Anonymous said...

Clinton,

Sorry, I probably took your statement too personally. As I said, I just didn't realize that you wrote essentially the same thing. If I had, your intention would have been clear ;) even though I can't relate... just yet.

>>
So I will reserve the right to express my dissent about the language which I've invested 10 years of my career into. :-)
>>

You've got every right! I didn't mean to question that. I don't have nearly as much experience, both in projects and in years, as you, and maybe I'll change my opinion someday and wish Java went to hell ;)

Cheers, Martin

Unknown said...

Hey Martin,

No worries. To help you along on the road to fully realized Java anger, here's a more complete opinion (blogspam, sorry):

http://www.clintonbegin.com/2008/02/clintons-java-5-rant.html

There are also followup posts regarding backward compatibility (Backward Compatibility be Damned) and my history with Java zealotry (Party like it's 2002).

I'm probably going to switch my blog to a wiki soon (with comments), so we can turn my general rants and opinions into a working copy of how Java can be made better.

Cheers,
Clinton

GM said...

Less typing is far more important than it sounds, because it means fewer mistakes. Unlike me or you, the type inferencer will always infer the most general type. Less _re_typing also means easier refactoring, which is a big benefit.