Java Generics II

In Part I, I showed that generics have no effect on the byte-codes generated by the compiler, and that for the getters of the collection classes the only tangible benefit is that you no longer have to add a cast to the get expression. So something like this:

becomes this:
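As a rough sketch of the two forms side by side (the class name, map contents, and key are invented purely for illustration):

```java
import java.util.HashMap;
import java.util.Map;

class GetterCastDemo {
    // Pre-generics style: a raw Map hands back Object, so the caller must cast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String lookupWithCast(String key) {
        Map settings = new HashMap();
        settings.put("color", "blue");
        return (String) settings.get(key);
    }

    // Generics style: the type parameters let the compiler supply the cast.
    static String lookupWithGenerics(String key) {
        Map<String, String> settings = new HashMap<String, String>();
        settings.put("color", "blue");
        return settings.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookupWithCast("color"));
        System.out.println(lookupWithGenerics("color"));
    }
}
```

Both methods return the same value; the only difference the reader sees is where the type information is written.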

Now don’t get me wrong, I think that this is a good thing. As far as I’m concerned, casts are just wasted typing that exists only for the compiler. So getting rid of casts is great. But the problem is that to get this feature you have to use an even uglier syntax (generics) in other places. So the method this line is in might look like this:
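Something along these lines, where the method names and the "first"/"last" keys are hypothetical, and the old raw-Map version is included for comparison:

```java
import java.util.HashMap;
import java.util.Map;

class SignatureDemo {
    // Old style: the signature only says "I take a Map"; the casts live in the body.
    @SuppressWarnings("rawtypes")
    static String describeOld(Map person) {
        String first = (String) person.get("first");
        String last = (String) person.get("last");
        return first + " " + last;
    }

    // Generics style: no casts in the body, but the key and value types
    // are now part of the method's public signature.
    static String describeNew(Map<String, String> person) {
        String first = person.get("first");
        String last = person.get("last");
        return first + " " + last;
    }

    public static void main(String[] args) {
        Map<String, String> person = new HashMap<String, String>();
        person.put("first", "Ada");
        person.put("last", "Lovelace");
        System.out.println(describeOld(person));
        System.out.println(describeNew(person));
    }
}
```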

Now I will grant you that the casts used to sit in the body, where they obfuscate the code you’re trying to read, and that the type information now sits in the function definition, so if you use the values more than once you’ve traded _N_ casts for one definition. But the flip side is that you have now moved an implementation detail into the public interface. In the old style this method would have said “I take a Map and do something with it” but the new one says “I take a Map of String to String and do something with it.” More on this in a later post.

So I have an idea for a language change that I wish I could have proposed (and gotten implemented 😉) years ago. It is a simple change, but I think it would have made a major positive difference to the language.

Why not remove the need for casts when downcasting?

Take a basic assignment in which a value of declared type Y is assigned to a variable of type X.

The types X and Y are related in one of four ways (assuming neither is a primitive):

  1. X and Y are the same type.
  2. X is not directly related to Y.
  3. X is a superclass or interface of Y (upcasting).
  4. X is a subclass of Y (downcasting).

In case #1 no cast is necessary. In case #2 the statement is invalid and adding a cast will not change that. In case #3 no cast is necessary. In case #4 a cast is needed to compile and a checkcast byte-code is generated to validate the assignment at runtime. Of the four, downcasting is the only case we can change, and it accounts for most of the casts in a pre-generics program.
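The four cases can be sketched in a few lines. This is only an illustration, using String, Object, and Integer as stand-ins for X and Y; the class and variable names are made up.

```java
class AssignmentCasesDemo {
    static String run() {
        Object anObject = "hello";   // declared type Object, runtime type String
        String aString = "world";

        // Case 1: same type, no cast needed.
        String same = aString;

        // Case 2: unrelated classes; rejected at compile time, cast or no cast.
        // Integer bad = (Integer) aString;   // error: incompatible types

        // Case 3: upcast (String to Object), no cast needed.
        Object up = aString;

        // Case 4: downcast (Object to String); the cast is required to compile
        // and a checkcast instruction verifies it at run time.
        String down = (String) anObject;

        return same + " " + up + " " + down;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```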

The interesting thing about the downcast case is that the compiler checks whether a cast could work (are the types related?) but still emits the checkcast byte-code. So my question is: if the compiler already checks that the assignment is valid and generates a checkcast byte-code anyway, why can’t we just tell the compiler “if you see a valid downcast, just emit a checkcast byte-code without requiring the cast”?

To make this concrete I want to change this

into this:
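Here is a sketch of what I have in mind, with an invented raw Map and key. The second form is the proposal; it does not compile in real Java, so it appears only as a comment.

```java
import java.util.HashMap;
import java.util.Map;

class ProposalDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String currentJava() {
        Map names = new HashMap();
        names.put("id42", "Alice");

        // Today: the downcast from Object to String must be spelled out.
        String name = (String) names.get("id42");

        // Proposed (hypothetical, NOT valid Java): same runtime check,
        // but no cast in the source.
        // String name = names.get("id42");

        return name;
    }

    public static void main(String[] args) {
        System.out.println(currentJava());
    }
}
```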

If the compiler allowed this, the system would be no less safe, since the checkcast byte-code would still be emitted to check at runtime, but the code would be cleaner. As far as I can tell the only reason the cast is required (here I’m putting on my Mind Reading Through Time Helmet) is that someone said something like “If we tell the programmer that they are downcasting and that this is a dicey operation, they will check their code to verify that this downcast is valid at this point and then add a cast to tell the compiler that they validated the code.” This sounds good but let’s be honest: do you really check your code to verify the cast in all cases, or do you mostly just add the cast because the compiler wants it?

I thought so. But even if you check your code now, there is no way to prevent a later change from doing the wrong thing, and if you take a parameter there is no way to prevent someone else from messing you up. This is why the compiler emits the checkcast byte-code even though you added a cast. It is too easy to make mistakes.
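A small sketch of that failure mode, with hypothetical method names: the cast compiles fine, but it is the runtime check, not the cast in the source, that catches a caller who passes the wrong thing.

```java
class CheckcastDemo {
    // The cast compiles because Object and String are related, but only the
    // run-time check knows what the caller actually passed in.
    static int lengthOf(Object value) {
        String text = (String) value;   // checkcast happens here
        return text.length();
    }

    // Reports whether lengthOf rejects the argument with a ClassCastException.
    static boolean failsAtRuntime(Object value) {
        try {
            lengthOf(value);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthOf("hello"));
        System.out.println(failsAtRuntime(Integer.valueOf(42)));
    }
}
```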

So given that most of the time the cast is just added to shut up the compiler, and that the system still adds a byte-code to prevent mistakes, why not just get rid of the cast requirement? Just think how simple the collection classes would be to use. They already do not need a cast to put an object in, and with this change they would not need a cast to get the value out, and the generated code would be identical to the code currently generated (see Part I).

Casts are intrusive, ugly, and unnecessary for understanding the code; they require duplicate typing and do not make the code any safer. If this change had been made to the language in any version before 1.5, I believe that there would have been a lot less demand for generics.
