Saturday, September 30, 2006

wxWorld

I'm pleased to be able to link to a new article: Build cross-platform GUIs using wxWidgets, available on the IBM developerWorks site. The original title was "wxWorld", and it's a quick look at wxPython, the wxWidgets toolkit, and some of the other wxWidgets language bindings. I had some fun digging through the different language tools trying to create short wx programs in each. Hope you like it.

Friday, September 22, 2006

Tips-First for Test-First

Of all the exciting ideas and revelations that came from Kent Beck's original XP book, Test-First Programming has been the one that most significantly affected the way I work on a day-to-day basis.

I love programming test-first. It's a great way to take a large, amorphous task and solve it piece by piece. It's also a nice morale boost -- "Hey, I know that my code does nine things. Let's go for ten..."

Here are a bunch of things I wish somebody had told me about test-first programming.

Unit Testing is not All Testing

So, after I started doing test-first, I walked around for about six months all, "my code is perfect because I wrote tests". My smugness came crashing down when testers found some bugs. My code was better because I wrote tests, but it turned out I had made some dicey assumptions about the inputs, so my tests passed but were still incorrect. Test-first does not give you a complete test suite. You still need to do acceptance testing, and you still need to do GUI testing where appropriate.

However, you can still automate a very large percentage of acceptance tests. The more you can automate the tests, the more they'll be run, and the happier you'll be.

Test-First is a structure for writing good code at least as much as it's a means for verifying code

Code that has been written in a test-first style tends to have certain qualities. Small methods, loosely coupled. Small objects, loosely coupled. Code that causes side effects (such as output) tends to be separated from code that doesn't. These are all side effects of what's easy to test -- it's easy to write small methods in a tight test-first loop. And dependencies between methods or objects make tests harder to write.

As it happens, those exact qualities -- tight cohesion and loose coupling -- are exactly what characterizes the best software architectures. My experience is that I wind up with much better code architectures from test-first than I do when I try to guess the design before I start. (Which is not to say that a little bit of pre-design can't be helpful, just that it can be overdone.)

Test-first is better suited for some things than others

That doesn't mean that you shouldn't try, of course. Test-first is vital in cases where you know the input and output, but not the process. It's also critical in cases where your program can be incorrect in subtle ways. It's somewhat less important for things that will visibly or loudly break. GUIs are a challenge because GUI layouts tend to change in ways that can break unit tests; GUI behaviors are more stable and easier to test. Again, though, you should try to have automated coverage of even those areas that weren't developed test-first.

Trust the process. Look ahead on tests, not implementation.

It works. The tight process is: Write a test. Run the test so that it fails. Make the simplest fix that will pass the test. Run the test so that it passes. Refactor. Keep to that tight loop. Resist the temptation to guess about what you'll need to pass the next test. What I usually do is put a list of the tests I'm going to write in comments in my test class -- that's my lookahead plan, and keeps me from forgetting something. But the design I do in my actual code comes during the refactor step, which is where I see duplication and abstraction.
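Here's a minimal sketch of what that looks like for me in Python (the class and the test plan are invented for illustration; the names echo the Java example later in this post):

import unittest

class Person(object):
    # just enough implementation to pass the first test on the plan
    def __init__(self, first, middle, last):
        self.first, self.middle, self.last = first, middle, last

    def name_length(self):
        return len(self.first) + len(self.middle) + len(self.last)

class NameLengthTest(unittest.TestCase):
    # the lookahead plan lives here as comments, so nothing gets forgotten:
    #   - simple three-part name
    #   - empty middle name
    #   - name longer than the display field
    def test_simple_three_part_name(self):
        self.assertEqual(15, Person("noel", "david", "rappin").name_length())

if __name__ == "__main__":
    unittest.main()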

The earlier you start the better off you are.

It gets increasingly hard to convert a code base to test-first the longer you wait. I've even had 20 line classes that needed significant refactoring to unit test (mostly because output was intertwined with functionality -- the code was better after the refactoring). Test-first is a good place to start, anyway -- pick something to test, and go.

Treat your tests like code and refactor them

Pretty much every test-first guide gives sort of a perfunctory nod to, "oh yeah, keep your test code clean." But I think this could stand a little more attention. For one thing, your unit tests are critically important to your ability to deliver quality code -- and they have no tests of their own. The cleaner your tests are, the better you'll be able to see issues with the tests themselves.

One thing that has worked nicely for me is extracting sets of assertions into custom assert methods. If you are continually making the same five assertions to check validity of your objects, throw them into an assert_instance method of some kind. Another common case is making the same assertion over a range of values -- move the for loop to your custom assertion and pass the range endpoints in.

There are two big advantages to doing this consistently. The first is that it's easier to see what's going on from one line of assert_person(expected_name, expected_addr) than from five lines of assert_equals. The second is that it ensures that you actually do make all the assertions every time. Hey, everybody slacks, and test-first is about making the test setups as quickly as possible. If you can trigger all umpteen tests on your class with one method call, you're more likely to do the whole set every time, rather than just picking one or two at random each time.
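Here's a quick sketch of the range-of-values case (the shipping rule and all the names here are made up for the example):

import unittest

def shipping_cost(weight):
    # hypothetical rule: flat rate up to 5 kg, per-kilo charge above that
    if weight <= 5:
        return 4.00
    return 4.00 + (weight - 5) * 1.50

class ShippingTest(unittest.TestCase):
    # custom assertion: the for loop lives here, callers pass the endpoints
    def assert_flat_rate_between(self, low, high):
        for weight in range(low, high + 1):
            self.assertEqual(4.00, shipping_cost(weight))

    def test_flat_rate_band(self):
        self.assert_flat_rate_between(1, 5)

    def test_per_kilo_above_band(self):
        self.assertEqual(5.50, shipping_cost(6))

if __name__ == "__main__":
    unittest.main()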

Don't reuse instance variables

This is another refactoring issue. It's tempting as you add new unit test cases to do something like this:
Person p = new Person("noel", "david", "rappin");
assertEquals(15, p.nameLength());
p.setLastName("flintstone");
assertEquals(19, p.nameLength());
p.setFirstName("pebbles")
assertEquals(22, p.nameLength());
p.setFirstName("betty");
assertEquals(22, p.nameLength());

The last test fails -- quick, what's it testing? Okay, now we have to trace the life of that instance variable all the way back up. It's hard to read, and prone to dangerous errors. You should never reuse an instance variable like this in a unit test -- every assertion should, where it's at all feasible, be completely distinct:

void assertNameLength(int expected, String first, String middle, String last) {
    Person p = new Person(first, middle, last);
    assertEquals(expected, p.nameLength());
}

void testNameLength() {
    assertNameLength(15, "noel", "david", "rappin");
    assertNameLength(19, "noel", "david", "flintstone");
    assertNameLength(22, "pebbles", "david", "flintstone");
    assertNameLength(22, "betty", "david", "flintstone");
}
Now when the last test fails, you can actually see what's going on.

Avoid tautologies

The scariest issue you can have with tests is a test that passes when it should fail, allowing you to continue blithely along, ignorant of a bug you should have already caught. There will come a day, for instance, when you will forget to put any assertions in a test. There are a couple of things you can do to make tautologies less likely.
  1. Follow the process. The process says each test has to fail before you add code. Adding tests that you already know will pass can easily lead to writing a test that will never fail.

  2. If you have constants for text or numerical values in your code, don't reuse those in the tests -- use the literal or create a separate constant in the test (there's a small sketch of this right after the list).

  3. Be careful with Mock Objects. Try not to test the things that you are explicitly inserting in the Mock when it's created.
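To make the second point concrete, here's a tiny sketch (the constant and the function are invented for the example):

import unittest

MAX_LOGIN_ATTEMPTS = 3   # hypothetical production constant

def attempts_allowed():
    return MAX_LOGIN_ATTEMPTS

class LoginPolicyTest(unittest.TestCase):
    def test_tautology(self):
        # reuses the production constant: if someone changes it to 300,
        # this still passes, so it can never catch the mistake
        self.assertEqual(MAX_LOGIN_ATTEMPTS, attempts_allowed())

    def test_better(self):
        # repeat the literal in the test instead
        self.assertEqual(3, attempts_allowed())

if __name__ == "__main__":
    unittest.main()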

Mock Objects Rule

Mock Objects are the missing link in helping you test all the things that are traditionally hard to unit test: databases, GUIs, web servers... Anywhere your code depends on an external system or person, the Mock can stand in, pretend to be that third party, and let you send and receive data in a testable way. Mock Object packages exist for a variety of languages, and using a package will save you time and effort on your tests.
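You don't even need a package to get the flavor of it. Here's a hand-rolled sketch (the mail server and the notification code are both invented for the example):

import unittest

class FakeMailServer(object):
    # stands in for a real SMTP connection: records what was sent
    # instead of talking to the network
    def __init__(self):
        self.sent = []

    def send(self, address, body):
        self.sent.append((address, body))

def notify_overdue(accounts, mail_server):
    # the code under test only depends on the mail server's interface
    for account in accounts:
        if account["overdue"]:
            mail_server.send(account["email"], "Your account is overdue")

class NotifyTest(unittest.TestCase):
    def test_only_overdue_accounts_get_mail(self):
        server = FakeMailServer()
        accounts = [{"email": "a@example.com", "overdue": True},
                    {"email": "b@example.com", "overdue": False}]
        notify_overdue(accounts, server)
        self.assertEqual([("a@example.com", "Your account is overdue")],
                         server.sent)

if __name__ == "__main__":
    unittest.main()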

Hope that helps -- go out and test something.

Friday, September 15, 2006

Why, Johnny, Why?

We interrupt Python week to bring you the following alternative programming rant. I know, Python week has sort of gone up in smoke. But one of our mottoes here is "Whenever a Hugo Award winning SF novelist writes a hyperbolic screed about BASIC in the public schools, 10 Print Hello will be there". As a motto, it's not very catchy. We're working on it.

As soon as I mentioned "Hugo Award winner", "BASIC" and "hyperbolic screed" many of you were probably able to quickly deduce that the author is David Brin, here on Salon wondering what happened to BASIC (you'll have to watch an ad to view the article):

Only, quietly and without fanfare, or even any comment or notice by software pundits, we have drifted into a situation where almost none of the millions of personal computers in America offers a line-programming language simple enough for kids to pick up fast. Not even the one that was a software lingua franca on nearly all machines, only a decade or so ago. And that is not only a problem for Ben and me; it is a problem for our nation and civilization.
Does he have your attention yet? He'll equate the loss of BASIC to an act of war later in the essay. Brin seems to be making three separate points:
  • BASIC used to be available on all computers that kids touch, and that is no longer the case.
This is obviously true, but a bit less dramatic than Brin implies.

Brin implies that BASIC was available for kids for a long time, and only recently disappeared. Actually, that's close to backwards. BASIC was generally available for less than a decade, and has been fading ever since. Even though Brin says a couple of times that "20 years ago" millions of kids could have used BASIC, the fact is that by 1986 BASIC was on its way out as a standard part of home computers.

Although it was invented in the early '60s, BASIC is associated most strongly with the late '70s and early '80s generation of computers. This market would eventually be dominated by the Apple ][ line, but earlier included things like the TRS-80 and the Texas Instruments machines. (I even remember the Bally TV-based game system having a BASIC module circa 1981 or so.) In any case, neither the Mac (introduced in 1984) nor the IBM PC and clones featured BASIC to that same degree. By 1985, the idea that all computers would have BASIC was much less strong, although the Apple ][ and BASIC instruction lingered in schools for a few years after that, which is why Brin's son is still seeing it in math textbooks, although that says more about the textbook industry than anything else. Data point -- my middle school had an Apple ][ computer lab in 1984 or '85. In my high school, a couple of years later, the computers were already Macs and PCs without BASIC -- we learned Pascal.
  • There is much less of a sense that kids should be taught programming (particularly in BASIC) than there may have been in the mid '80s.
Largely true. Another data point -- my younger relatives, about ten years later, were no longer taught BASIC, nor were the kids at the elementary schools I studied at around the same time. By then, computers had migrated into the actual classroom and were being used as reference tools and also for what I guess you'd have to call multimedia authoring. I do think this is a loss. But at the same time, I've always kind of suspected that the reason elementary school kids were taught BASIC in the early '80s was that the schools were kind of floundering around for something to do with the shiny new computers. My read of the educational literature from the time I was studying educational technology was that this eventually petered out because it was not clear that teaching programming was helping students become better general learners. To be fair, that's not exactly the point Brin is arguing, but it does suggest that, perhaps, losing BASIC is not the end of civilization.
  • BASIC has some magical set of properties (Brin calls it "line-programming") that makes it uniquely suitable for introducing programming concepts. Because of this, we're losing an entire generation of tinkerers. This, I don't get at all.
So, Brin goes on at some length about how the computer people he's talked to don't seem to feel that it's a problem that BASIC isn't around anymore, while he, Brin, knows better. (Anybody familiar with Brin's essay entitled "The Dogma of Otherness" should catch at least a hint of irony.) Anyway, while Brin does acknowledge that BASIC has a lot of limitations, he goes on at length about line-programming and how important it is.

I'm not completely sure what Brin means by line-programming. Google didn't give me a relevant link. I'm going to assume that it has something to do with the fact that BASIC circa the Applesoft years was coded on a line-by-line basis. Brin suggests that this was an experience that modern languages can't give:
The "scripting" languages that serve as entry-level tools for today's aspiring programmers -- like Perl and Python -- don't make this experience accessible to students in the same way. BASIC was close enough to the algorithm that you could actually follow the reasoning of the machine as it made choices and followed logical pathways. Repeating this point for emphasis: You could even do it all yourself, following along on paper, for a few iterations, verifying that the dot on the screen was moving by the sheer power of mathematics, alone. Wow! (Indeed, I would love to sit with my son and write "Pong" from scratch. The rule set -- the math -- is so simple. And he would never see the world the same, no matter how many higher-level languages he then moves on to.)
I confess, I have no idea what he's saying here, though I do like the scare-quotes around "scripting". I'm kind of trying to get my head around the idea that you can't program a mathematical algorithm in Python and follow the reasoning of the machine. I mean, there are higher level constructs, but if we're talking about loops and conditionals for 5th grade math problems... I think Python would be pretty easy to follow and would look a lot like the logical structure of the algorithm. Python even has an interactive interpreter so you could type the code in line-by-line if you wanted. That actually could be pretty cool in a learning-math setting. And you could even track it with pencil and paper.
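Something like this toy loop, say -- the numbers are mine, not Brin's -- is small enough to type one line at a time at the interpreter prompt and check with pencil and paper:

x, dx = 0, 3
for step in range(5):
    x = x + dx          # the "dot" moves three units each step
    print step, x       # prints 0 3, 1 6, 2 9, 3 12, 4 15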

It is true, though, that it was much more conceptually simple to do simple graphics in Applesoft BASIC than in Pascal. That's not the language's fault, and it's not because you write Python in a full text editor. It's because modern programming languages sit on an operating system that mediates access to the drawing controls, and Applesoft BASIC didn't. It wouldn't be hard to come up with a Python package that emulated the draw controls of Applesoft BASIC (which were on the order of "Make that pixel blue. Now make that one red").
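In fact, a first cut is only a few lines on top of Tkinter (everything here -- the plot function, the scale, the colors -- is just my sketch, not an actual package):

import Tkinter

SCALE = 10   # each chunky low-res "pixel" becomes a 10x10 square

root = Tkinter.Tk()
canvas = Tkinter.Canvas(root, width=40 * SCALE, height=40 * SCALE, bg="black")
canvas.pack()

def plot(x, y, color):
    # "make that pixel blue" -- one fat rectangle per low-res pixel
    canvas.create_rectangle(x * SCALE, y * SCALE,
                            (x + 1) * SCALE, (y + 1) * SCALE,
                            fill=color, outline=color)

plot(5, 5, "blue")
plot(6, 5, "red")
root.mainloop()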

I think what I object to is the implication that this is somehow a difficult time to be learning to program, that it's harder now to get into programming. That's totally wrong -- it's a fabulous time to be learning to program. Brin says his son is now learning C++, so I'll assume he's interested and motivated. Twenty years ago, yeah, he would have had BASIC. And that's it. Unless you wanted to pay some money. As for seeing any examples of what a real program looked like, forget it.

These days... Well, a Mac OS X box ships with what, a half-dozen or so programming languages right out of the box, with who knows how many more available for free. Want to see basic algorithm code for free? It's there. All kinds of code, complex and simple, is available online. There's a whole industry of programming books, something I would have devoured as a kid. We've traded The One True Teaching Language for many different languages. An elementary school teacher explaining "20 goto 10" is now a publishing empire, plus the internet. Coloring individual dots on a screen is now building a web page, or a web application, or a sprite animation. Even an elementary school child who is motivated can do more and understand more about computers than I would have dreamed of in 1985. Have we lost something? Maybe. Have we gained something? Oh yes.

Wednesday, September 13, 2006

Obligatory Apple Post

Since what every tech blog reader needs is another round up of Apple's Showtime event...

Overall, nice incremental stuff, perhaps a little disappointing to those who were expecting a radical new mainline iPod.

  • New Shuffle: This is getting close to being jewelry; it's actually starting to look like a cufflink to me.

  • New Nano: Smaller, bigger drive, better battery life, colors. Solid incremental upgrade.

  • New iPod: A very small incremental upgrade. It's very irritating that the new search function is not being backported to existing video iPods.

    Here's my bold prediction -- the widescreen, touch panel iPod will never be released as it is currently rumored. My guess is that there will be practical troubles either keeping the screen clean and scratch free when people are putting their grubby mitts right on it and/or having a touch screen wheel UI that is actually usable. More likely the former, but I think the latter might be a problem too. Just wait for me to be wrong!

  • iTunes UI Enhancements: Mostly quite nice. The album art view and the cover flow view are both very pretty. They don't really mesh with how I use iTunes, but I can see where they will be useful. Points to Apple for actually buying CoverFlow rather than just ripping it off. The library enhancements and the iPod view are nice (although the iPod prefs aren't that much nicer than they were in the Preferences screen). As usual, the UI is in a state of flux; it's a bit more subdued, but some elements (like the tabs in the iPod screen) just look weird.

  • iTunes Movies: Best seen as the beginning of a long-term strategy rather than a goal in itself. Although if I were the manufacturer of a portable DVD player, I'd start getting a little worried. Still, I can't imagine genuinely sitting through a full-length widescreen movie on an iPod screen unless I was trapped on an airplane.

  • iTunes Games: Kind of underplayed, but in some ways the most interesting potential. If, that is, Apple releases an SDK such that the open source hackers of the world can get their hands on it. That could be very, very interesting... UPDATE: Looks like Apple has "no plans to offer an SDK". Bummer. Developers should be agitating for this.

  • iTV: I get where they are going with this, and I really want to like it, but absent DVR capability (even if it was on the networked Mac) I really can't see this being a major player.
Python tie-in (what with it being Python Week and all...): nothing really. I do have a pretty cool and obsessive Python script that creates fancy random playlists for iTunes/iPod, doing things like randomly picking albums, playing multiple songs in a row from the same artist, playing songs in specific genres more often, and so on. I'll post it here someday, but the code really needs a good sweep first.

Tuesday, September 12, 2006

Re-refactoring

Here's a little riff inspired by one of the examples in Martin Fowler's book Refactoring, which is another great programming book that deserves an appreciation post one of these days. This post was actually spawned by code that I've read; I only later realized that Fowler has a similar example. Thing is, I don't think Fowler went far enough in this case.

Here's the example. (page 243 for those of you playing the home game). But, since it's Python Week here, I'll translate to Python.

if isSpecialDeal():
    total = price * 0.95
    send()
else:
    total = price * 0.98
    send()
Fowler correctly notices the duplicate call to send(), and refactors to:
if isSpecialDeal():
    total = price * 0.95
else:
    total = price * 0.98
send()
This is fine as far as it goes, but as I see it, there's a second duplication in this snippet -- the formula for calculating the total. I'd rather see something like this (using the new Python 2.5 ternary syntax):
multiplier = (0.95 if isSpecialDeal() else 0.98)
total = price * multiplier
send()
There are a couple of advantages to this last snippet. We've separated the calculation of the total from the act of gathering the data for that calculation. This makes the actual formula for the total clearer, and allows you to easily spawn the multiplier getter off to its own method if it gets more complicated. Plus we've removed more duplication, and I think made the code structure match the logical structure of the calculation a little bit better.
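For instance, if the discount rule grows beyond a single condition, the lookup can move into its own function without touching the total calculation (just a sketch, still using the made-up names from the snippet above):

def discount_multiplier():
    return 0.95 if isSpecialDeal() else 0.98

total = price * discount_multiplier()
send()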

This is a simple example, and you could quibble with it. The general idea of separating conditional logic from calculations is a solid way to clean up code and make it easier to maintain in the long run.

Before I leave... I'm not sold on the syntax for the Python ternary yet. I'm told that the syntax was chosen over the perhaps more consistent if cond then x else y end because it was felt that in most use cases you'd have a clear preferred choice and a clear alternate choice, and putting the preferred choice before the conditional emphasized that. I don't know if that matches how I'd use a ternary. Although I guess it's reminiscent of listcomp syntax. I need to use it in real programs to know for sure.

Monday, September 11, 2006

Some 411 of my own

Saturday, Robin and I had the pleasure of being interviewed by Ron Stephens for the excellent Python 411 podcast. I think this was the first time I've ever been interviewed for anything, and while it's always fun to talk about Python, the book, and me (not necessarily in that order), it does take some getting used to.

Anyway, I do mention this here blog during the interview, and while I don't want to talk about the actual interview in detail until I hear the edited version, it did occur to me that I might want to have some actual Python content on board in case anybody comes by to check the place out.

Python content all week, then, starting with today's Things I Love About Python:

  1. Whitespace. I know that I said just a few short days ago that I wasn't going to redefend Python's whitespace blocks. That was then, this is now. Now, I'm just going to gush over them. I love using whitespace to mark blocks. It enforces what I'd be doing anyway. It encourages consistent style, with the result that other people's Python code is actually intelligible. It encourages short methods and shallow nesting, both good habits, and it lets you get about 10-25% more code on a page. Nobody is ever going to have an Obfuscated Python contest (okay, I looked it up... somebody has, but they realize it's a joke).

  2. List Comprehensions. One of my favorite syntax features in any language. So concise and yet so clear... Try to describe the following any more clearly in any language, programming or not.

    [x.name for x in students if x.grade > 90]

    Okay, they do sometimes blow up if you make them too complicated.

  3. First Class Functions. It's easier to pass around named function objects in Python than in just about any language not named Lisp. This is a very good thing. It enables all kinds of elegant abstractions (especially since classes and instances can all be made callable). Over time, using Python has made all my coding move to a more functional style that's easier to test, verify, and maintain.
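A quick sketch of what I mean (the functions and the callable class here are invented for the example):

def shout(text):
    return text.upper() + "!"

def whisper(text):
    return text.lower() + "..."

class Repeater(object):
    # instances are callable too, so they can go anywhere a function goes
    def __init__(self, times):
        self.times = times

    def __call__(self, text):
        return " ".join([text] * self.times)

def announce(message, renderers):
    return [render(message) for render in renderers]

print announce("hello", [shout, whisper, Repeater(3)])
# ['HELLO!', 'hello...', 'hello hello hello']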
Of course, not everything in Python needs to be elegant and abstracted. Last night I had a problem. I wanted to download all episodes of a popular podcast that does not have an easily accessible archive page. Rather than walk through months of postings, I decided to write a script that would take advantage of the pages' naming conventions, loop to find the shows for given days, find the downloadable URL and download it, then add the file to iTunes. Final code, just under 60 lines. Elapsed time, under 45 minutes from start to first download, including downloading, installing, and using a new library (Beautiful Soup, which is a nice HTML parser). The point is not that I'm particularly good at this (the script is a little sloppy and doesn't handle error conditions well), but that Python is particularly good at this. Plus, it was fun -- no fighting with compilers and interpreters, and I was able to find support for the libraries when I needed it.
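The core of the idea compresses to something like this (the archive URL pattern is made up here, the iTunes step is left out, and there's no error handling -- a sketch, not the actual script):

import urllib
from BeautifulSoup import BeautifulSoup

ARCHIVE = "http://example.com/podcast/archive/%04d-%02d-%02d.html"

def download_show(year, month, day):
    page = urllib.urlopen(ARCHIVE % (year, month, day)).read()
    soup = BeautifulSoup(page)
    for link in soup.findAll("a"):
        # BeautifulSoup 3 keeps attributes as a list of (name, value) pairs
        href = dict(link.attrs).get("href", "")
        if href.endswith(".mp3"):
            filename = href.split("/")[-1]
            urllib.urlretrieve(href, filename)
            print "saved", filename

for day in range(1, 31):
    download_show(2006, 9, day)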

Saturday, September 09, 2006

Hybrids In Bloom

A couple of big stories in the wide world of scripting languages running on virtual machine platforms.

  • IronPython, the .NET implementation of Python created by Jython creator Jim Hugunin, released version 1.0.

  • The two primary developers of the JRuby project, implementing a Java-based Ruby interpreter, were hired by Sun with the mandate to bring JRuby to 1.0.
Unsurprisingly, I think this is all great. Programming hybrids are a beautiful thing. The more tools the merrier, and the more ways to combine the best parts of different tools, the merrier squared.

The amazing thing about IronPython is that it benchmarks as being faster than traditional CPython, which seems sort of counter-intuitive. (I'm assuming, based on nothing at all, that there's a higher memory load, but if I'm wrong, I'm sure somebody will point that out).

One of the interesting things about JRuby is a certain shift in momentum. When Jim Hugunin created JPython, the primary goal was to be able to use existing Java libraries with Python syntax. The JRuby team (and by extension, Sun), in contrast, seem to be comparatively more interested in using existing Ruby libraries (like Campfire and Rails) on a JVM backdrop than in using existing Java classes.

For instance, JRuby does not seem to have an analogue to the Jython shortcut of converting attribute assignment (foo.bar = 3) into a Java bean setter call (foo.setBar(3)), which makes Java classes feel more Pythonic. (Again, correct me if I'm wrong; the existing tutorials don't touch on this point, and I'm basing this on a possibly out-of-date article.) (UPDATE: Somebody did correct me -- JRuby does have this style. I wonder if it also works on constructors the way Jython does. So the point below is somewhat invalidated.)

And I don't mean this as a good/bad thing, either -- it's perfectly all right for the different tools to have different priorities. It's fascinating that Ruby is now seen as bringing a host of useful tools to the Java platform; I think we would have been laughed at a few years ago for suggesting anything that strong about Jython.

Good luck and more power to everyone, I can't wait for Java on Rails...

Thursday, September 07, 2006

I/O, I/O, It's Off To Work I Go

Welcome to our program, Things I Agree With Totally And Wish I Had Said First. Our hero tonight is Tim Ottinger with his hit, "Frameworks are for the Impatient". It seems Ottinger is puzzled by a library he's trying to use:

Look, this framework is not the game Myst. I did not install this thing so that I could amuse myself for days by running around the file system trying to figure out what it is about...

In the Java world... [E]ven opening and reading a file is cause to go google the library one more time. Heaven forbid you have to manipulate dates or the like. These are small things, and should be very easy.... You shouldn't have to crack open a half-dozen US$50.00 books.

One point of "obvious" is probably worth one hundred points of "clever".
In the immortal words of Arnold Horshack, "Right you are, Mr. Kotter". Hmm. That sounds better if you imagine it in a Horshack voice. Doesn't look like much in print...

The java.io library is a particular nemesis of mine. I've been using it for what, just under a decade now, and the only way I was able to stop looking up the API every other week was to write a utility class with an API that was actually useful (you know, obscure methods like copyFile, readlines, writeToFile...).

The java.io library may actually be the purest example of the Java school of OO design, marked by principles like:
  1. Use an abstraction (streams) that looks interesting on paper, but that nobody ever uses. Ignore the abstraction (files) that is already established. When that doesn't work, add a completely new abstraction (readers/writers) that's completely non-interoperable with the first one.

  2. Make sure the user has to type a lot of words to get anything done.

  3. Make it just as easy to do obscure, complicated things as it is to do typical things, even if this means making it harder to do normal tasks. Doesn't everybody want to randomly access binary files just as often as write some text? (Actually, there are obscure things in the API that are much easier than, say, iterating over the lines in a text file.)

Don't even get me started on the nio package. Really, don't. I couldn't explain it on a bet.

Sigh. Java is too easy a target sometimes.

Wednesday, September 06, 2006

Fonts

I'm curious -- how do you set up your screen in your text editor when you are programming?

Based on people I've worked with, I seem to do two things in my setup that are unusual. I use fairly large fonts (16-18 point, if I can) and I'm aggressive about cutting off lines at 80 characters. The upshot is that I'm showing less text on the screen at a time than most programmers I know.

Good? Bad? Not sure. The 80-character habit came, I think, from a combination of Code Complete, where it's recommended on the (now outdated) premise that that's as many characters as you could print on a line; plus doing the books, where you generally do have to cut the examples off at 72 or 80 characters; plus some nasty HTML bugs where somebody tried to do a whole table in one line and there was a missing </td> in column 436 or something. I'll stretch beyond 80 characters, but I really don't like having code hang off the edge where I have to scroll right to see it.

The bigger font I think is more of an aesthetic choice (and not, say, a vision issue). I do like that it tends to focus me on one method at a time, and encourages me to keep methods short enough to stay on one screen.

A couple of years ago, I did a little internet search on fonts for programming. Among other things, I found there are people who really like programming with a 7 or 9 point font. Anyway, I picked Bitstream Vera Sans Mono (and also its open source twin, DejaVu Sans Mono) because it is a bit heavier weight than Courier, a lot prettier to look at, and specifically designed to differentiate between similar characters.

Monday, September 04, 2006

Web Apps and Language Wars

I wasn't planning on posting about either web apps or linking to Joel Spolsky again, but this language wars post is just too interesting to pass up. Besides, a jillion people have already commented on this, so what's a jillion and one?

Spolsky is riffing on what language or platform you should use for an enterprise web project. He makes a few points (note, I'm paraphrasing him here -- these are his points, not mine):

  1. There are 3 1/2 platforms that are proven to work in the enterprise web app space (Java, C#, PHP, and maybe Python).

  2. Within that group, there's no difference large enough to offset expertise, so pick the one that you know the best.

  3. Rails is not part of that group. Even though it's fun.
I suspect you know which one of those three points has gotten the most attention. Obviously the Rails people are ticked off, which I think is a combination of Spolsky taking his point too far, and Rails partisans taking his point even farther.

Look, I love Rails as much as a person can love a framework. I wish I had been smart enough to put all the pieces together myself (another post for another time...). My Rails experience has been uniformly positive. Nevertheless, if I wanted to pitch Rails for a mission-critical enterprise application, I would expect to have to justify the choice. Using Rails is still a risk relative to the others: it's still newer, people are still trying to work out optimal deployment, and it still doesn't have the library support the others have. Where I would differ from what Spolsky is saying is that I think it might be a justifiable risk even in a mission-critical enterprise application.

Scaling and library support are not the only sources of risk. There's also the risk that your code will get bogged down in a huge ball of intertangled display and logic code (PHP). Or the risk that your developer time will be slowed down enough that it delays deployment (Java). Or the risk of deploying in a system that is owned by Microsoft (guess...). Choosing one of the "nobody ever got fired for choosing X" languages is a safer choice. Which doesn't always make it the best choice.

(And yes, I know that Spolsky ends his essay by mentioning that one of his apps is written in a custom in-house version of VBScript. Red herring. He's not saying that Java, C#, and PHP are the only languages to use ever, just that they are the only languages that currently have the ecological support to be guaranteed safe in a "death before failure" scenario.)

I'd argue the following corollary: I agree that, all else being equal, expertise trumps any difference between these platforms. That's a little circular, of course, because how will you get experience without using a tool? (I know about apprenticeship as a junior member on a larger project, but it's not always feasible.) Almost every project or team spins off low-level applications -- bug trackers, vacation trackers, internal chat rooms. Things that are not high-priority, but are still useful. So, when putting those together, I think it's a good idea to range far and wide and try new things that might pay off in a future project (I almost wrote that you "have the right, no, the duty" to do that, but I thought that might be a little over the top). Me, I'm going to try out Python/Django next chance I get...

Friday, September 01, 2006

Java Closures

Here's a nice item being proposed for Java 1.7: closures in Java. On behalf of all those people who actually do create entire classes just to be able to use map and other functional styles in Java, may I say, please, please, please put this in Java. (This seems a good place to link to Joel Spolsky's wonderful programming fable "Can Your Programming Language Do This").

The proposed syntax looks like this:

public static void main(String[] args) {
    int plus2(int x) { return x+2; }
    int(int) plus2b = plus2;
    System.out.println(plus2b(2));
}
Line one of that syntax creates a closure object that takes and returns an int using what is basically Java method syntax. Line two assigns that closure to another variable using the syntax int(int) to specify the types of the signature. Line three shows that you can call the closure object as you'd expect, although notice that, unlike most Java calls, there's no receiver object specified, and it's not using an implicit this -- it's purely a function.

The proposal also specifies an alternate syntax for creating short closure objects -- I don't like this one as much:

int(int) plus2b = (int x) : x+2;

This is all nice, and I know I'd use it pretty much daily. Unfortunately, though, I wonder if the strict typing will wind up making the closures less useful than, say, Ruby blocks. I assume there'd be some way to tie this into the generics system so that methods that take blocks with different type signatures would be able to convince the compiler that everything is okay. Let's see... if I wanted to write a new collect method for List, it would be something like this.
public List<V> collect(V(T) closure) {
    List<V> result = new ArrayList<V>();
    for (T obj : iterator()) {
        result.add(closure(obj));
    }
    return result;
}

int plus2(int x) { return x+2; }
List<Integer> fred = list.collect(plus2);
Is that right? If so, that's certainly a lot better than we have now.

I have three quibbles and an enhancement.

Quibble 1: Like generics, what looks nice for int(int) is going to look a lot less pleasant when the signature is, say, OrderLineItem(Order, Product) or even better List<List<OrderLineItem>>(List<Order>, List<Customer>, List<Product>), which I could easily see as a real world case.

Quibble 2: To do this right would require including support for closures up and down the standard library -- all through the util classes, all through Swing, JDBC -- there are all sorts of places in the library that would be cleaned up by being able to take closures. I suspect that's unlikely to happen quickly.

Quibble 3: The proposal says "We are also experimenting with generalizing this to support an invocation syntax that interleaves parts of the method name and its arguments, which would allow more general user-defined control structures that look like if, if-else, do-while, and so on." I'm thinking this is more of a Smalltalk or Objective-C style? That would look odd within Java.

What I really want, though, is a method literal analogous to a class literal. Something like...

Integer(MyClass) closure = MyClass.someSillyThing.method;
MyClass obj = new MyClass();
Integer x = obj.closure(3);

or even better:

MyClass obj = new MyClass();
Integer(MyClass) closure = obj.someSillyThing.method;
Integer x = closure(3);

And yes, I realize that's basically Python bound and unbound methods. The tricky part in Java is that there might be more than one overloaded method called someSillyThing, and so I'm assuming that whatever closure object I'm creating would be able to get the right one based on the declared type (or, alternately, I suppose, dispatch properly when called). That should be doable, though. And then my Java code can look even more like Python...
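For comparison, here's roughly what the Python version of that looks like (Python 2-style bound and unbound methods; the class is just a stand-in):

class MyClass(object):
    def some_silly_thing(self, n):
        return n * 2

obj = MyClass()
bound = obj.some_silly_thing        # bound method: carries obj along with it
unbound = MyClass.some_silly_thing  # unbound method: needs an instance when called

print bound(3)          # 6
print unbound(obj, 3)   # 6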

Good stuff. I hope something like this gets in.