Wednesday, May 30, 2007

Text Auditor

I hate Microsoft Word. I feel that I came by this rightfully, after a whole book's worth of numbered lists that refused to line up, images that refused to stay put, and the truly irritating indexing interface.

For a long time I'd sort of rant about how word processing was this core user task and we still couldn't figure out the right UI for it. I'm not totally backing off that, although I've also read people ranting that text editing was solved 20 years ago by Emacs and Vi, and there's no point in looking at anything new, which achieves a level of crankiness that makes me look calm.

What occurred to me as I was thinking about writing about text entry and UI was the sheer number of different programs that I use for various kinds of word processing or text editing. Most of these are optimized for a specific set of activities, and do that particular job well enough that it's worth learning a new program in order to get that benefit.

More and more, I find myself also using tools that help me with text across applications, like generic text expansion tools, or clipboard history tools, or even monster aggregators like Quicksilver. Put that way, it sounds a little bit like the OpenDoc dream, where your spell checker would come from one place, your text editor from another, and the font manager from yet another. But OpenDoc was about components instead of applications, and looking at the list below, it's pretty clear that applications aren't going away anytime soon...

So here's a list of all the apps I can think of that I do some kind of text editing in, with some comments.

Adium/Pidgin -- IM Clients (Adium Mac side, Pidgin in Windows, plus another couple for internal work things). None of these has a ton of text editing support, although I do wind up typing a fair amount of text into them over the course of a day. Pidgin has a great name, but I really wish it had one piece of AIM functionality, namely the ability to read text in a different size than you send it.

Eclipse -- Currently it's my Windows side Ruby/Rails editor of choice, although I'll probably change my mind six more times before I settle on something. It's also my Windows Perl editor of choice, as it seems to be the only free editor that does syntax checking.

GMail/Google Docs -- Gmail, obviously, used for email, although I do sometimes use it as quick storage for text I want to be able to work on from multiple locations. You're supposed to use Google Docs for that kind of thing, and I do, sometimes. It's great for collaborating (we used it for final notes on the manuscript for the wx book, for example). But for normal use, it's just a hair too unresponsive, and the UI is just a touch awkward within the browser.

IntelliJ IDEA -- Java editor of choice. Still the most fully featured of the big Java IDEs, although the others are catching up. A big memory hog, though.

jEdit -- Windows side default text editor, especially for Python. Been using it for years. Very nice feature set, I'm used to the controls, decent syntax coding for the languages I use.

Jer's Novel Writer -- I don't use this as much as I'd like, but I'm putting it in here as a great example of tuning a text entry program to a specific use. In this case, writing fiction. Nice features include easy annotating and the ability to specify separate display and printing formats -- very useful for keeping your print document in manuscript format.

MarsEdit -- Well, I'm writing this post in it. Mac side editor of choice for blog posts. Why did I spend $25 on this when its functionality is pretty easily replicated from TextMate? Good question. It's got a very clean UI, and I like that it previews while you type. And if I need to do something fancy, it interfaces with TextMate.

MS Word -- Windows word processor when forced to use it. Word actually isn't that bad if you stay within the confines of a four-page office memo. Once you add styles, though, it's a mess. Actually, I don't think anybody's really solved the UI for a WYSIWYG styles system -- which is one reason why I use HTML, Textile, or Markdown where I can. I feel like I have much greater control over the styling if it's all in text.

NeoOffice -- Mac side word processor of choice, although I rarely use it for the reasons given above. I've gone through all the free or free-to-try alternative word processors on the Mac, and NeoOffice is the one I keep coming back to (especially since the newest version, which fixed some performance issues).

Outlook -- Windows email program. Possibly the worst styled text email editor around. I think it's the only one that, if you insert a paragraph in the middle of a quote from a previous email, keeps the quote formatting on your new text. This is, shall we say, not helpful.

PowerPoint -- I guess it's a text editor, of sorts. This is here mostly for a mini-rant... I know that everybody says that effective presentations should have minimal text on each slide. I even agree. But... in many environments, including lectures and a lot of corporate situations, the slides become a de facto deliverable to people who are unable to make the original meeting. If you don't have enough text for those people to follow along, they will get angry...

TextMate -- Mac side programmer editor of choice. I tried a number of different editors when I switched to Mac and realized that jEdit didn't really play nicely with OS X. I find the TextMate UI to be unusually clean, and it's the most powerful and extensible text editor not written in Lisp.

VoodooPad Lite -- Mac desktop Wiki application. I'd probably use this more if my daily work were Mac side. It's very nicely done.

That's a lot of applications. I'm pretty sure I missed some, at that. I think my point is that specialization is the way to solve the text UI dilemma. Still, needing 13+ apps to solve my basic text entry needs.... seems less than optimal somehow.

Monday, May 28, 2007

A Program Note

This is for the two or three of you that are subscribed to this blog via RSS feed -- I've just added a FeedBurner feed, and if it's not too much trouble, it'd be great if you could switch over to it at:

The "Subscribe" link in the sidebar will also work.


Friday, May 25, 2007

An Agile Musing

Of course, since I muse in an agile way, I reserve the right to change my mind based on future developments...

Software development usually takes place in a complex environment where your goal can change quickly. In general, there are two ways to deal with a complex environment. One is to try to anticipate, in advance, every possible permutation you might need to deal with, and the other is to manage your environment with the flexibility to respond to new challenges with minimum effort. Software is just a specific case where that challenge plays out.

Of course, being software engineers, we've given these approaches names. They are nicely defined by James Bach:

agile methodology: a system of methods designed to minimize the cost of change, especially in a context where important facts emerge late in a project, or where we are obliged to adapt to important uncontrolled factors.

A non-agile methodology, by comparison, is one that seeks to achieve efficiency by anticipating, controlling, or eliminating variables so as to eliminate the need for changes and associated costs of changing.

It seems worth pointing out that, at least in theory, there's no reason why you can't both make a reasonable effort to find potential changes up front, and still make your working environment as flexible as possible.

In practice, of course, this is kind of difficult to do well. There's a logistics problem -- the kinds of things you do when you're trying to control change up front generally involve creating a lot of models and documentation and the like. This is exactly the kind of thing that works against minimizing the cost of change. But it seems to me that a group determined to do some up front analysis of what will largely be an agile process could manage to work around that.

The bigger problem is an issue of mindset -- if you think that you can solve all your design problems up front, then you start to look at any change to the design or implementation with suspicion. At that point, any change is, by definition, a mistake of some kind. Which leads to suspicion of all changes and can mean all kinds of tracking and procedural overhead that pushes up the cost of change. The sort of organic, bottom-up design that you get from test-first and tight iterations doesn't fit in this model at all.

Although I should say that my experience, from trying to get groups to adopt XP or agile practices at several different companies, is that nearly all programmers and managers agree that automated tests are a good thing. Most programmers are at least willing to consider the idea of test-first, although getting somebody to do it consistently is tough (hey, it's tough even for somebody who's totally bought into it).

The resistance I do get tends to be around the idea that you can start programming with an incomplete design without it leading to disaster.

Waterfall-style up front design can be incredibly seductive. You're brainstorming with other smart people, you're solving all your problems before they even come up. It certainly feels like you are doing something vitally important to the success of your project. Even if you know that many of the decisions will later be revised in implementation, it still feels good to have that crisp UML diagram. You can't have bugs in a UML diagram. The issues that do get solved in design are assumed to justify the time cost of the design, because the assumption is that the time cost of later improvements will be much larger (which is true, in part, because of the amount of design work).

In contrast, starting coding with incomplete information can feel risky, especially the further you get from the actual implementation team. Any issue that gets changed later can be blamed on the relative lack of design, true or not, and the time savings from not doing as much design can be invisible. It's very hard to let go of the idea that all your design problems can be solved up front. But once you are willing to allow that some problems can only be solved in the moment, you're much better equipped to deal with the inevitability of change.

Wednesday, May 16, 2007

State of the Art

O'Reilly Radar has been analyzing the state of the computer book market on a quarterly basis for a couple of years now.

This link is to a drill-down into the Q1 2007 results for programming languages. The information is of some passing interest to me, both as an author and as a language geek.

Things that jumped out at me.

  • Ruby is up a ton, and is now selling more than Perl and Python combined. There are now as many Ruby books in print as there are Python books.

  • Two of the top five books are Rails, with number one being PragProg's Agile Web Development with Rails. Two of the remaining five are Head First books.

  • But Javascript sells almost twice as many books as Ruby, and is also growing quite a bit. I still see the occasional thought that Javascript will escape the browser and become the language of the future. It certainly seems to be gathering the base.

  • The numbers have just under 10,000 Python books sold in Q1 2007. I don't have the wxPython book numbers for that quarter yet -- results to authors are delayed a quarter, but that means that the wx book was roughly 10% of the overall Python market. I can't decide if that surprises me or not.

  • It's funny how your personal perspective skews expectations. In the abstract, I know there have to be a lot of Microsoft shops out there, but I've never worked at one, so it's surprising to see C# and .NET so high on the list. I'm currently surrounded by a lot of IT Perl experts, so it surprises me a little to see Perl so low, even though it probably shouldn't.

Sunday, May 13, 2007

from internet import *

Three posts that caught my eye today.

Ruby School

Gregory Brown over on O'Reilly net has an article about using Ruby in Computer Science courses, at least in later algorithm classes. It's not a bad argument, but I think it'd be more convincing if the Ruby example was a little cleaner and easier to read compared to the pseudo-code.

Let's see... The last time I had to care about this issue was about eight years ago when my grad institution was going through a somewhat controversial revamp of the CS curriculum. The fight, as always, is between the theorists and the pragmatists. The theorists want to teach a lot of "pure" CS up front -- Turing machines, big "O" analysis, computational theory, that kind of thing. The pragmatists want the students to be able to get jobs.

You should know that I spent the better part of three years as part of a group working with an object-oriented class that we taught in Squeak Smalltalk. Lovely language, to be sure, but we had to spend part of every course explaining to some nervous students why we weren't teaching them C++ or Java...

At the time, the initial CS classes were moving to Java, with some relief. This is because a) nobody wanted to inflict C or C++ on unsuspecting new CS majors, and b) the previous most common language, Pascal, was woefully obsolete. Java is reasonably straightforward to teach and is actually used in real programs, both high points.

Personally, I think you can make a pretty nice case for a scripting language like Python or Ruby in the initial CS class. They are both pretty easy to get started with, and the syntax is clean enough that algorithms are easy to visualize (which was Brown's original point). In Python you can do it without introducing objects (which most CS1 classes didn't do eight years ago, don't know if that's changed). In Ruby it's easy to teach meta-programming.


Paul Julius of ThoughtWorks writes about how CruiseControl can save you $12,535 per broken test, with the money coming from the difference between the cost of fixing a bug immediately versus not catching the bug until integration testing.

I dunno. I love continuous integration, and would shout about it from the rooftops if they'd let me on the roof, and that number still sounds a bit more "look what I can do with numbers" than "look what I can do with Continuous Integration". But then I'm skeptical of nearly every numerical analysis of programming productivity.

Plus, Marshmallows

Over at Roughly Drafted, Daniel Eran goes on about the smooth, harmonious relationship between Apple and Sun. Naturally, I want to talk about one of his sidebars...

The name of Apple's Mac OS X frameworks was even named Cocoa in part to associate it with Sun's Java. The other reason was that Apple already owned the Cocoa trademark, having using it earlier for a children's programing environment.

You know, I've always wondered about that. The original Cocoa was a project that was being worked on in Apple's Advanced Technology Group the summer I interned there, plus it got some buzz in Educational Technology circles for a while. Internally, it was called KidSim, but the name was changed to Cocoa when it was being prepared for release. Java was programming for grown-ups, so Cocoa was programming for kids. It seems like Apple isn't really using that connotation of the name anymore.

The project (now called Stagecast Creator) is a graphical rule-based programming language, something like a cellular automata program. The user specifies an initial arrangement of sprites on the screen, then specifies how that arrangement should change in the new time slice. Complex programs could be created with almost no typing (although, like all such programs, you still had to use drawing tools to create your own sprites -- that was still hard). Stagecast still seems to be around, although it's been ages since I tried the software. It was pretty cool, though.
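
For the flavor of that "before and after" style of rule, here's a toy sketch in Java. To be clear, this is my own illustration and not how Stagecast is actually built: the one-row world, the single "move right into an empty cell" rule, and every name in it (SpriteRuleSketch, step, and so on) are assumptions made up for the example.

```java
import java.util.Arrays;

// A toy rule-based world on a one-row grid. This is only a sketch of the
// before/after idea, not how Stagecast Creator is actually implemented.
class SpriteRuleSketch {

    static final char EMPTY = '.';
    static final char SPRITE = '>';

    // One rule: "a sprite with an empty cell to its right moves into it".
    // Scanning right to left means each sprite moves at most once per tick.
    static char[] step(char[] world) {
        // Work on a copy so the caller's arrangement is left untouched.
        char[] next = Arrays.copyOf(world, world.length);
        for (int i = next.length - 2; i >= 0; i--) {
            if (next[i] == SPRITE && next[i + 1] == EMPTY) {
                next[i] = EMPTY;
                next[i + 1] = SPRITE;
            }
        }
        return next;
    }

    public static void main(String[] args) {
        // The "initial arrangement" the user would normally draw by hand.
        char[] world = ">.>..".toCharArray();
        for (int tick = 0; tick < 3; tick++) {
            System.out.println(new String(world));
            world = step(world);
        }
    }
}
```

The real thing let you draw the before and after pictures instead of typing them, which was the whole point, but the loop of "match a pattern, apply the change, advance the clock" is the same basic shape.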

Thursday, May 03, 2007

Comment On This

So the other day I'm looking over some code, and I see this... (slightly paraphrased to protect the innocent -- in the original, the declaration and the getter were, of course, separated.)

/**
 * The name of the user
 */
private String m_userName;

/**
 * @return The name of the user
 */
public String getUserName() {
    return m_userName;
}

And I thought, "I really hope some of that was generated by the editor"

And then I thought, "This is why other languages make fun of Java"

And I finally ended up with, "Most of what beginning programmers are taught about comments is useless". At least that was true when I was in school.

For the moment, I'll put aside the Java issue, to talk more generally about comments.

First though -- we're agreed the example is absurd, right? In that it repeats that the user name is, in fact, the user name five times. Which is at least four more than strictly necessary (I realize that Java more or less forces some of this duplication).

I realize that this is hardly the worst coding sin you can commit, but when I look at code like that I do think that either the programmer isn't quite sure which parts of his code are most important or that the person is rigidly following an overly formal standard. Neither of which is all that flattering.

About 95% of the time, it seems like you either get no comments, or the kind of useless comments in the example. Given the choice, I'd rather have no comments -- it's less distracting, and you can see more code at once.

However, you can do comments effectively. Here's what I think, which should not ever be confused with what I actually do.

  • The best commenting is clear and accurate names for your variables, functions, and classes.

  • The "No Duplication" rule applies to comments at least as much as it does to code.

  • Possibly more, since no compiler is ever going to catch if your comments fall out of sync with your code.

  • Therefore, under normal circumstances, comments should avoid repeating what the code does.

  • However, limitations on input or output values that are not apparent from the code should be included in comments. If the user name was guaranteed to be under ten characters because of the underlying database, that would be a useful comment.

  • Rationales for choosing a particular implementation are often good comments, as is the code's place within the larger program.

  • If you find yourself commenting inline within a long method to explain what the next section does, odds are you'd be better off extracting that to a well-named method and skipping the comment.

  • There is a cost to commenting -- it takes time to do well, and it can be distracting. It also limits the amount of code you can read at once.

  • The most obvious exception to all of the above is when you are in a situation where people will read your comments without ready access to the source code. Writing an API or a framework, for example. In that case, most of the issue about duplication doesn't apply and you need to be descriptive for the benefit of users who will only see your JavaDoc or RDoc.

  • However, none of that excuses writing a one line comment for getUserName().
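
To make a couple of those bullets concrete, here's a rough sketch in Java. Everything in it is made up for illustration: the UserRecord class, the validatedName helper, the "legacy accounts table", and the ten-character limit (borrowed from the hypothetical database example above). The idea is that the one substantial comment tells you something the code can't, and the validation step gets a name instead of an inline comment.

```java
// Hypothetical example: the class, the helper, and the 10-character limit
// are assumptions invented for illustration, not code from a real project.
class UserRecord {

    // The legacy accounts table stores names in a CHAR(10) column, so
    // anything longer would be cut off on save. Reject it up front instead.
    static final int MAX_USER_NAME_LENGTH = 10;

    private final String userName;

    UserRecord(String rawName) {
        this.userName = validatedName(rawName);
    }

    // No comment here: the name already says everything a comment would.
    String getUserName() {
        return userName;
    }

    // Extracted so the constructor doesn't need an inline
    // "trim and check the length" comment; the method name covers it.
    private static String validatedName(String rawName) {
        String trimmed = rawName.trim();
        if (trimmed.isEmpty() || trimmed.length() > MAX_USER_NAME_LENGTH) {
            throw new IllegalArgumentException(
                "User name must be 1-" + MAX_USER_NAME_LENGTH + " characters: " + rawName);
        }
        return trimmed;
    }

    public static void main(String[] args) {
        System.out.println(new UserRecord("  alice ").getUserName());
    }
}
```

Note that the comment on the constant is about where the limit comes from, not what the constant is. If the database column changes, that's the one line somebody will actually be glad to find.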