Notebooks

Tom Van Vleck

I once read the diary of Shakespeare's doctor and son-in-law, John Hall. He discovered the cure for scurvy: a kind of grass called scurvy grass that had vitamin C in it. He discovered this by giving his patients all kinds of garbage to eat when they were sick, and keeping notes about what worked. Think of some past software projects. We might have had to eat some garbage, but do we have any notes of who tried what?

The most important tool a software engineer should start with is a notebook. Write down what really happened, which combinations you tested, and why you made each decision. Facts like these are often lost completely within a few days or even hours, and the cost of doing without them can be waste and failure.

Sometimes the most important reader of your notebook is you, a month or a year later. It's surprising how often we have to learn the same lesson again, because we forgot what we learned.

Hall named the disease after its cure. That is, before his work, people didn't even know where that disease stopped and another began. How things are named is arbitrary: many diseases are named after their symptoms; plants and animals are named after their discoverers, or what they eat, or how they look; and so on. The important part is that things we want to treat separately have different names. If the names have an underlying regularity that helps us deal with the things they name, or predict the existence of other things we should look at, then the names help us do our job.

For example, consider the names of errors in IBM OS/360. Modules produce error messages beginning with strings like IGG202I. People used to complain about this, and say, "Why not name it something like 'bad_token'?" Well, the advantage of the alphanumeric code is that it carries more information: an experienced person can tell that the error code came from the PL/I compiler, and that it's one of many possible errors that compiler could have produced in other circumstances. Given 'bad_token', on the other hand, you can't deduce any of this.
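To make the point concrete, here is a minimal sketch of how a systematic code can be taken apart mechanically while a free-form name cannot. The layout it assumes (three letters, some digits, one trailing letter) is inferred only from the sample string above, and the lookup table simply repeats the example in the text; neither is taken from IBM documentation. The names MessageCode, parse_code, describe, and EXAMPLE_COMPONENTS are made up for illustration.

```python
import re
from typing import Dict, NamedTuple, Optional


class MessageCode(NamedTuple):
    prefix: str  # leading letters, assumed to identify the producing module
    number: int  # serial number within that module
    suffix: str  # trailing letter, kept as-is and not interpreted here


# Assumed layout: three capital letters, one or more digits, one capital letter.
_CODE_RE = re.compile(r"([A-Z]{3})(\d+)([A-Z])")

# Hypothetical table: repeats the essay's example, not verified against IBM docs.
EXAMPLE_COMPONENTS: Dict[str, str] = {"IGG": "PL/I compiler"}


def parse_code(code: str) -> Optional[MessageCode]:
    """Split a systematic code into its parts, or return None if it has no such structure."""
    m = _CODE_RE.fullmatch(code)
    if m is None:
        return None
    return MessageCode(m.group(1), int(m.group(2)), m.group(3))


def describe(code: str, components: Dict[str, str]) -> str:
    """Say what can be deduced from the name alone."""
    parsed = parse_code(code)
    if parsed is None:
        return f"{code}: nothing can be deduced from the name"
    origin = components.get(parsed.prefix, "unknown component")
    return f"{code}: message {parsed.number} from {origin}"


if __name__ == "__main__":
    print(describe("IGG202I", EXAMPLE_COMPONENTS))
    print(describe("bad_token", EXAMPLE_COMPONENTS))
```

The regularity is what does the work here: the same few lines handle every code that follows the scheme, including ones nobody has seen yet, which is the sense in which a systematic name helps predict the existence of other things to look at.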

(There's a downside to using systematic names, and that's when we start treating the names themselves as important, and spending whole meetings arguing about which name to give something. This is wasted motion for sure. An even bigger danger is "reasoning from the name to the properties of the thing named" (is there a rhetorical term for this?), the swiftest road to confusion I know.)