Published in RISKS Volume 17 : Issue 18 Thursday 15 June 1995
Ward Anderson at ACTC just reported an interesting crash on Multics (10.2) at ACTC -- Collection 1 initialization discovered that I became 45 years old Tuesday past, an event which was extremely unlikely, and crashed the system before the clock did damage to the file system, or so it feared.
The code in scs_and_clock_init.pl1 is perfectly clear - the time
"06/06/95 18:31 est Tuesday"
is hard-coded in, in characters, with the comment that it is "Bernard S. Greenberg's 45th birthday". It has been there for twenty years in plain text visible to anyone reading the code! (I loved to read code in my day, especially initialization - perhaps I was the last?)
Maybe Tom Van Vleck remembers, but it is extremely likely that twenty years ago at CISL our operator at the time for the nth and last time forgot to set the clock, or set it poorly, and damaged the file system (which looks quite askance at "back to the future" jaunts), and Tom and I said "This has to end. We have to put a gullibility check in the clock init code", and I did this. Probably saved a lot of file system damage over the years. If I had it to do over again, I'd do it over again! This code did the right thing!
At 25, I could not imagine I'd ever be 45, let alone that scs_and_clock_init.pl1 would be there along with me! Somehow, though, 65 doesn't seem that far away any more...
As Ward said, this is a real Multics story.
As I recall, the test was put in not due to the operator entering the wrong date, but after a time when hardware problems caused a totally screwy clock value. The clock "picked a bit" and jumped way forward, and operations shut down instead of crashing.
(Multics used the clock setting to seed the unique ID generator. Thus the clock must never run backwards, because two segments with the same UID would cause segment control to malfunction. We put in code to note the "last time seen" in the RPV label and to refuse to boot if the clock appeared to run backwards. Forward jumps wouldn't hurt the file system but of course then you would have to have a backward jump to correct it.)
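The backward-clock rule above can be sketched roughly as follows. This is an illustration in Python, not the Multics code: the names `check_not_backwards` and `make_uid`, the exception type, and the microsecond-count UID format are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)  # arbitrary epoch for the sketch


class ClockWentBackwards(Exception):
    """Raised when the proposed clock value precedes the last recorded one."""


def check_not_backwards(proposed: datetime, last_seen: datetime) -> datetime:
    """Refuse a clock value earlier than the 'last time seen' record.

    In the story that record lives in the RPV label; here it is just an argument.
    """
    if proposed < last_seen:
        raise ClockWentBackwards(
            f"clock {proposed.isoformat()} is earlier than last seen {last_seen.isoformat()}"
        )
    return proposed


def make_uid(now: datetime) -> int:
    """Hypothetical clock-derived UID: whole microseconds since EPOCH."""
    return (now - EPOCH) // timedelta(microseconds=1)
```

Because the UID is derived purely from the clock, a clock that revisits an old value would hand out an old UID again, which is why the check refuses to proceed rather than trying to repair anything.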
I think I remember watching Angelo Grieco, an expert Multics system operator, look at the calendar in April and type in 05 for the month, requiring us to crash the system and run the salvager on all directories in order to correct the bad times.
During NSS development, Bernie and I tried to design some checks to prevent these problems, for instance a "clock has jumped forward > 3 days since last known time, is this really right?" message, and could not agree on the value of 3. For example, some sites shut down over Xmas/NewYears, so imagine that the operator comes in Jan 2 with a hangover, boots the system, types wrong year, system fusses, he doesn't read the question, knows the answer is YES. Asking "are you really really sure?" is no help. Asking "add the digits of year, month, day & type the sum" is no help either, though it does keep the answer from being scripted.
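A minimal sketch of the forward-jump check and the digit-sum challenge described above, in Python. The function names and the way the operator's answer is passed in are assumptions for illustration; the 3-day limit is the disputed value from the text, not a recommendation.

```python
from datetime import datetime, timedelta

MAX_FORWARD_JUMP = timedelta(days=3)  # the value nobody could agree on


def needs_confirmation(proposed: datetime, last_known: datetime,
                       limit: timedelta = MAX_FORWARD_JUMP) -> bool:
    """True when the clock has jumped forward by more than the limit."""
    return proposed - last_known > limit


def digit_sum_challenge(proposed: datetime) -> int:
    """Sum of the digits of year, month and day of the proposed date."""
    return sum(int(ch) for ch in proposed.strftime("%Y%m%d"))


def confirm_forward_jump(proposed: datetime, last_known: datetime,
                         operator_answer: int) -> bool:
    """Accept small jumps outright; large ones only with the right digit sum."""
    if not needs_confirmation(proposed, last_known):
        return True
    return operator_answer == digit_sum_challenge(proposed)
```

Note that the challenge is computed from the date the operator typed, so an operator who typed the wrong year will happily sum the wrong digits. That matches the point made above: the question cannot be answered by a canned YES, but it does not catch the mistake.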
The 645 hardware clock cost probably a million dollars: it had a crystal in an oven, fancy stuff for the day; even the 6180 clock in the SCU was very expensive. But none of these clocks kept time when the system was powered off. We asked for a $3.00 digital clock chip in the console (these were just becoming available in the mid-70s) and were told that by the time LISD engineering finished with it, it would cost millions. We ended up putting "gullibility checks" (I think I first heard this term from Don Widrig in the 60s) in the code in an attempt to screen out obviously wrong values.
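A gullibility check of this kind can be as simple as a fixed plausibility window. The sketch below is in Python and is not the code in scs_and_clock_init.pl1: the ceiling is the hard-coded time quoted earlier (with the time zone ignored for simplicity), while the floor value and the name `is_plausible` are assumptions made for the example.

```python
from datetime import datetime

# Ceiling taken from the quoted hard-coded time, "06/06/95 18:31 est"
# (treated as a naive datetime here; the time zone is ignored).
CEILING = datetime(1995, 6, 6, 18, 31)

# Hypothetical floor: some date safely before the check was written.
FLOOR = datetime(1975, 1, 1)


def is_plausible(clock_value: datetime,
                 floor: datetime = FLOOR,
                 ceiling: datetime = CEILING) -> bool:
    """Screen out obviously wrong clock values that fall outside the window."""
    return floor <= clock_value <= ceiling
```

The weakness of a fixed ceiling is the one the story turns on: once real time passes it, every correct clock value is rejected too.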