26 Nov 1995

Message Coordinator


Tom Van Vleck

MIT Multics Machine room, bldg 39

Dave Jordan & Roger Roach at the 6180 console, MIT building 39, third floor, mid-1970s. Photo by THVV.

We can learn from the problems we had building Multics as well as the successes. Here is one such story from the early 70s.

Multics boots up by initializing the supervisor and then patching the OS tables to make it appear that one process is running. This process, the initializer process, calls out to ring 4 and runs the answering service, the code that listens to idle terminals for logins.

In the 60s, the initializer process also listened for operator commands. Operator communication and log messages came out on the initializer terminal. The other daemon processes such as backup were logged in manually by the operators at system startup time, each one on its own terminal. We had a row of TTY37s in the machine room with signs on them saying which process was which. Small sites didn't want to have one terminal per daemon; some daemons never said much, and terminals were expensive and operationally clumsy. Big sites found that the output speed of the initializer and daemon terminals could be a bottleneck, and wanted to filter and redistribute output.

In the early 70s, I wrote two pieces of software. The first was an event-driven, multi-threaded program called the message coordinator, which could handle many operator terminals and distribute messages among them according to routing tables. The second was a terminal DIM (device interface module) called mc_tty_, which replaced tty_ for the daemons and connected its input and output to the message coordinator via a shared segment. A small site could run all the daemons on a single terminal; a big one could spread the messages from many processes over many terminals. Operator commands could be entered from any message coordinator terminal, subject to access controls.
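As a rough illustration of table-driven routing, here is a minimal Python sketch. It is not the original Multics code, and it leaves out the shared segment, the event handling, and access control. The class name, the terminal names, and the "io" daemon are all hypothetical; only the idea of mapping message sources to terminals through a routing table comes from the description above.

```python
from collections import defaultdict


class MessageCoordinator:
    """Toy router: sends each source's messages to the terminals listed for it.

    Hypothetical illustration only; not the original implementation.
    """

    def __init__(self, routing_table, default_terminal="console"):
        # routing_table maps a source (daemon) name to a list of terminal names.
        self.routing_table = routing_table
        self.default_terminal = default_terminal
        # Lines delivered to each terminal, in arrival order.
        self.terminals = defaultdict(list)

    def route(self, source, text):
        # Sources missing from the table fall back to the default terminal.
        targets = self.routing_table.get(source, [self.default_terminal])
        for terminal in targets:
            self.terminals[terminal].append(f"{source}: {text}")
        return targets


# A small site: every daemon shares one terminal.
small_site = MessageCoordinator({"backup": ["console"], "io": ["console"]})
small_site.route("backup", "dump started")
small_site.route("io", "printer out of paper")

# A big site: the same messages are spread over several terminals.
big_site = MessageCoordinator({"backup": ["tty_a"], "io": ["tty_b", "tty_c"]})
big_site.route("backup", "dump started")
big_site.route("io", "printer out of paper")
```

With the first table, both messages land on the single "console" terminal. With the second, the same two calls deliver the backup message to one terminal and the "io" message to two others, without any change to the daemons themselves.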

The message coordinator worked just fine in testing, but when we installed it at the MIT site, it couldn't keep up with the message traffic under full load. The queue built up and up, and the operators couldn't control the system. We had to crash and pull the message coordinator out until we could rework it to be much more efficient.
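The arithmetic behind this kind of failure is simple, and a few lines of Python make it concrete. The sketch below is a made-up model, not a measurement of the real system: the function name and the message rates are assumptions chosen only to show how a queue behaves when messages arrive faster than they are handled.

```python
def backlog_over_time(arrivals_per_tick, handled_per_tick, ticks):
    """Return the queue length after each tick for fixed arrival and service rates."""
    backlog = 0
    history = []
    for _ in range(ticks):
        backlog += arrivals_per_tick
        # The coordinator can remove at most handled_per_tick messages per tick.
        backlog -= min(backlog, handled_per_tick)
        history.append(backlog)
    return history


# Light, test-like load: arrivals stay below capacity, so the queue drains every tick.
light = backlog_over_time(arrivals_per_tick=3, handled_per_tick=5, ticks=10)

# Heavy load: arrivals exceed capacity, so the queue grows without bound.
heavy = backlog_over_time(arrivals_per_tick=8, handled_per_tick=5, ticks=10)
```

Under the light load the backlog never rises above zero, so nothing looks wrong. Under the heavy load it grows by the difference between the two rates on every tick, and it keeps growing for as long as the load lasts.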

I learned two useful lessons. First, worry about performance. I hadn't even thought about the possibility of this failure mode. Second, test under realistic load.

Message Coordinator source at MIT.


Multics initializer cartoon by Angus Macdonald