Today my scanner would not respond. An error message said that another program
had the scanner in use. I closed every program I thought might be using the
scanner, but nothing changed. The error message
still maintained that the scanner was in use, and advised me to reboot my
computer. It occurred to me that I might just
reboot the scanner, and when I did, the error went away and I was once again
able to use it.
This is one of the countless times we know how to make a problem go away without actually knowing what the problem is. I often think that most computer problems
have very specific causes, like the old cartoon about the mainframe dying every
night at a particular hour, confounding technicians, simply because the
cleaning lady unplugs it to plug in the vacuum.
It seems likely to me that my scanner was not in use at all,
but that some data bit had flipped to give the appearance that it was. Some technicians delight in tracking such
errors down, and are often paid handsomely for their efforts. Still, I wonder why application code doesn’t
report errors more reliably and with greater accuracy.
In the old days, technicians pored over “core dump”
reports which left a trail of… well, everything a computer was experiencing at
the time of a system crash. No one has time
for that anymore, so we rely more heavily on error messages to tell us what went wrong.
I feel there’s a lot of room for improvement in error
messaging, and it might be interesting to actually know what’s happening when
things go wrong.
Of course, our software
applications and operating systems are vastly more complicated than they once
were. Perhaps there’s no time to test
every conceivable interruption, and to neatly alert us to what’s wrong. But if each step in a software program
verified what it was doing, then in theory, when it was not able to perform that step,
it could report an error that would state clearly what it was unable to do. It might even suggest the cause of its
problem, identify a key antecedent to its action, or name other programs that were
interfering with its operation.
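
To make that idea concrete, here is a minimal sketch in Python of a step that checks its own precondition and, when it cannot proceed, says what it was trying to do, what probably went wrong, and who is in the way. Everything in it is invented for illustration: StepError, Scanner, locked_by, and describe_failure are hypothetical names, not part of any real scanner driver or operating system API.

```python
class StepError(Exception):
    """Raised when a step fails; records the action, a likely cause, and the holder."""

    def __init__(self, action, likely_cause, holder=None):
        self.action = action
        self.likely_cause = likely_cause
        self.holder = holder
        message = f"Unable to {action}. Likely cause: {likely_cause}."
        if holder is not None:
            message += f" Held by: {holder}."
        super().__init__(message)


class Scanner:
    """Hypothetical stand-in for a device; not a real driver interface."""

    def __init__(self):
        self.locked_by = None  # name of the program holding the device, if any

    def open(self, program):
        # Verify the precondition for this step before attempting it.
        if self.locked_by is not None:
            raise StepError(
                action=f"open the scanner for {program}",
                likely_cause="the device lock was never released",
                holder=self.locked_by,
            )
        self.locked_by = program

    def reset(self):
        # Clear stale state, much as power-cycling the device would.
        self.locked_by = None


def describe_failure(scanner, program):
    """Try to open the scanner; return the error text, or None on success."""
    try:
        scanner.open(program)
    except StepError as err:
        return str(err)
    return None
```

In this sketch, if a program named "photo_editor" holds the scanner and "scan_app" then tries to open it, describe_failure returns a message naming photo_editor rather than a bare "device in use". Calling reset() stands in for my trick of rebooting the scanner.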
People more interested in all this have surely addressed
everything I just mentioned. But I sense
that error reporting, and recovery from error, could be much more robust than it is.
Perhaps the reason error recovery isn't more robust is that it’s just easier to restart the
program.