Seems Boeing has one.
In a safety-critical system, too, but at least the nature of the bug is such that it won't happen in a normal operation cycle.
Basically: an internal counter overflows at 2^31 centiseconds (about 248 days) from last power-on, and the controller gets a wedgie.
An airliner is supposed to receive a major overhaul at intervals of rather less than 8 months, which ordinarily results in everything being power-cycled, so this isn't likely to be a problem. Still, it's the sort of thing that induces worry, given the consequences of a (synchronized!) failure.
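For anyone who wants to check the arithmetic, here is a minimal sketch in C. An overflow at 2^31 ticks is what you would expect from a signed 32-bit counter, but that is my assumption about the controller, not something I know about its firmware. The function name is made up for illustration.

```c
#include <stdint.h>

/* Centiseconds in one day: 100 per second, 86400 seconds per day. */
#define CENTISECONDS_PER_DAY (100.0 * 60.0 * 60.0 * 24.0)

/* Days of continuous uptime before a counter with `bits` usable bits,
   ticking once per centisecond, runs out of range.
   (Hypothetical helper, for illustration only.) */
static double centisecond_counter_days(unsigned bits)
{
    double ticks = 1.0;
    for (unsigned i = 0; i < bits; i++)
        ticks *= 2.0;               /* ticks = 2^bits */
    return ticks / CENTISECONDS_PER_DAY;
}
```

Calling `centisecond_counter_days(31)` gives a bit over 248 days, which squares with the "rather less than 8 months" margin.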
Remembering the evolution of AGROS...
I started with an unsigned 32-bit millisecond counter, not for actual timekeeping but for handling the sort of relative scheduling that's common in real-time systems. This wraps every 1193 hours.
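The reason a wrapping counter is fine for relative scheduling is that unsigned subtraction is modulo 2^32, so the difference between two readings is still right across a wrap, provided the real interval is shorter than the wrap period. A minimal sketch follows; the function names are hypothetical and this is not the actual AGROS code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Milliseconds elapsed from `start` to `now`. Correct even if the
   counter wrapped in between, as long as the true interval is
   shorter than 2^32 ms (about 1193 hours). */
static uint32_t elapsed_ms(uint32_t now, uint32_t start)
{
    return now - start;   /* unsigned arithmetic wraps modulo 2^32 */
}

/* True once at least `timeout_ms` milliseconds have passed since `start`. */
static bool timeout_expired(uint32_t now, uint32_t start, uint32_t timeout_ms)
{
    return elapsed_ms(now, start) >= timeout_ms;
}
```

For example, with `start` at 0xFFFFFFFB (5 ms before the wrap) and `now` at 5, `elapsed_ms` returns 10. The thing to avoid is comparing absolute values, as in `now >= deadline`, which breaks at the wrap.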
Then I started having to keep event logs, tracking cumulative runtime without regard to power cycles. For this I added an unsigned 32-bit decisecond counter, good for 13.6 years, which is all the power-on time the product in question is rated for. (Actually, the current firmware for that class of product has its own decisecond counters for power-on time and various sorts of operation time, so I should probably weed this particular artifact out of the AGROS system code, assuming AGROS itself has any future at this point.) The 32-bit decisecond thing was a compromise among log storage space (originally a rather small EEPROM), resolution, and covering the life of the product.
When I added TCP/IP, I felt the need for a sub-decisecond counter that absolutely, positively would not wrap during operation, so I added an unsigned 64-bit microsecond counter (which in practice is incremented by 1000 every millisecond). This has a range of 584 kiloyears. In most cases, it's time since last power-on, though it could also be cumulative runtime or even time since the beginning of the Common Era.
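A rough sketch of that kind of counter, assuming a periodic 1 ms tick handler. All the names here are made up for illustration, and the real code differs.

```c
#include <stdint.h>

/* Microseconds since power-on (hypothetical name). Marked volatile on
   the assumption that it is updated from a timer interrupt. */
static volatile uint64_t uptime_us;

/* Assumed to be called once per millisecond by the system tick. */
static void tick_1ms(void)
{
    uptime_us += 1000u;   /* microsecond units, millisecond granularity */
}

/* Current counter value in microseconds. */
static uint64_t uptime_get_us(void)
{
    return uptime_us;
}

/* Approximate years until the 64-bit microsecond counter wraps,
   using 365.25-day years. */
static double uptime_range_years(void)
{
    return (double)UINT64_MAX / 1e6 / (365.25 * 86400.0);
}
```

`uptime_range_years()` comes out at roughly 584,000 years. One caveat: on a 32-bit processor a 64-bit read is usually not atomic, so a real `uptime_get_us` would need to disable the tick interrupt briefly or re-read until two reads agree. The sketch leaves that out.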
But storing a 64-bit value takes up a bunch of EEPROM space, so the logs still use the old decisecond counters.
Compromises. They're everywhere.
The FAA service bulletin says that every four months they should try turning it off and back on again (insert IT Crowd joke here).
Posted by: Jeff Bell | Saturday, 02 May 2015 at 12:34