Delta Air Lines: On Second Thought, the Computer Crash Was Our Fault

After being called out by Georgia Power, the utility that delivers electric power to its Atlanta hub, Delta Air Lines finally came clean. Georgia Power had confirmed that none of its other customers in the area experienced a power outage, and Delta has now admitted that the crash of its computer network at 2:30 a.m. EDT on Monday, 8 August, had nothing whatsoever to do with the power company.

With that narrative revealed to be a canard, Delta says it has discovered the real root of the computer crash: One of its own power control modules went bad, allowing a surge that tripped circuits feeding the computer network. That network handles critical data, including reservations, boarding passes, the matching of planes with the appropriate gates, and the roster showing which crew members are staffing each flight. The network is supposed to switch over instantly to backup systems. But as tens of thousands of stranded passengers have learned over the past few days, results may vary.

“Critical systems and network equipment didn’t switch over to backups,” Delta chief operating officer Gil West said in a statement. “Other systems did. And now we’re seeing instability in these systems.” So much so that roughly 800 flights were canceled on Tuesday.

“This makes my point that backup failure is a recurring pattern,” says Robert Charette. Earlier this week, he told IEEE Spectrum, “What you’ll see in reviewing [these airline computer system failures] is recurring problems with infrastructure (i.e., power, networks, routers, servers, etc.) that seem to keep surprising the airlines. In every case I can recall, there were backup systems in place, but they failed, another recurring theme.”

In a video message posted on the airline’s website on Tuesday, Delta CEO Ed Bastian said, “This isn’t who we are.” He added, “Over the past three years, we have invested hundreds of millions of dollars in technology infrastructure upgrades and systems, including backup systems to prevent what happened yesterday.” That’s an IT project that would likely fit with the theme of last October’s IEEE Spectrum special report, “Lessons From a Decade of IT Failures.”

But according to Bastian, passengers can rest assured. “We’ll do everything we can to make certain that this never happens again,” Bastian said.

[Source: Spectrum]