It would make a great film script, if it hadn’t actually happened of course…
INT: APOLLO 11 LUNAR MODULE, July 20th, 1969
Two astronauts are intensely focused…
“Got the Earth straight out our front window.”
“Sure do. Houston, [I hope] you’re looking at our Delta-H.”
Armstrong is focussed on landing on the Moon’s surface. He has just 30,000 feet to go…
INT: HOUSTON MISSION CONTROL
Charlie Duke and backup crew members Jim Lovell and Fred Haise in Mission Control during Apollo 11’s descent.
Everyone in the room at Houston Mission Control is listening to the exchange whilst monitoring the numbers on their screens, looking for any little anomaly that could force the mission to be aborted.
INT: APOLLO 11 LUNAR MODULE
The Priority Display interface, part of the Apollo Guidance Computer, displays a 1202 alarm indicator.
“It’s a 1202. What is that? Give us a reading on the 1202 Program Alarm…”
The display and keyboard (DSKY) interface of the Apollo Guidance Computer mounted on the control panel of the command module, with the flight director attitude indicator (FDAI) above.
Neither of the astronauts had seen a 1202 program alarm in training – most of the simulations had displayed other alarms, most of which had them hitting the abort button. The 1202 Program Alarm indicated a data area overflow issue.
The Apollo Guidance Computer reboots! All jobs are cancelled regardless of priority then restarted. This happens quickly enough that no guidance or navigation data is lost. However, this issue persists as the system is still overloaded for some reason. Eventually, when the second 1202 alarm is displayed, Buzz Aldrin notices a correlation…
INT: APOLLO 11 LUNAR MODULE
“Same alarm, and it appears to come up when we have a 16/68 up.”
Getting to the Root Cause
The 16/68 code (Verb 16, Noun 68) was used to display the range to the landing site and the velocity of the Lunar Module. The command in itself didn’t place a heavy load on the computer, but with the root cause of the additional load still unknown, the extra processing needed for the 16/68 code was enough to trigger the 1202 alarm. A total of four 1202 alarms and a related 1201 alarm were produced.
Once it was clear that running the 16/68 code seemed to trigger the issue, the solution was obvious – simply ask Houston for that data instead of using the computer! Houston, in the meantime, gave Apollo 11 the GO in spite of the alarms. The navigation data was being preserved because the alarms were far enough apart. Had they been closer together, the data would have been lost.
INT: HOUSTON MISSION CONTROL
“We’re GO on that alarm.”
“Eagle, looking great. You’re GO.”
Several minutes later…
INT: APOLLO 11 LUNAR MODULE
“Houston, Tranquility Base here. The Eagle has landed.”
You probably know the story, and that it all ended very happily. So, what are the three things Federos and Apollo 11 have in common? I see some amazing similarities between what the software in the Apollo Guidance Computer was doing and what we do today with hardware many thousands of times more powerful and software so advanced it can learn on its own. Let me explain a bit more and introduce one amazing lady – Margaret Hamilton.
In August 1961, NASA issued its first contracts for the Apollo programme. They signed the Massachusetts Institute of Technology (MIT) to develop the guidance and navigation system for the Apollo spacecraft. A few days after the contract was signed, a computer programmer named Margaret Hamilton celebrated her 25th birthday. She heard about the Apollo project and jumped at the chance to be involved.
Margaret Hamilton, lead Apollo flight software engineer, in the Apollo Command Module.
The on-board computer built by the team at MIT was the world’s first portable, general-purpose digital computer. Two Apollo Guidance Computers (AGCs) were installed – one in the Command Module and one in the Lunar Module. They each weighed about 70 pounds and contained around 76 kB of memory. They used core rope memory, a type of read-only memory that was made from wires woven through magnetic cores.
Continuous Learning and Priority Displays
The software the computers ran was written by Margaret and the Software Engineering Division at MIT (by the way, Margaret is widely credited with coining the term ‘software engineering’). The team focussed on preventing errors at every stage of the design and development of the software. Every Apollo mission built upon the knowledge gained from prior missions, learning from mistakes and coming up with new solutions. The objective was to deliver error-free software. Acutely aware that the lives of the astronauts were at risk, Margaret insisted on rigorous testing. “There was no second chance. We all knew that,” as she put it. According to Hamilton, no bug has ever been found in the onboard flight software of any of the Apollo missions!
As you can imagine, trying to land on the Moon 50 years ago was not exactly a simple process to complete. There was a lot to do and a lot going on. Apollo 11 was the first mission in which the software worked in conjunction with what was called a ‘Priority Display’. This allowed the software to interrupt the astronauts in case of a critical problem and, as such, it was the very first critical event visualisation solution to be designed and used.
So, to close the Apollo story, what was the cause of the 1201 and 1202 alarms? Just after the Lunar Module entered its orbit around the Moon, the crew turned on their rendezvous radar to track the command-service module in case they needed to abort the mission. The radar was sending data to the computer, and the additional load overloaded its memory, causing the alarms. Thanks to Margaret Hamilton’s software, the computer not only alerted the astronauts to this but also went through its error detection process, rebooting, dumping less important tasks and focussing on the highest priority jobs, such as steering the descent engine and providing landing information.
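For readers who like to see the idea in code, the restart-and-shed behaviour described above can be sketched in a few lines of Python. This is only an illustration of priority-based load shedding, not the actual AGC software: the job names, priorities, load percentages and the `restart` function are all hypothetical, invented purely to show how dropping the least important work keeps the critical jobs running.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    priority: int  # higher number = more important (hypothetical scale)
    load: int      # share of capacity needed, in percent (made-up figure)


def restart(jobs, capacity):
    """Cancel everything, then re-admit jobs from the highest priority
    down, shedding any job that no longer fits in the available capacity."""
    kept, shed, used = [], [], 0
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if used + job.load <= capacity:
            kept.append(job)
            used += job.load
        else:
            shed.append(job)
    return kept, shed


# Illustrative workload only; none of these numbers come from the mission.
jobs = [
    Job("steer descent engine", priority=9, load=50),
    Job("provide landing information", priority=8, load=25),
    Job("16/68 range and velocity display", priority=2, load=20),
]

# Normal conditions: the full capacity is available.
kept_normal, shed_normal = restart(jobs, capacity=100)

# Overloaded: assume unexpected radar data takes 15% of capacity.
kept_overloaded, shed_overloaded = restart(jobs, capacity=100 - 15)

print("normal, shed:", [j.name for j in shed_normal])
print("overloaded, shed:", [j.name for j in shed_overloaded])
```

With the full capacity available, all three jobs fit. Once the assumed 15% is taken by the unexpected input, only the low-priority display job is dropped, while steering and landing information carry on.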
“The Priority Displays gave the astronauts a go/no-go decision, to land or not to land,” Hamilton has explained. “With only minutes to spare, the decision was a ‘GO’ for the landing.” The rest, as they say, is history. Now, back to those three things Federos has in common with Apollo 11…
- Data Volume and Velocity: Although on a totally different scale, the challenge of too much data was as much of a problem 50 years ago as it is today. In 1969 it caused the memory overflow and the 1201/1202 alarms. Nowadays, we commonly see the huge volume and velocity of events being produced by an ever more complex environment impacting the ability of companies to effectively manage and operate their services.
- Correlation and Root Cause Analysis: Once you have the data, you then need to understand it. Without understanding it, you cannot identify and fix the root cause issue. Today we use advanced correlation and root cause analysis techniques that leverage machine learning and automation to help us with this – 50 years ago, it was a hard-copy flight manual, a lot of human expertise and some very effective software that coped with error processing.
- Machine Learning and Intelligent Visualisation: For Apollo, the Priority Display was ground-breaking and mission-saving – it provided the very first critical event visualisation solution. We apply the same principles behind that concept and design in our solutions today by showing the user only what they need to know to understand the issue. As mentioned above, we use a lot of advanced techniques to accurately identify and present a single root cause event to the user so that they can focus on the critical and not the noise.
I’d like to add one other thing we have in common…
- Collaboration, dedication and innovation: NASA has estimated that around 400,000 men and women were involved in the Apollo programme – from astronauts to caterers, software engineers to doctors – many very smart people and all focused on the objective. Federos doesn’t quite have that number of staff, but we share the same ethos that Apollo did with our dedication and focus on delivering solutions that make a real difference to our customers.
The technology has come a very long way in the last 50 years but maybe fundamentally our objectives have remained the same. Focus on the prize and deliver!
If you would like to know how Federos solutions can help your organisation focus and deliver, please contact us to arrange a chat with one of our experts and to see a demonstration.
All photos courtesy of NASA [Public domain] Wikimedia Commons