The First Bug on Mars


In 1971, the USSR delivered the first planetary rovers to Mars. They moved on skis, and their task was to probe the surface with a rod (housing a dynamic penetrometer and a radiation densitometer) to find out whether the Martian surface was solid ground or loose dust. The first probe crashed on November 27; the second soft-landed on December 2, but its rover never made it out of the lander's "shell", so that attempt didn't count.

This article was originally published in Russian on habrahabr.ru. The original and translated versions are posted on our website with the permission of the author.

25 years later

On July 4, 1997, the U.S. Mars Pathfinder probe landed on Mars, bringing with it the Sojourner rover and the first bug.

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image1.png

Image from the sci-fi film "The Martian". The main character is carrying the Sojourner rover.

The mission was at risk, but the powerful debugging facilities of the operating system and the professionalism of the programmers back on Earth (those guys really knew their subject) allowed NASA to fix the bug quickly.

Sojourner

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image2.png

The mission's cost was relatively small — $265 million.

The rover operated for 83 sols.

The rover's name, "Sojourner", is a word found in the Bible, where it means "traveler". It was selected in an essay contest won by V. Ambroise, a 12-year-old from the U.S. state of Connecticut, and honors the abolitionist and women's rights activist Sojourner Truth.

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image3.png

Mission results:

  • 2.3 billion bits of information
  • 16,500 images taken by the lander
  • 550 images taken by the rover
  • 15 chemical analyses of rocks and soil
  • plenty of meteorological data
  • food for thought for software testers

Priority inversion

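Priority inversion occurs when a high-priority thread has to wait for a resource (for example, a mutex) held by a low-priority thread, while a medium-priority thread that does not need the resource keeps preempting the low-priority one. As a result, the most urgent thread is effectively stalled by less urgent work.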

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image4.png

The lander carried a radiation-hardened IBM RISC 6000 Single Chip (RAD6000 SC) CPU rated at 20 MIPS, with 128 Mbytes of RAM and 6 Mbytes of EEPROM. The operating system was VxWorks.

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image5.png

The rover used a 0.1 MIPS Intel 80C85 CPU with 512 Kbytes of RAM and 176 Kbytes of flash memory for solid-state storage.

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image6.png

Three tasks with different priorities waiting around on the 1553 bus.

While collecting meteorological data, the system hung and started to reset repeatedly. The engineers on Earth ran the same software on a duplicate of the spacecraft and got down to figuring out what was wrong. After 18 hours of studying detailed logs, they found the cause of the malfunction.

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image8.png

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image9.png

They only had to fix a couple of mutex flags.

How the bug was fixed

No, we did not use the vxWorks shell to change the software (although the shell is usable on the spacecraft). The process of "patching" the software on the spacecraft is a specialized process. It involves sending the differences between what you have onboard and what you want (and have on Earth) to the spacecraft. Custom software on the spacecraft (with a whole bunch of validation) modifies the onboard copy. If you want more info you can send me email.

— Glenn Reeves, team leader of Mars Pathfinder software developer team

Those interested in details were invited to email the software author at glenn.e.reeves@jpl.nasa.gov.
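The idea Reeves describes, sending only the differences and validating them onboard before the copy is modified, can be illustrated with a small sketch. This is not the actual Pathfinder patch format or tooling: the functions make_patch and apply_patch, the dictionary layout, and the use of CRC32 as the validation step are all assumptions made for the sake of the example, and the sketch only handles images of equal size.

```python
import zlib


def make_patch(old: bytes, new: bytes) -> dict:
    """Ground side (hypothetical): describe `new` as edits against `old`."""
    if len(old) != len(new):
        raise ValueError("this sketch only handles same-size images")
    edits = []
    i = 0
    while i < len(old):
        if old[i] != new[i]:
            # Collect one contiguous run of differing bytes.
            start = i
            while i < len(old) and old[i] != new[i]:
                i += 1
            edits.append((start, new[start:i]))
        else:
            i += 1
    return {
        "base_crc": zlib.crc32(old),    # which image the patch was built for
        "target_crc": zlib.crc32(new),  # what the result must look like
        "edits": edits,
    }


def apply_patch(onboard: bytes, patch: dict) -> bytes:
    """Spacecraft side (hypothetical): validate, modify a copy, validate again."""
    if zlib.crc32(onboard) != patch["base_crc"]:
        raise ValueError("onboard image is not the version the patch was built for")
    image = bytearray(onboard)
    for offset, data in patch["edits"]:
        image[offset:offset + len(data)] = data
    result = bytes(image)
    if zlib.crc32(result) != patch["target_crc"]:
        raise ValueError("patched image failed validation")
    return result


if __name__ == "__main__":
    onboard = bytes([0, 0, 7, 7, 0, 1])
    wanted = bytes([0, 1, 7, 7, 0, 0])
    patch = make_patch(onboard, wanted)
    print(patch["edits"])
    print(apply_patch(onboard, patch) == wanted)
```

Only the changed bytes and two checksums travel over the link, which matters when bandwidth is scarce, and a patch built against the wrong base image is rejected before anything onboard is touched.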

How was the patch uploaded?

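A widely circulated earlier account, which Reeves corrects in the quote above, described the upload as follows. VxWorks contains a C language interpreter that executes statements on the fly during debugging. The JPL engineers decided to launch the spacecraft with this feature still enabled. A short C program was uploaded to the spacecraft and, when interpreted, changed the value of the mutex flag for priority inheritance from false to true. Whatever the delivery mechanism, the fix came down to enabling priority inheritance on the mutex. No more system resets occurred!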
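To see why flipping the priority inheritance flag helps, here is a minimal, deterministic simulation of a preemptive fixed-priority scheduler. It is a toy model, not VxWorks code: the task names, arrival times, and workloads are invented for illustration, and each locking task is assumed to hold the shared mutex for its entire run.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int     # larger number = more urgent
    arrival: int      # tick at which the task becomes ready
    work: int         # ticks of CPU time the task needs
    needs_lock: bool  # True if all of its work is done holding the shared mutex


# Hypothetical workload loosely modeled on the story above.
TASKS = [
    Task("low_meteo", priority=1, arrival=0, work=3, needs_lock=True),
    Task("high_bus", priority=3, arrival=1, work=1, needs_lock=True),
    Task("medium_comm", priority=2, arrival=2, work=5, needs_lock=False),
]


def simulate(tasks, priority_inheritance):
    """Return {task name: tick at which the task finished}."""
    remaining = {t.name: t.work for t in tasks}
    finished = {}
    owner = None  # task currently holding the mutex, if any
    tick = 0
    while len(finished) < len(tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.name not in finished]
        # A task that needs the mutex while another task holds it is blocked.
        blocked = [t for t in ready
                   if t.needs_lock and owner is not None and owner is not t]
        runnable = [t for t in ready if t not in blocked]

        def effective(t):
            # With inheritance, the mutex owner runs at the priority of the
            # most urgent task it is blocking.
            if priority_inheritance and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if runnable:
            current = max(runnable, key=effective)
            if current.needs_lock:
                owner = current  # take (or keep) the mutex
            remaining[current.name] -= 1
        tick += 1
        if runnable and remaining[current.name] == 0:
            finished[current.name] = tick
            if owner is current:
                owner = None  # release the mutex on completion
    return finished


if __name__ == "__main__":
    print("without inheritance:", simulate(TASKS, priority_inheritance=False))
    print("with inheritance:   ", simulate(TASKS, priority_inheritance=True))
```

Without inheritance, the medium-priority task preempts the mutex holder and the high-priority task finishes last, at tick 9. With inheritance, the holder is temporarily boosted, releases the mutex sooner, and the high-priority task finishes at tick 4. In a real system a watchdog that expects the high-priority task to run on time would fire in the first case, which matches the repeated resets described above.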

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image10.png

Glenn Reeves, the engineer who found and fixed the bug, with a Mars Pathfinder duplicate in the background

The bug was found in preflight testing on Earth but was given a low priority.

Details

https://import.viva64.com/docx/blog/0462_Bug_on_Mars/image11.png

A presentation by a Chinese expert

http://www.slideshare.net/jserv/priority-inversion-30367388

Conclusion

Glenn Reeves is very thankful to the engineers at Wind River for developing an operating system that made remote debugging possible even in emergency conditions like those that arose during the mission. Interestingly, the bug was known to the engineering team, but "deadlines" and "priorities" sometimes force mission leaders to launch a spacecraft while fully aware of unfixed "weak spots".





