Mind Matters Natural and Artificial Intelligence News and Analysis

The Backdoor to Control the Internet

We almost lost the Internet last week, but open-source developers saved the day.

Few people are aware, but over the last several days, a perceptive developer foiled a multi-year plot to install a remote backdoor into, well, the entire Internet.

Two years ago, a programmer known as Jia Tan (JiaT75) started helping out with a lesser-known compression library called xz. For those who don’t know, software today is not a monolithic entity. Every piece of software you use is built from a collection of tools, known as libraries, that make programming easier. For instance, most programmers never have to write the specifics of a sorting algorithm, because, somewhere, there is a library which performs sorting for them. This leaves programmers to focus on higher-level tasks, like actually making the software do what the users want. However, these libraries don’t come from nowhere; they require interested programmers to maintain and expand their functionality. This is a lot of work, so when new programmers volunteer to help, it is a great relief.

However, Jia Tan’s motives were less than pure. While xz is not directly utilized in a lot of software, it is pulled in by some other libraries, which are then utilized by still other programs. In particular, the popular remote login service sshd, used by system administrators everywhere, can optionally include a tie-in to a third-party library, which then also includes the xz library. And this is the configuration used by most server operating systems on the Internet. So, as Jia Tan gained credibility with the xz maintainers, Jia also indirectly gained increasing access to other parts of the operating system.
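To make the idea of an indirect dependency concrete, here is a minimal sketch in Python. The dependency map is hypothetical: the names simply mirror the description above (sshd, an unnamed third-party library, and xz) and are not read from any real system. The point is only that a program can end up depending on a library it never mentions directly.

```python
# Hypothetical dependency map: each entry lists what a program or
# library links against directly. The names are illustrative only.
DIRECT_DEPS = {
    "sshd": ["third_party_lib", "crypto_lib"],
    "third_party_lib": ["xz"],
    "crypto_lib": [],
    "xz": [],
}


def transitive_deps(name, deps=DIRECT_DEPS):
    """Return every library reachable from `name`, directly or indirectly."""
    seen = set()
    stack = list(deps.get(name, []))
    while stack:
        lib = stack.pop()
        if lib not in seen:
            seen.add(lib)
            # Follow this library's own dependencies as well.
            stack.extend(deps.get(lib, []))
    return seen


print(sorted(transitive_deps("sshd")))
```

In this toy map, xz never appears in sshd's own list, yet it shows up in the full set of things sshd depends on. That is the kind of path the attacker relied on.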

Software often includes code to test itself. This is how software maintainers prevent themselves from creating obvious bugs when making changes. The test code usually isn’t incorporated into the final software that is shipped. Jia Tan included a backdoor in a file that was ostensibly used to test the compression techniques of xz. However, a modification to the build system caused this test file, with its backdoor, to be combined into the final software, which was then deployed. The backdoor worked by replacing standard encryption/decryption functions with its own versions of these functions.
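The idea of swapping out a trusted function can be illustrated with a toy example. This is only an analogy: the real backdoor operated on compiled code, not Python, and every name below (`crypto`, `verify_signature`, `login`, `install_backdoor`, the "attacker-secret" string) is invented for illustration. The sketch shows how code that calls a function by name can be silently redirected to a replacement that behaves normally except for one special input.

```python
import types

# Hypothetical stand-in for a trusted library; not a real crypto API.
crypto = types.SimpleNamespace()


def verify_signature(message, signature):
    """Toy check: accept only the one expected signature."""
    return signature == "valid-signature-for:" + message


crypto.verify_signature = verify_signature


def login(message, signature):
    # The caller looks the function up by name each time it is called,
    # so it cannot tell whether the function has been replaced.
    return crypto.verify_signature(message, signature)


def install_backdoor():
    """Replace the trusted function with a wrapper that adds a secret bypass."""
    original = crypto.verify_signature

    def patched(message, signature):
        if signature == "attacker-secret":
            return True  # bypass for the attacker only
        return original(message, signature)  # everyone else sees normal behavior

    crypto.verify_signature = patched


print(login("hello", "attacker-secret"))  # before the swap
install_backdoor()
print(login("hello", "attacker-secret"))  # after the swap
```

Because the replacement forwards every ordinary request to the original function, normal users see nothing unusual. That is what makes this kind of change hard to spot from the outside.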

This software had already gotten merged with some test versions of several standard operating systems, so developers on the bleeding edge had already started to use it. It was discovered because a Microsoft software engineer, Andres Freund, was doing some performance tests. To do these tests, he was trying to minimize how much CPU time the rest of the tools on his system were using. He noticed that sshd was using an inordinate amount of processing power, so he started to dig into what was causing the performance hit. His analysis tools showed that sshd was spending a lot of its time in the xz library, and still further investigation revealed that the xz library had replaced some of the standard encryption and decryption functions.
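For readers curious what "digging into a performance hit" looks like, here is a rough sketch using Python's built-in profiler. It is not the tooling Freund used; the function names (`check_password`, `suspicious_helper`, `handle_login`) are made up, and the slow helper merely stands in for library code doing unexpected extra work. The sketch shows the general approach: record where the time goes, then look at which function accounts for most of it.

```python
import cProfile
import pstats


def check_password():
    """Cheap stand-in for the work a login normally needs."""
    return "secret" == "secret"


def suspicious_helper():
    """Stand-in for library code that does unexpected extra work."""
    total = 0
    for i in range(300_000):
        total += i * i
    return total


def handle_login():
    check_password()
    suspicious_helper()


profiler = cProfile.Profile()
profiler.enable()
handle_login()
profiler.disable()

# pstats keys each entry by (file, line, function name); index 2 of the
# value is the time spent inside that function itself.
own_time = {key[2]: value[2] for key, value in pstats.Stats(profiler).stats.items()}
slowest = max(own_time, key=own_time.get)
print(slowest)
```

When one helper dominates the timing of an operation that should be cheap, that is the cue to start asking why it is being called at all.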

Thankfully, this was discovered before it had major impacts. Nobody knows who Jia Tan is, and we are probably never going to find out. But this does put software developers on alert to the fact that bad actors, whether individuals or part of a state organization, are willing to play the long game to get malicious software installed on everything.

Some have used this incident to criticize open-source software, saying that it is part of the problem: if it weren’t for everyone relying on open-source software, this would not have happened. But honestly, I think the opposite is true. The only reason this was found is that the software was open source. It is because we have developers who are familiar with the code base not only of their own software, but of everything their software runs on, that we were able to find and diagnose the problem so quickly. I’ve used closed-source software on occasion, and there is zero transparency in such situations. If someone introduced malicious code into an important piece of closed-source software, nobody outside the company would know, and nobody outside would even have the ability to find out. If it were discovered, the company in charge would probably try to avoid disclosing the extent of the problem. Because xz is open source, however, there is a transparent record of everything that happened: every message, every commit, every artifact that was uploaded. Everything can be inspected and examined, the full extent of the damage can be determined, and the root cause can be debated in public so that everyone can be on guard.

In short, we almost lost the Internet last week, and only the careful eye of open-source software developers saved it.


Jonathan Bartlett

Senior Fellow, Walter Bradley Center for Natural & Artificial Intelligence
Jonathan Bartlett is a senior software R&D engineer at Specialized Bicycle Components, where he focuses on solving problems that span multiple software teams. Previously he was a senior developer at ITX, where he developed applications for companies across the US. He also offers his time as the Director of The Blyth Institute, focusing on the interplay between mathematics, philosophy, engineering, and science. Jonathan is the author of several textbooks and edited volumes which have been used by universities as diverse as Princeton and DeVry.
