Attacks that rely on setting up fake update servers are not as difficult to carry out as one might think. The main reasons are administrators’ carelessness and the absence of robust version-publishing processes, although occasionally we see astonishing attack vectors that are hard to anticipate.
GNU/Linux users often cite automated package management across distributions as one of the system’s strengths. It is particularly helpful for quickly patching vulnerabilities in the kernel and service applications. A report produced at the University of Arizona, however, shows that the package manager itself may harbor security flaws. Because of these flaws, fake mirror servers hosting packages for a given distribution can supply clients with older, vulnerable software releases. Setting up such a personal mirror server is relatively easy, as the researchers demonstrated in their paper.
According to the study’s authors — Justin Cappos, Justin Samuel, Scott Baker, and John H. Hartman — the package manager’s central role in the system means it should meet very stringent security requirements. Nevertheless, by exploiting vulnerabilities they discovered in the APT, YUM, and YaST package managers on GNU/Linux and BSD, a potential attacker can gain unrestricted access to the system: modify or delete files, create new directories, and even plant a backdoor.
Fake Servers
Although a package manager is not designed to connect to malicious servers, the verification performed by distributors when approving external mirror servers appears to be superficial. The researchers were able, with relative ease, to get their server added to the official mirror lists for Ubuntu, Fedora, openSUSE, CentOS, and Debian. They recorded connections from thousands of machines, including systems belonging to the U.S. Army and other government agencies. According to their observations, some distributions do check that the files on an additional server match the originals. However, an attacker can still configure a service to send modified content only when the request comes from selected client machines.
Attackers usually cannot tamper directly with digitally signed packages, because the installer will alert the user if a valid signature is missing. The authors note, however, that an attacker can distribute older packages with valid signatures that contain known vulnerabilities. A compromised mirror can also block security updates by serving package lists that predate the fixes, so clients never learn that newer versions exist. Over time, this increases the number of unpatched vulnerabilities on the client system, which the intruder can then exploit in a targeted manner.
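A client can defend against both tricks described above with two checks: refuse version downgrades, and refuse repository metadata whose signed timestamp is too old. The sketch below (in Python, with hypothetical names; real package managers implement this differently) illustrates the idea:

```python
from datetime import datetime, timedelta, timezone

def accept_metadata(offered_version, installed_version, signed_at,
                    max_age=timedelta(days=7)):
    """Reject downgrades and stale (frozen) repository metadata.

    offered_version / installed_version: version tuples like (2, 4, 1).
    signed_at: the signature-protected timestamp of the metadata.
    """
    if offered_version < installed_version:
        return False  # replay of an older, possibly vulnerable release
    if datetime.now(timezone.utc) - signed_at > max_age:
        return False  # the mirror appears to be withholding updates
    return True
```

The second check only works if the timestamp itself is covered by the repository signature; otherwise the mirror can simply forge it.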
Trojan Developer
If a potential attacker cannot set up their own mirror server, they may instead try to impersonate a project developer — or even become one. By presenting themselves as helpful, the attacker can gain the trust of the team responsible for releasing new software versions and, at the opportune moment, smuggle malicious code into the project. In this scenario a high level of expertise is required, because the ruse may be uncovered if a contributor notices that part of the application looks suspicious. The malicious code therefore has to be written so that it appears to be an innocent mistake.
Fortify Software, a company that provides information-security products and services, has published research on a new class of attack. Cross-Build Injection (XBI) involves inserting malicious code into an application during its build process.
It may seem at first glance identical to the historical practice of developers leaving so-called backdoors in their programs. Such backdoors enabled a developer to take control of the application, bypassing the very safeguards they had implemented (e.g., by supplying a username and password known only to them). The difference today is that, in the age of automated build systems and the decomposition of the software-production pipeline into independent stages, far more people have access to the resources that govern how an application is assembled. Collaborative workflows common in many Open Source and Free Software projects, as well as in corporations, are a prime example.
Faulty Processes?
In this modern model, application development is usually a multi-stage process. The source code is stored in a version-control repository where a team of developers works on it. The next layer involves creating versioned source packages ready for compilation. This step is sometimes automated: a system is asked to generate an archive containing the latest code from the repository. That archive (e.g., a TGZ file) is then passed to the build and integration systems.
The following step is compilation — often fully automated. Build servers take instructions from a release manager and, following directives in the appropriate files, download the source package, unpack it, apply any necessary patches, and compile a package ready for installation. These build instructions are typically kept in a repository shared with project members; examples include .spec files used by the rpmbuild tool when creating RPM packages and build.xml files consumed by Apache Ant.
Depending on infrastructure and workflow, XBI attacks can take several forms. A project insider might commit malicious code to the repository, but such changes are usually easy to trace. A more plausible — and harder to detect — strategy is to substitute the archived sources before they reach the build systems. Whether that is possible depends on the access controls in place. Trust is central here: in some projects every developer possesses, “just in case”, credentials for the file server so they can quickly remove a faulty archive, but those same privileges make source substitution much easier for an attacker.
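A simple countermeasure against archive substitution is to record a cryptographic digest of the source tarball in the build instructions and verify it before compilation begins. A minimal sketch in Python (the function name and error handling are illustrative):

```python
import hashlib

def verify_archive(path, expected_sha256):
    """Compare a downloaded source archive against a pinned digest
    recorded in the build instructions, so a swapped tarball is caught
    before it ever reaches the compiler."""
    h = hashlib.sha256()
    with open(path, 'rb') as f:
        # hash the file in chunks so large archives don't fill memory
        for chunk in iter(lambda: f.read(65536), b''):
            h.update(chunk)
    if h.hexdigest() != expected_sha256:
        raise RuntimeError("digest mismatch for " + path)
```

This is exactly what the checksums in many .spec files and distribution build recipes are for; the check is only as trustworthy as the repository holding the expected digest.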
Thompson’s Hack
The following fragment was created thanks to the suggestion of Grzegorz Antoniak (PGP ID: 78737BF9).
Ken Thompson, born in New Orleans, is one of the creators of the Unix operating system. In 1983, together with Dennis Ritchie, he received the Turing Award for contributions to operating-system theory and, specifically, for creating Unix. During the award ceremony he delivered a lecture entitled “Reflections on Trusting Trust”, in which he presented a backdoor technique now known as Thompson’s hack or the trusting trust attack. The written version of this talk is regarded as a foundational text in information security.
Thompson opened by saying he would not dwell on Unix, because most of its components had actually been written by other people, even though he often received the credit. He then highlighted Dennis Ritchie’s influence on his development as a programmer and began a three-part story about the cutest program he ever wrote.
Self-replicating Code
In the first part Thompson recalled his university days and a programming exercise: writing self-replicating code in as few lines as possible. This was not a virus but a program that printed its own source code to standard output.
He illustrated the idea with a somewhat verbose C listing that stored an almost exact copy of itself in a character array.
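Thompson’s listing was in C; the same exercise is much more compact in modern languages. The two-line Python program below is a quine: run it, and it prints an exact copy of its own source. (It contains no comments, because every byte of the file must be reproduced by the output.)

```python
s = 's = {!r}\nprint(s.format(s))'
print(s.format(s))
```

The trick is the same as in Thompson’s version: the program stores a template of itself and relies on the language (here, `repr` via the `{!r}` conversion) to re-quote the template when printing.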
The Chicken and the Egg
In the second part Thompson discussed the chicken-and-egg problem using a C compiler written in C. He displayed the portion of the compiler responsible for handling special symbols (escape sequences, to be exact) in character strings, in this case the newline character:
1 c = next();
2 if (c != '\\')
3     return(c);
4 c = next();
5 if (c == '\\')
6     return('\\');
7 if (c == 'n')
8     return('\n');
It is worth noting that to build an executable from the code containing this compiler fragment, we must start with a compiler binary that already understands escape sequences — characters whose meaning depends on the symbol that follows them (see lines 2, 5 and 6). The snippet relies on these sequences, so without that prior support in the compiling binary the code cannot be bootstrapped.
Someone could extend support for yet another escape character simply by adding another condition, e.g.:
if (c == 'v')
return('\v');
During the initial compilation, the following warning appears:
warning: unknown escape sequence '\v'
This happens because the binary compiler processing the new compiler’s source does not recognize the character literal '\v'. It therefore skips the unknown escape sequence, emitting only the byte that follows the backslash.
The fix is to modify the compiler in two steps:
- Replace the character literal with a numeric literal — for instance, return the magic value 11 via return(11); instead of return('\v');. After this build we obtain a new binary compiler that does recognize the \v escape, but the source is still not portable: the hard-coded value 11 may map to different run-time representations depending on the target architecture and type system.
- The transitional compiler can now rebuild the original source (the version that returns '\v') to produce a fully portable binary with proper support for the new escape sequence.
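The bootstrap can be modelled in a few lines of Python, treating the compiler’s knowledge of escape sequences as a lookup table (all names here are illustrative, not Thompson’s code):

```python
def unescape(text, escapes):
    """Tiny model of the compiler's string handling: resolve backslash
    escapes using the table the current 'binary' knows about."""
    out, i = [], 0
    while i < len(text):
        c = text[i]
        if c == '\\' and i + 1 < len(text):
            nxt = text[i + 1]
            # unknown escapes degrade to the raw byte, as in the warning above
            out.append(escapes.get(nxt, nxt))
            i += 2
        else:
            out.append(c)
            i += 1
    return ''.join(out)

old = {'\\': '\\', 'n': '\n'}            # existing binary: no \v support
assert unescape(r'a\vb', old) == 'avb'   # \v silently degrades to 'v'

# step 1: teach the compiler the new escape using a numeric value only
transitional = dict(old, v=chr(11))
assert unescape(r'a\vb', transitional) == 'a\vb'
# step 2: with the transitional binary in place, source that writes '\v'
# now compiles correctly, so the portable version can be rebuilt from it
```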
Trojan
In the final part of his story Thompson presented a scenario analogous to the above, but this time the example involved the compiler routine that processes a single line of source code into which someone had slipped a trojan. The malicious routine detects whether the current line contains, say, the code that enforces password checking; if so, it silently injects a few extra instructions — for example, a sub-routine that permits access with a universal password.
Planting such code directly in the compilation function would almost certainly alarm other programmers who can read the source. Thompson’s method, however, hides the payload in several phases.
There are two malicious components and two compilation stages. In the bootstrap stage the compiler contains:
- the first routine, which injects a backdoor into selected programs (altering the compiler’s behaviour);
- the second routine, which copies both itself and the first routine into any new binary produced when the compiler rebuilds itself.
The second routine is essentially the self-replicating source discussed earlier. After the first recompilation, the plaintext malware can be deleted from the repository; the binary that is now baked into the compiler will reproduce itself — along with the backdoor injector — every time the tool is rebuilt. From that moment on, the compiler remains a trojan indefinitely, unless someone uses an earlier clean release (from before the malware was introduced) to compile a fresh version. Even that may be non-trivial because of the bootstrap dependencies illustrated in the escape-sequence example.
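The two routines can be modelled in Python, with “binaries” reduced to plain strings and code generation simulated; every name below is illustrative:

```python
BACKDOOR = "# injected: also accept the universal password"
REPLICATOR = "# injected: copy both routines into any compiler built here"

def trojaned_compile(source):
    """Toy model of the trusting-trust attack. Routine 1 fires when the
    login program is compiled; routine 2 fires when the compiler compiles
    itself, re-inserting both routines into the new 'binary'."""
    binary = "[binary of] " + source          # stand-in for code generation
    if "check_password" in source:            # routine 1: backdoor the target
        binary += "\n" + BACKDOOR
    if "def compile" in source:               # routine 2: self-propagation
        binary += "\n" + REPLICATOR + "\n" + BACKDOOR
    return binary
```

The key property: feeding the clean compiler source to `trojaned_compile` still yields a trojaned binary, which is why deleting the malicious lines from the repository does not help once the bad binary is in use.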
This problem is not confined to compilers. Any programming library that must be built with a prior version of itself is vulnerable. The essence of the issue is the unexamined assumption that a critical component of a large system is always safe and therefore can be trusted. What conclusion should we draw?
No system component can be endowed with unlimited trust.
Configuration Attacks
Let’s return to more straightforward examples. Another critical point vulnerable to XBI attacks is the set of build instructions — all the scripts and configuration files for tools such as rpmbuild, ant, make, automake, and so on. Inside these files you can tell the build tool what extra actions to perform or where to download the sources.
Whether an attack succeeds depends on where those instruction files are kept. Sometimes they live alongside the application source code, but in larger projects or GNU/Linux distributions they are placed in separate version-control repositories.
Combined Attacks
A more serious scenario is a hybrid attack in which an outsider injects malicious code into a library or application. An external intruder will usually try to compromise the server that builds new releases or the system that stores the build scripts. Attacking the main source repository is less attractive, because it leaves obvious traces.
One interesting vector targets the DNS service used by the build server. If the intruder knows where source archives are fetched from, they can spoof the relevant DNS record and force the builder to download a tainted archive, one that contains a trojan or backdoor.
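One crude detection heuristic is to pin the known-good addresses of the download host and flag a build whose name suddenly resolves elsewhere. A sketch in Python, with an injectable resolver so it can be exercised offline (the addresses and names are hypothetical; the real defence is verifying the downloaded content, not the transport):

```python
import socket

PINNED = {"203.0.113.10", "203.0.113.11"}   # hypothetical known-good mirror IPs

def dns_looks_spoofed(host, pinned=PINNED, resolver=None):
    """Return True if the host resolves outside the pinned address set,
    a possible symptom of a spoofed DNS record."""
    if resolver is None:
        def resolver(h):
            return {ai[4][0] for ai in socket.getaddrinfo(h, 443)}
    return not resolver(host) <= pinned
```

Because legitimate mirrors change addresses, this check produces false positives; it complements, but never replaces, digest or signature verification of the archive itself.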
Research indicates that public projects are most at risk: many contributors have write access, and building and publishing are highly automated.
Large private projects that do not expose their build hosts to the public can still be vulnerable to combined attacks, especially where the division of responsibilities is not mirrored by strict access controls: for instance, when programmers, testers, and release engineers all have full privileges on every system.
Protection
The authors of the report propose several counter-measures for attacks on package mirrors. Users should stick to mirrors run by trusted operators. Unfortunately, the report does not specify how to verify that trust: for example, Ubuntu’s mirror list contains many little-known companies. In Poland, by contrast, several reputable universities operate mirrors; administrators should confirm that these servers meet basic security requirements.
Mirror operators should also update packages manually — and promptly — whenever new versions appear. Many of these pain points are addressed by Stork, a package manager built on the security architecture the researchers designed. Whether — and when — it will reach GNU/Linux distributions remains an open question; an active discussion is under way on lwn.net about the researchers’ findings and alternative solutions.
As for preventing malicious-code injection during development, open projects have no silver-bullet defence. In the free-software world it is hard to police contributors with bad intentions. When an attack undermines trust, the remedy is behavioural analysis across different software releases. With a compiler, for instance, you can test whether the binaries produced from identical source diverge when built with different compiler versions.
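Such cross-release comparison reduces, in practice, to building the same source with independent toolchains and diffing the resulting artifacts. A minimal Python sketch of the comparison step (function names are illustrative):

```python
import hashlib
from pathlib import Path

def build_digests(artifacts):
    """Map each artifact path to its SHA-256 digest, for comparing the
    output of two independent builds of identical source."""
    return {p: hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in artifacts}

def diverging(digests_a, digests_b):
    """List the artifacts whose digests differ between the two builds."""
    return sorted(p for p in digests_a if digests_a[p] != digests_b.get(p))
```

For this to be meaningful the builds must be reproducible (fixed timestamps, paths, and so on); otherwise every artifact diverges for harmless reasons and the signal is lost.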
See also
- Attacking the Build through Cross-Build Injection, Fortify Software
- Attacks on Package Managers, University of Arizona
- Ken Thompson, “Reflections on Trusting Trust”, Communications of the ACM 27, no. 8 (1984): 761–63.