Toolchain Necromancy: Past Mistakes Haunting ASLR

By Mathias Krause

March 4, 2024

Introduction

There was a nice blog post by Justin Miller in January that showed how a performance optimization can lead to weakened security by making brute-force memory corruption attacks more likely to succeed. If you haven’t read that article yet, please do so first to get the reasoning behind the creation of this one.

After having read the article ourselves, we gave the kernel part a closer look, seeing if we can do something about the issue. Turns out, fixing the kernel was easy, so we just went ahead and did it the “PaX way” – making use of a “secure by default” reasoning with the possibility to mark exceptions via PaX flags where needed.

To verify if our changes were working as intended, we needed some tests and PaXtest came to mind. We extended it with relevant tests, but as it turned out, there was more involved than just libc.so not getting randomized properly.

What Justin wrote about in his blog isn’t the full story. It’s not only the kernel that plays a role, but also the runtime linker and, ultimately, the linker used to create the affected binaries.

The Suspects

Before getting into the gory details, we have to introduce the involved components. Let's dig in!

Suspect #1: The Linux Kernel

In Linux v5.10, a commit was added that changed the behaviour of the in-kernel ELF loader to honour segment alignment that is bigger than the system’s regular page size. It’s a different commit than the one mentioned in Justin’s article, as this one is specifically about the ELF loader and not about adding transparent huge-page support for filesystems. The two are somewhat related, but still completely independent changes.

The reason behind the change was, of course, performance gains. Hugepage-aligned mappings can make use of huge pages for the backing memory which should perform better because it uses fewer TLB entries (only one to cover a 2MB or even 1GB range on certain x86-64 systems) and therefore also requires fewer page table walks when accessing previously untouched data. It does, however, have a bigger setup time, as the full page (either 2MB or 1GB instead of only 4kB) needs to be populated. But if the full range is accessed, it can also save the overhead of triggering and handling page faults for demand loading/populating the memory region.
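To put numbers on the TLB argument, here is a minimal Python sketch that counts how many pages (and thus, at best, TLB entries) are needed to cover a region. The helper name and the 2MB example region are ours and chosen purely for illustration; real TLB behaviour depends on the CPU.

```python
# Page sizes on x86-64 as named in the text above.
PAGE_4K = 4 * 1024
PAGE_2M = 2 * 1024 * 1024
PAGE_1G = 1024 * 1024 * 1024


def pages_needed(length: int, page_size: int) -> int:
    """Pages needed to cover `length` bytes, assuming the region
    starts on a page boundary (ceiling division)."""
    return -(-length // page_size)


region = 2 * 1024 * 1024  # hypothetical 2MB text segment
print(pages_needed(region, PAGE_4K))  # 512 regular pages
print(pages_needed(region, PAGE_2M))  # 1 huge page
```

The same ratio shows up in setup cost: populating one 2MB page touches as much memory as 512 regular page faults would.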

There have even been attempts to extend the change to cover static PIEs and the ELF program interpreter in v5.17 but these had to be reverted (here and here) in v5.18 because of regressions.

This is what the log of the original commit says:

The current ELF loading mechancism [sic] provides page-aligned mappings. This can lead to the program being loaded in a way unsuitable for file-backed, transparent huge pages when handling PIE executables.

For binaries built with increased alignment, this limits the number of bits usable for ASLR, but provides some randomization over using fixed load addresses/non-PIE binaries.

Tested by verifying program with -Wl,-z,max-page-size=0x200000 loading.

The part that says “limits the number of bits usable for ASLR” is actually the interesting one. Obviously, it was very well recognized that this change can weaken ASLR. But, apparently, back then it was assumed that this behaviour needs to be explicitly requested by passing the -Wl,-z,max-page-size=0x200000 linker flag during compilation. While that may have been true for the author’s system, it turns out, the real truth isn’t that simple, which brings us to our next suspect.

Suspect #2: binutils

Binutils is a set of binary utilities – who would have guessed? 😉 – to create, transform and analyze binary files. One of its tools is ld, the linker used by the compiler to combine multiple object files to create executable programs or shared libraries.

The previously mentioned compiler option -Wl,-z,max-page-size=0x200000 will be passed on by the compiler to the linker as -z max-page-size=0x200000 to change the segment alignment it uses. But as this option only overrides a default, what does ld do when that option is missing, as in 99.9% of the regular use cases?

To answer this question, we need to dig a little bit into binutils’ history. In particular, we'll be tracking the evolution of MAXPAGESIZE, as it implicitly defines the segment alignment for ELF files created by the linker.

We’ll limit our time-traveling excursus to only x86-64 to narrow focus and make it easier to follow, as, despite intuition, the linker changes a lot! Feel free to skip ahead to the summary if you’re not into source code archaeology.

Turning Stones

Our journey starts in the year 2000 with binutils v2.11, when commit 2be3aa031f3d added support for the x86-64 architecture, making use of a MAXPAGESIZE of only 4kB¹.

The following year, in binutils v2.12, commit 35714f2a5daa changed this to 1MB. It was no security issue back then, as the kernel ELF loader didn’t care, nor were there CPUs available to actually run x86-64 code.

The value of MAXPAGESIZE was further changed five years later for binutils v2.18 in commit f7661549c793 to its official value of 2MB, making it match the architecture’s (only, at the time) huge page size, but, at the same time, making the latent issue only worse by consuming yet another address bit.

Nothing really changed in regard to MAXPAGESIZE until 2018 in binutils v2.31, when commit f6aec96dce1d added a configure switch --enable-separate-code to enable the -z separate-code linker option by default for Linux/x86 targets, reducing MAXPAGESIZE to 4kB again. Well, for 32 bit systems it really took until v2.32, commit 872899f1efed, to fix a typo in the configure script to correctly detect 32 bit systems.

After commit 9833b7757d24 added more confusion² to the general topic regarding MAXPAGESIZE, the issue was finally fixed for good (21 years later!) in binutils v2.40 with commit a2267dbfc9e1 which unified the definition of [ELF_]MAXPAGESIZE to 0x1000, i.e. made the linker use the regular 4kB page size for all cases, independent of configuration options used to build ld.

Wrapping up the Pieces

Binutils ld versions from v2.12 on were creating ELF files with too large of a segment alignment. The rationale behind this behaviour was questionable back then, something I even noticed in the context of glibc in 2011. However, the discussion around the topic came to no final conclusion nor did the binutils developers even react to the question brought up. It was also the time when Ulrich Drepper was still in charge of glibc development with little interest in change, shutting down the discussion by force.

Versions from v2.40 on are fine, but all versions in between may cause issues, as sketched in the Test Your Systems section below.

Suspect #3: The Runtime Linker

The runtime linker ld-linux.so is our last partner in crime. It is part of libc and is tasked with finishing the loading of a dynamically-linked program, leading to the eventual execution of main().

The runtime linker, also known as the ELF interpreter, gets loaded by the Linux kernel as part of loading an ELF binary. It’s basically an additional program (which is an ELF file itself) loaded along with the segments to be loaded in the actual program. If a program has an interpreter (the regular case for dynamically-linked programs), the kernel will pass on execution not directly to the ELF program but to its interpreter instead.³

The ELF interpreter is responsible for the more heavy lifting of loading additional required libraries, like libc.so, resolving dependencies among these and calling constructors as needed. Lastly, it hands over the execution to the program’s entry point which, in turn, will call main().

During library loading, the runtime linker will have to open and parse the respective ELF files to know which parts of the file to mmap(). Since the end of 2021 with glibc v2.35, namely commit 718fdd87b1b9, glibc honours segment alignment while doing so and takes care of aligning mappings accordingly.

That change was further enhanced, still in v2.35, in commit e22a4557eb39d, to always use the maximum alignment of all loaded segments. Its commit log even references the aforementioned kernel commit.

Relevant code snippets can be found here and here.

This means, for example, that if a program has a data section that has large alignment requirements, e.g., to make it naturally fit huge pages, this alignment will propagate to the text section as well. Yet another unfortunate side effect, leading to ASLR degradation.
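The propagation effect can be modelled in a few lines of Python. This is a simplified sketch of the idea "align the whole mapping to the largest segment alignment", not glibc's actual code; the function names and the candidate address are made up for illustration.

```python
PAGE_SIZE = 0x1000  # regular 4kB page


def align_up(addr: int, align: int) -> int:
    """Round addr up to the next multiple of align (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def library_base(candidate: int, segment_aligns: list) -> int:
    """Pick a load base that honours the largest p_align of all segments."""
    max_align = max([PAGE_SIZE, *segment_aligns])
    return align_up(candidate, max_align)


candidate = 0x7F1234567000  # made-up, page-aligned candidate address

# All segments page-aligned: the candidate is used as-is.
print(hex(library_base(candidate, [0x1000, 0x1000])))    # 0x7f1234567000
# One 2MB-aligned data segment drags the text segment along with it.
print(hex(library_base(candidate, [0x1000, 0x200000])))  # 0x7f1234600000
```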

Chain of Failure

Summarizing the above, the recipe for failure is having binaries or libraries built with an old enough toolchain that was creating segments with too large of an alignment and using these with a recentish version of glibc (v2.35+) or, alternatively, a recentish kernel (v5.10+).

For such a setup, the following may be observed:

  1. The runtime linker of glibc will align the virtual memory addresses of loaded dependent libraries in accordance with their segment alignment, i.e. to a 2MB boundary.
  2. The in-kernel ELF loader will honour larger-than-page segment alignment of loaded ELF programs, aligning their virtual addresses accordingly.

Both will lead to reducing the number of randomized address bits of the corresponding mappings, in turn, weakening ASLR and making brute force attacks more likely to succeed.
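How many bits are lost follows directly from the alignment. The Python sketch below computes it; note that the baseline of 28 random bits is an assumption on our side (the actual number depends on the kernel configuration and architecture), and the helper names are ours.

```python
def log2_exact(value: int) -> int:
    """log2 of a power of two; rejects anything else."""
    if value <= 0 or value & (value - 1):
        raise ValueError("not a power of two: %r" % (value,))
    return value.bit_length() - 1


def aslr_bits_lost(align: int, page_size: int = 0x1000) -> int:
    """Address bits forced to zero beyond regular page alignment."""
    return max(0, log2_exact(align) - log2_exact(page_size))


ASSUMED_RANDOM_BITS = 28  # assumption for illustration, not from the article

lost = aslr_bits_lost(0x200000)    # 2MB segment alignment
print(lost)                        # 9
print(2 ** lost)                   # 512 times fewer possible positions
print(ASSUMED_RANDOM_BITS - lost)  # 19 bits left under the assumed baseline
```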

Binutils is Fixed! Why Bother with a Blog?

One might think, with binutils effectively⁴ being fixed 6 years ago, there should be no vulnerable binaries left. All current distributions, for sure, use a binutils that’s newer than this! However, keep in mind that the required glibc/kernel versions can come later, i.e. previously-created binaries become vulnerable.

Evaluating Debian

To evaluate this idea, I installed Debian Docker containers for various versions, starting with etch⁵, the first version to support the x86-64 architecture.

The containers were built from the official debian image or, for EOL versions, the community-supported debian/eol image, and were only extended by installing the binutils package.

From these containers we get the following distribution of binutils versions:

    etch   GNU ld version 2.17 Debian GNU/Linux
   lenny   GNU ld (GNU Binutils for Debian) 2.18.0.20080103
 squeeze   GNU ld (GNU Binutils for Debian) 2.20.1-system.20100303
  wheezy   GNU ld (GNU Binutils for Debian) 2.22
  jessie   GNU ld (GNU Binutils for Debian) 2.25
 stretch   GNU ld (GNU Binutils for Debian) 2.28
  buster   GNU ld (GNU Binutils for Debian) 2.31.1
bullseye   GNU ld (GNU Binutils for Debian) 2.35.2
bookworm   GNU ld (GNU Binutils for Debian) 2.40
     sid   GNU ld (GNU Binutils for Debian) 2.41.50.20231214

Correlating this with the information gathered from the binutils history section, we can deduce that Debian versions from etch to stretch were creating problematic ELF files. Only from buster on is a recent enough version of the linker in use and, fortunately, with a default setting for -z separate-code making it create 4kB page-aligned ELF segments.
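The correlation can be written down as a small lookup. The following Python sketch encodes only the x86-64 version boundaries from the history section above (v2.12, v2.31, v2.40); the function names and verdict strings are ours, and the version parsing is a simple heuristic for banners like the ones listed.

```python
import re


def binutils_version(banner: str) -> tuple:
    """Extract (major, minor) from an `ld --version` style banner."""
    match = re.search(r"(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no version found in %r" % (banner,))
    return int(match.group(1)), int(match.group(2))


def alignment_verdict(banner: str) -> str:
    """Classify the default x86-64 segment alignment per the history above."""
    version = binutils_version(banner)
    if version >= (2, 40):
        return "fine"
    if version >= (2, 31):
        return "depends on the -z separate-code default"
    if version >= (2, 12):
        return "too large"
    return "fine"  # v2.11 used a 4kB MAXPAGESIZE


print(alignment_verdict("GNU ld version 2.17 Debian GNU/Linux"))
print(alignment_verdict("GNU ld (GNU Binutils for Debian) 2.31.1"))
print(alignment_verdict("GNU ld (GNU Binutils for Debian) 2.40"))
```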

So, all good, nothing to fuss about?… Well, there are two additional aspects to keep in mind:

  1. Debian is a binary distribution that carries once-compiled packages over to the next release as-is if there have been no changes to the package itself which would require a rebuild.

  2. Third-parties regularly provide pre-built binary packages, explicitly built on older distributions to get a broad range of Linux distributions covered.

Looking further, I can find the following artifacts below /bin on my Debian sid based test system (check_align.sh gets covered in a later section):

minipli@nuc:~$ cat /etc/debian_version
trixie/sid
minipli@nuc:~$ uname -a
Linux nuc 6.6.15-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.6.15-2 (2024-02-04) x86_64 GNU/Linux
minipli@nuc:~$ ./check_align.sh /bin/
/bin/bdftruncate (max align=0x200000)
/bin/mkfontscale (max align=0x200000)
/bin/ucs2any (max align=0x200000)
/bin/bdftopcf (max align=0x200000)
/bin/fonttosfnt (max align=0x200000)

I have no idea what bdftruncate is supposed to do—nor what any of the others do. But, apparently, it is old code that didn’t get rebuilt in the recent past and thereby suffers from the enforced huge default alignment of older toolchains, in this case 2MB.

Let's do some testing:

minipli@nuc:~$ for i in $(seq 10); do
> sleep 0.2 | /bin/bdftruncate 0x3200 &
> sleep 0.1
> head -1 /proc/$(pidof bdftruncate)/maps
> wait
> done 2>/dev/null
55c4dc600000-55c4dc602000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
555b41000000-555b41002000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
55af25400000-55af25402000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
561cb6400000-561cb6402000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
55a709800000-55a709802000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
55bc2f400000-55bc2f402000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
55ac8d200000-55ac8d202000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
556102000000-556102002000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
55b58fe00000-55b58fe02000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate
55fa2a400000-55fa2a402000 r-xp 00000000 fe:08 2883692                    /usr/bin/bdftruncate

The above shows the output of executing bdftruncate 10 times and looking at the very first map in its memory mappings, which happens to be for the binary's executable segment. It shows that the 21 least significant bits of that mapping are zero for all test runs, as the Linux kernel ELF loader honored the alignment constraints of the binary, reducing the quantity of randomized address bits.
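The claim about the 21 zero bits can be checked mechanically. This short Python sketch parses the address ranges from the output above and counts the trailing zero bits of each start address; the helper names are ours.

```python
# Address ranges copied from the maps output above.
RANGES = [
    "55c4dc600000-55c4dc602000", "555b41000000-555b41002000",
    "55af25400000-55af25402000", "561cb6400000-561cb6402000",
    "55a709800000-55a709802000", "55bc2f400000-55bc2f402000",
    "55ac8d200000-55ac8d202000", "556102000000-556102002000",
    "55b58fe00000-55b58fe02000", "55fa2a400000-55fa2a402000",
]


def map_start(address_range: str) -> int:
    """Start address of a 'start-end' range from /proc/<pid>/maps."""
    return int(address_range.split("-")[0], 16)


def trailing_zero_bits(addr: int) -> int:
    """Number of zero bits below the lowest set bit."""
    return (addr & -addr).bit_length() - 1


starts = [map_start(r) for r in RANGES]
print(min(trailing_zero_bits(a) for a in starts))  # 21
print(all(a % 0x200000 == 0 for a in starts))      # True
```

Some runs happen to have more than 21 zero bits, but none has fewer, which is what a forced 2MB alignment looks like.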

bdftruncate probably isn’t an interesting target, however, it serves as an example that legacy baggage of previous releases continues to haunt us.

Third-Party Binaries

Third-party programs are another source of such binaries. For example, Slack’s Debian Linux package has these:

minipli@nuc:~/src/paxtest/contrib (master)$ ./check_align.sh /lib/slack/
/lib/slack/[...]/windows-quiet-hours/build/Release/quiethours.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.foundation/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.applicationmodel/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.data.xml.dom/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.ui.startscreen/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.ui.notifications/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/cf-prefs/build/Release/cf-prefs.node (max align=0x200000)
/lib/slack/[...]/macos-notification-state/build/Release/notificationstate.node (max align=0x200000)
/lib/slack/[...]/file-handler-info/build/Release/file_handler_info.node (max align=0x200000)
/lib/slack/[...]/windows-focus-assist/build/Release/focusassist.node (max align=0x200000)
[...]

VMware's Workstation bundle has these:

minipli@nuc:~/src/paxtest/contrib (master)$ ./check_align.sh /lib/vmware/
/lib/vmware/lib/libsigc-2.0.so.0/libsigc-2.0.so.0 (max align=0x200000)
/lib/vmware/lib/libpango-1.0.so.0/libpango-1.0.so.0 (max align=0x200000)
/lib/vmware/lib/libdbus-1.so.3/libdbus-1.so.3 (max align=0x200000)
/lib/vmware/lib/libcairo.so.2/libcairo.so.2 (max align=0x200000)
/lib/vmware/lib/libfontconfig.so.1/libfontconfig.so.1 (max align=0x200000)
/lib/vmware/lib/libgck-1.so.0/libgck-1.so.0 (max align=0x200000)
/lib/vmware/lib/libgio-2.0.so.0/libgio-2.0.so.0 (max align=0x200000)
/lib/vmware/lib/libXss.so.1/libXss.so.1 (max align=0x200000)
/lib/vmware/lib/libglibmm_generate_extra_defs-2.4.so.1/libglibmm_generate_extra_defs-2.4.so.1 (max align=0x200000)
/lib/vmware/lib/libepoxy.so.0/libepoxy.so.0 (max align=0x200000)
[...]

These two applications are, for sure, interesting targets for attackers. Having their memory layout made easier to predict is very much not intended by their respective creators. Getting these fixed, however, requires the involved companies to switch to a more recent build system for creating their binaries, which brings us to our next section.

Call to Action

If you’re reading this and happen to supply pre-built binaries to the public yourself, please check your build environment to ensure it creates proper x86-64 binaries that won’t accidentally weaken ASLR.

Read on for how to do that with a simple command and a script we're providing today.

Test Your Systems!

Check ld

If all your binaries and libraries are built with a binutils version that’s at least v2.40, you’re good. If it’s an earlier version, it depends.

For versions down to v2.31 it depends on the build time configuration. You can check these by asking ld for the default value of its -z separate-code option like this:

$ ld --help | grep 'separate-code.*default'
  -z separate-code            Create separate code program header (default)

If the output looks like the above, you’re good as well. If it instead appears as below (like, for example, it used to be on older versions of Alpine Linux), your ELF files have too large of an alignment:

# ld --help | grep 'separate-code.*default'
  -z noseparate-code          Don't create separate code program header (default)

If the command generates no output at all, you’ve either typoed it 😉 or are using a version of ld that pre-dates that commit. In this case, it’s also generating ELF files with too large of an alignment, leading to ASLR degradation.
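The three outcomes described above can be captured in a tiny classifier. This Python sketch only interprets help text that is handed to it; the function name and return values are ours, and the sample lines are the ones shown above.

```python
import re


def separate_code_default(help_text: str) -> str:
    """Interpret `ld --help` output the same way as the grep above."""
    for line in help_text.splitlines():
        if re.search(r"separate-code.*default", line):
            if "noseparate-code" in line:
                return "noseparate-code"  # alignment too large
            return "separate-code"        # 4kB aligned segments
    return "unknown"                      # option pre-dates this ld


good = "  -z separate-code            Create separate code program header (default)"
bad = "  -z noseparate-code          Don't create separate code program header (default)"

print(separate_code_default(good))  # separate-code
print(separate_code_default(bad))   # noseparate-code
print(separate_code_default(""))    # unknown
```

To use it on a live system, the text could be obtained with something like `subprocess.run(["ld", "--help"], capture_output=True, text=True).stdout`.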

Check Your Binaries

Testing ld is a sufficient test for newly created code, but it isn't for already-existing binaries and libraries.

To check if your system contains problematic binaries or libraries, we wrote a little script. It’s part of the recently-created paxtest GitHub repository which also contains tests for the previously-mentioned operating system behavior related to this issue.

The script gets passed a path or individual files to look at for problematic binaries and, ideally, generates no output at all. If it does, however, it shows the path of the problematic ELF file and the maximum alignment it encountered in the program header’s load segments. Usually a recompile of the programs or libraries in question with a recent toolchain is sufficient to fix the issue.
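For readers curious about what such a check boils down to, here is a minimal Python sketch of the same idea: find the largest p_align among the PT_LOAD program headers. It is not the script from the repository, it handles only little-endian ELF64 files (i.e. x86-64), and the 4kB reporting threshold is our assumption.

```python
import struct
import sys

PT_LOAD = 1
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # 64-byte ELF64 file header
ELF64_PHDR = struct.Struct("<IIQQQQQQ")          # 56-byte ELF64 program header


def max_load_align(data: bytes):
    """Largest p_align of all PT_LOAD segments, or None if `data` is not
    a little-endian ELF64 file."""
    if len(data) < ELF64_EHDR.size or data[:4] != b"\x7fELF":
        return None
    if data[4] != 2 or data[5] != 1:  # ELFCLASS64, ELFDATA2LSB
        return None
    fields = ELF64_EHDR.unpack_from(data)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    aligns = []
    for index in range(e_phnum):
        offset = e_phoff + index * e_phentsize
        if offset + ELF64_PHDR.size > len(data):
            break  # truncated file, stop parsing
        p_type, *_, p_align = ELF64_PHDR.unpack_from(data, offset)
        if p_type == PT_LOAD:
            aligns.append(p_align)
    return max(aligns, default=0)


if __name__ == "__main__":
    for path in sys.argv[1:]:
        try:
            with open(path, "rb") as handle:
                align = max_load_align(handle.read())
        except OSError:
            continue  # unreadable or not a regular file
        if align is not None and align > 0x1000:  # assumed threshold
            print("%s (max align=%#x)" % (path, align))
```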

If check_align.sh reports a file that is statically-linked, it is, strictly speaking, not subject to the ASLR degradation. Non-PIE executables (static or not) have no randomization applied to them to begin with.

In case check_align.sh flags a file that you don’t control, e.g., a binary from your distribution, ask the respective maintainer (nicely, please) to recompile affected binaries. That way, the binaries will then get full ASLR treatment.

Aftermath

With supply chain attacks on everyone’s radar, we should remember the importance of the toolchain's involvement as well. It doesn’t even need malicious modifications if the pristine provided sources already help out an attacker.

This whole saga would have gone mostly unnoticed had it not been for Justin Miller’s earlier blog post and our team’s philosophy at OSS of pushing for tests, which made us take a closer look. The toolchain's involvement is especially subtle and powerful, which is something one wouldn’t expect.

Trusting the toolchain we use to build programs and libraries is essential in building a secure system. Implementing measures to handle cases where this trust may get broken helps to actually make such a system trustworthy.

TL;DR

Starting from 2001 and continuing until 6 years ago with version 2.32, binutils' ld linker set too large of an alignment on ELF segments. With a Linux kernel >= 5.10 or glibc >= 2.35, binaries/libraries that were built with the older toolchain act as timebombs against ASLR, making brute-force attacks easier on 64-bit binaries and reducing randomness to nothing in some cases for 32-bit binaries.

Since the kernel/glibc changes have not been reverted, affected binaries with attack surface should be rebuilt with a newer toolchain to regain full ASLR benefits. See the Call to Action and Test Your Systems! sections above for detection/remediation steps.