Toolchain Necromancy: Past Mistakes Haunting ASLR
By Mathias Krause
March 4, 2024
Introduction
There was a nice blog post by Justin Miller in January that showed how a performance optimization can lead to weakened security by making bruteforced memory corruption attacks more likely to succeed. If you haven’t read that article yet, please go ahead to get the reasoning behind the creation of this one.
After having read the article ourselves, we gave the kernel part a closer look, seeing if we can do something about the issue. Turns out, fixing the kernel was easy, so we just went ahead and did it the “PaX way” – making use of a “secure by default” reasoning with the possibility to mark exceptions via PaX flags where needed.
To verify if our changes were working as intended, we needed some tests, and PaXtest came to mind. We extended it with relevant tests, but as it turned out, there was more involved than just libc.so not getting randomized properly.
What Justin wrote about in his blog isn’t the full story; there's more to it. It’s not only the kernel that plays a role, it’s also the runtime linker and ultimately the linker used to create the affected binaries.
The Suspects
Before getting into the gory details, we have to introduce the involved components. Let's dig in!
Suspect #1: The Linux Kernel
In Linux v5.10, a commit was added that changed the behaviour of the in-kernel ELF loader to honour segment alignment that is bigger than the system’s regular page size. It’s a different commit than what was mentioned in Justin’s article, as this one is specifically about the ELF loader and not about adding transparent huge-page support for filesystems. It’s somewhat related, but still a completely independent change.
The reason behind the change was, of course, performance gains. Hugepage-aligned mappings can make use of huge pages for the backing memory which should perform better because it uses fewer TLB entries (only one to cover a 2MB or even 1GB range on certain x86-64 systems) and therefore also requires fewer page table walks when accessing previously untouched data. It does, however, have a bigger setup time, as the full page (either 2MB or 1GB instead of only 4kB) needs to be populated. But if the full range is accessed, it can also save the overhead of triggering and handling page faults for demand loading/populating the memory region.
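To put rough numbers on the TLB argument, here is a small back-of-the-envelope sketch. It only uses the page sizes named above; the helper name pages_covered is ours, purely for illustration.

```python
PAGE_4K = 4 * 1024
HUGE_2M = 2 * 1024 * 1024
HUGE_1G = 1024 * 1024 * 1024

def pages_covered(huge_page_size, base_page_size=PAGE_4K):
    """Number of regular pages (and thus TLB entries) one huge page stands in for."""
    return huge_page_size // base_page_size

print(pages_covered(HUGE_2M))  # 512
print(pages_covered(HUGE_1G))  # 262144
```

In other words, a single 2MB mapping replaces up to 512 regular TLB entries, which is where the performance gain comes from.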
There have even been attempts to extend the change to cover static PIEs and the ELF program interpreter in v5.17 but these had to be reverted (here and here) in v5.18 because of regressions.
This is what the log of the original commit says:
The current ELF loading mechancism [sic] provides page-aligned mappings. This can lead to the program being loaded in a way unsuitable for file-backed, transparent huge pages when handling PIE executables.
For binaries built with increased alignment, this limits the number of bits usable for ASLR, but provides some randomization over using fixed load addresses/non-PIE binaries.
Tested by verifying program with -Wl,-z,max-page-size=0x200000 loading.
The part that says “limits the number of bits usable for ASLR” is actually the interesting one. Obviously, it was well recognized that this change can weaken ASLR. But, apparently, it was assumed back then that this behaviour needs to be explicitly requested by passing the -Wl,-z,max-page-size=0x200000 linker flag during compilation. While that may have been true for the author’s system, it turns out the truth isn’t that simple, which brings us to our next suspect.
Suspect #2: binutils
Binutils is a set of binary utilities (who would have guessed? 😉) to create, transform and analyze binary files. One of its tools is ld, the linker used by the compiler to combine multiple object files to create executable programs or shared libraries.
The previously mentioned compiler option -Wl,-z,max-page-size=0x200000 will be passed on by the compiler to the linker as -z max-page-size=0x200000 to override the linker's default segment alignment. But since this option only overrides the default, what does ld do when the option is missing, as in 99.9% of the regular use cases?
To answer this question, we need to dig a little bit into binutils’ history. In particular, we'll be tracking the evolution of MAXPAGESIZE, as it implicitly defines the segment alignment for ELF files created by the linker.
We’ll limit our time-traveling excursus to only x86-64 to narrow focus and make it easier to follow, as, despite intuition, the linker changes a lot! Feel free to skip ahead to the summary if you’re not into source code archaeology.
Turning Stones
Our journey starts in the year 2000 with binutils v2.11, when commit 2be3aa031f3d added support for the x86-64 architecture, making use of a MAXPAGESIZE of only 4kB [1].
The following year, in binutils v2.12, commit 35714f2a5daa changed this to 1MB. It was no security issue back then, as the kernel ELF loader didn’t care, nor were there any CPUs available to actually run x86-64 code on.
The value of MAXPAGESIZE was further changed five years later for binutils v2.18 in commit f7661549c793 to its official value of 2MB, making it match the architecture’s (only, at the time) huge page size, but, at the same time, making the latent issue only worse by consuming yet another address bit.
Nothing really changed in regard to MAXPAGESIZE until 2018 in binutils v2.31, when commit f6aec96dce1d added a configure switch --enable-separate-code to enable the -z separate-code linker option by default for Linux/x86 targets, reducing MAXPAGESIZE to 4kB again. Well, for 32-bit systems it really took until v2.32, commit 872899f1efed, to fix a typo in the configure script to correctly detect 32-bit systems.
After commit 9833b7757d24 added more confusion [2] to the general topic regarding MAXPAGESIZE, the issue was finally fixed for good (21 years later!) in binutils v2.40 with commit a2267dbfc9e1 which unified the definition of [ELF_]MAXPAGESIZE to 0x1000, i.e. made the linker use the regular 4kB page size for all cases, independent of configuration options used to build ld.
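The history above can be condensed into a small lookup. This is only a sketch of our reading of the timeline for x86-64; the function name and the separate_code_default parameter are made up for illustration and are not part of binutils. For v2.31 up to v2.39 the result depends on how ld was configured, which is why it is a parameter here.

```python
KB = 1024
MB = 1024 * KB

def default_max_page_size(version, separate_code_default=False):
    """Approximate x86-64 default segment alignment of ld for a (major, minor) version.

    separate_code_default: whether this ld build enables -z separate-code by
    default (only relevant for v2.31 up to v2.39).
    """
    if version >= (2, 40):
        return 4 * KB            # unified to 0x1000 for all configurations
    if version >= (2, 31):
        return 4 * KB if separate_code_default else 2 * MB
    if version >= (2, 18):
        return 2 * MB
    if version >= (2, 12):
        return 1 * MB
    return 4 * KB                # v2.11, the initial x86-64 port

print(hex(default_max_page_size((2, 28))))        # 0x200000
print(hex(default_max_page_size((2, 35), True)))  # 0x1000
```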
Wrapping up the Pieces
Binutils ld versions from v2.12 on were creating ELF files with too large of a segment alignment. The rationale behind this behaviour was questionable back then, something I even noticed in the context of glibc in 2011. However, the discussion around the topic came to no final conclusion, nor did the binutils developers even react to the question brought up. It was also the time when Ulrich Drepper was still in charge of glibc development with little interest in change, shutting down the discussion by force.
Versions from v2.40 on are fine, but all versions in between may cause issues, as sketched in the Test Your Systems section below.
Suspect #3: The Runtime Linker
The runtime linker ld-linux.so is our last partner in crime. It is part of libc and is tasked with finishing the loading of a dynamically-linked program, leading to the eventual execution of main().
The runtime linker, also known as the ELF interpreter, gets loaded by the Linux kernel as part of loading an ELF binary. It’s basically an additional program (which is an ELF file itself) loaded along with the segments to be loaded in the actual program. If a program has an interpreter (the regular case for dynamically-linked programs), the kernel will pass on execution not directly to the ELF program but its interpreter instead. [3]
The ELF interpreter is responsible for the more heavy lifting of loading additional required libraries, like libc.so, resolving dependencies among these and calling constructors as needed. Lastly, it hands over the execution to the program’s entry point which, in turn, will call main().
During library loading, the runtime linker will have to open and parse the respective ELF files to know which parts of the file to mmap(). Since the end of 2021 with glibc v2.35, namely commit 718fdd87b1b9, glibc honours segment alignment while doing so and takes care of aligning mappings accordingly.
That change was further enhanced, still in v2.35, in commit e22a4557eb39d, to always use the maximum alignment of all loaded segments. Its commit log even references the aforementioned kernel commit.
Relevant code snippets can be found here and here.
This means, for example, that if a program has a data section that has large alignment requirements, e.g., to make it naturally fit huge pages, this alignment will propagate to the text section as well. Yet another unfortunate side effect, leading to ASLR degradation.
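The effect of using the maximum alignment can be illustrated with a tiny sketch. This is not glibc's actual code (which lives in its ELF mapping routines and works differently in detail); the helper names and the candidate address below are hypothetical.

```python
def align_up(addr, align):
    """Round addr up to the next multiple of align (align must be a power of two)."""
    return (addr + align - 1) & ~(align - 1)

def mapping_base(candidate_addr, segment_aligns):
    """Place the whole object using the largest alignment among its loadable segments."""
    return align_up(candidate_addr, max(segment_aligns))

# Text segment asks for 4kB, data segment asks for 2MB:
# the base of the whole library, text included, lands on a 2MB boundary.
base = mapping_base(0x7f3a12345000, [0x1000, 0x200000])
print(hex(base))  # 0x7f3a12400000
```

With only 4kB-aligned segments the candidate address would have been kept as-is, preserving the randomized lower bits.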
Chain of Failure
Summarizing the above, the recipe for failure is having binaries or libraries built with an old enough toolchain that was creating segments with too large of an alignment and using these with a recentish version of glibc (v2.35+) or, alternatively, a recentish kernel (v5.10+).
For such a setup, the following may be observed:
- The runtime linker of glibc aligns the virtual memory addresses of loaded dependent libraries in accordance with their segment alignment, i.e. to a 2MB boundary.
- The in-kernel ELF loader will honour larger-than-page segment alignment of loaded ELF programs, aligning their virtual addresses accordingly.
Both will lead to reducing the number of randomized address bits of the corresponding mappings, in turn, weakening ASLR and making brute force attacks more likely to succeed.
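To get a feeling for the entropy loss, here is a rough calculation. It assumes 28 bits of page-granular mmap randomization for 64-bit and 8 bits for 32-bit processes, which to our knowledge are the usual x86-64 defaults of vm.mmap_rnd_bits and vm.mmap_rnd_compat_bits; both are configurable, so treat the numbers as an illustration rather than a statement about any particular system.

```python
def remaining_aslr_bits(rnd_bits, align, page_size=4096):
    """Random bits left once the mapping must be aligned to `align` instead of a page."""
    ratio = max(align // page_size, 1)
    lost = ratio.bit_length() - 1      # log2 of the alignment ratio
    return max(rnd_bits - lost, 0)

# Assumed defaults: 28 random bits (64-bit), 8 random bits (32-bit compat).
print(remaining_aslr_bits(28, 0x200000))  # 19
print(remaining_aslr_bits(8, 0x200000))   # 0
print(remaining_aslr_bits(28, 0x1000))    # 28
```

A 2MB alignment costs 9 bits compared to 4kB pages, i.e. a brute-force attempt becomes 512 times more likely to hit, and under the assumed 32-bit default nothing random is left at all.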
Binutils is Fixed! Why Bother with a Blog?
One might think, with binutils effectively [4] being fixed 6 years ago, there should be no vulnerable binaries left. All current distributions, for sure, use a binutils that’s newer than this! However, keep in mind that the required glibc/kernel versions can come later, i.e. previously-created binaries become vulnerable.
Evaluating Debian
To evaluate this idea, I installed Debian Docker containers for various versions, starting with etch [5], the first version to support the x86-64 architecture.
The containers were composed from the official debian image source, respectively the community supported debian/eol one for EOL versions and only extended by installing the binutils package.
From these containers we get the following distribution of binutils versions:
etch      GNU ld version 2.17 Debian GNU/Linux
lenny     GNU ld (GNU Binutils for Debian) 2.18.0.20080103
squeeze   GNU ld (GNU Binutils for Debian) 2.20.1-system.20100303
wheezy    GNU ld (GNU Binutils for Debian) 2.22
jessie    GNU ld (GNU Binutils for Debian) 2.25
stretch   GNU ld (GNU Binutils for Debian) 2.28
buster    GNU ld (GNU Binutils for Debian) 2.31.1
bullseye  GNU ld (GNU Binutils for Debian) 2.35.2
bookworm  GNU ld (GNU Binutils for Debian) 2.40
sid       GNU ld (GNU Binutils for Debian) 2.41.50.20231214
Correlating this with the information gathered from the binutils history section, we can deduce that Debian versions from etch to stretch were creating problematic ELF files. Only from buster on is a recent enough version of the linker in use and, fortunately, with a default setting for -z separate-code making it create 4k page-aligned ELF segments.
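The correlation can be reproduced mechanically from the version banners listed above. The sketch below only looks at the version number and uses v2.31 as the cut-off; for v2.31 up to v2.39 the build-time configuration matters as well, so this is a first-pass filter and nothing more. The names LD_BANNERS and ld_version are ours.

```python
import re

# Banners as printed by `ld --version` in the respective Debian containers.
LD_BANNERS = {
    "etch": "GNU ld version 2.17 Debian GNU/Linux",
    "lenny": "GNU ld (GNU Binutils for Debian) 2.18.0.20080103",
    "squeeze": "GNU ld (GNU Binutils for Debian) 2.20.1-system.20100303",
    "wheezy": "GNU ld (GNU Binutils for Debian) 2.22",
    "jessie": "GNU ld (GNU Binutils for Debian) 2.25",
    "stretch": "GNU ld (GNU Binutils for Debian) 2.28",
    "buster": "GNU ld (GNU Binutils for Debian) 2.31.1",
    "bullseye": "GNU ld (GNU Binutils for Debian) 2.35.2",
    "bookworm": "GNU ld (GNU Binutils for Debian) 2.40",
    "sid": "GNU ld (GNU Binutils for Debian) 2.41.50.20231214",
}

def ld_version(banner):
    """Extract (major, minor) from an ld version banner."""
    match = re.search(r"(\d+)\.(\d+)", banner)
    return (int(match.group(1)), int(match.group(2)))

problematic = [rel for rel, banner in LD_BANNERS.items()
               if ld_version(banner) < (2, 31)]
print(problematic)
# ['etch', 'lenny', 'squeeze', 'wheezy', 'jessie', 'stretch']
```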
So, all good, nothing to fuss about?… Well, there are two additional aspects to keep in mind:
- Debian is a binary distribution that carries once-compiled packages over to the next release as-is if there have been no changes to the package itself which would require a rebuild.
- Third-parties regularly provide pre-built binary packages, explicitly built on older distributions to get a broad range of Linux distributions covered.
Looking further, I can find the following artifacts below /bin on my Debian sid based test system (check_align.sh gets covered in a later section):
minipli@nuc:~$ cat /etc/debian_version
trixie/sid
minipli@nuc:~$ uname -a
Linux nuc 6.6.15-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.6.15-2 (2024-02-04) x86_64 GNU/Linux
minipli@nuc:~$ ./check_align.sh /bin/
/bin/bdftruncate (max align=0x200000)
/bin/mkfontscale (max align=0x200000)
/bin/ucs2any (max align=0x200000)
/bin/bdftopcf (max align=0x200000)
/bin/fonttosfnt (max align=0x200000)
I have no idea what bdftruncate is supposed to do, nor what any of the others do. But, apparently, it is old code that didn’t get rebuilt in the recent past and thereby suffers from the huge default alignment enforced by older toolchains, in this case 2MB.
Let's do some testing:
minipli@nuc:~$ for i in $(seq 10); do
> sleep 0.2 | /bin/bdftruncate 0x3200 &
> sleep 0.1
> head -1 /proc/$(pidof bdftruncate)/maps
> wait
> done 2>/dev/null
55c4dc600000-55c4dc602000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
555b41000000-555b41002000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
55af25400000-55af25402000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
561cb6400000-561cb6402000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
55a709800000-55a709802000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
55bc2f400000-55bc2f402000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
55ac8d200000-55ac8d202000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
556102000000-556102002000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
55b58fe00000-55b58fe02000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
55fa2a400000-55fa2a402000 r-xp 00000000 fe:08 2883692 /usr/bin/bdftruncate
The above shows the output of executing bdftruncate 10 times and looking at the very first map in its memory mappings, which happens to be for the binary's executable segment. It shows that the 21 least significant bits of that mapping are zero for all test runs, as the Linux kernel ELF loader honored the alignment constraints of the binary, reducing the quantity of randomized address bits.
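The claim about the 21 zero bits can be checked directly against the start addresses from the output above. A minimal sketch (the helper name is ours):

```python
# Start addresses of the first mapping, copied from the ten runs above.
STARTS = [
    "55c4dc600000", "555b41000000", "55af25400000", "561cb6400000",
    "55a709800000", "55bc2f400000", "55ac8d200000", "556102000000",
    "55b58fe00000", "55fa2a400000",
]

def trailing_zero_bits(addr):
    """Count how many least significant bits of addr are zero."""
    return (addr & -addr).bit_length() - 1

# The bits that are zero in *every* run are the ones ASLR no longer randomizes.
common = min(trailing_zero_bits(int(s, 16)) for s in STARTS)
print(common)            # 21
print(hex(1 << common))  # 0x200000, i.e. a 2MB boundary
```

Individual runs may happen to have more zero bits by chance, which is why the minimum over all runs is the relevant number.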
bdftruncate probably isn’t an interesting target. However, it serves as an example that legacy baggage of previous releases continues to haunt us.
Third-Party Binaries
Another source of affected binaries is third-party programs. For example, Slack’s Debian Linux package has these:
minipli@nuc:~/src/paxtest/contrib (master)$ ./check_align.sh /lib/slack/
/lib/slack/[...]/windows-quiet-hours/build/Release/quiethours.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.foundation/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.applicationmodel/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.data.xml.dom/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.ui.startscreen/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/@nodert-win10-au/windows.ui.notifications/build/Release/binding.node (max align=0x200000)
/lib/slack/[...]/cf-prefs/build/Release/cf-prefs.node (max align=0x200000)
/lib/slack/[...]/macos-notification-state/build/Release/notificationstate.node (max align=0x200000)
/lib/slack/[...]/file-handler-info/build/Release/file_handler_info.node (max align=0x200000)
/lib/slack/[...]/windows-focus-assist/build/Release/focusassist.node (max align=0x200000)
[...]
VMware's Workstation bundle has these:
minipli@nuc:~/src/paxtest/contrib (master)$ ./check_align.sh /lib/vmware/
/lib/vmware/lib/libsigc-2.0.so.0/libsigc-2.0.so.0 (max align=0x200000)
/lib/vmware/lib/libpango-1.0.so.0/libpango-1.0.so.0 (max align=0x200000)
/lib/vmware/lib/libdbus-1.so.3/libdbus-1.so.3 (max align=0x200000)
/lib/vmware/lib/libcairo.so.2/libcairo.so.2 (max align=0x200000)
/lib/vmware/lib/libfontconfig.so.1/libfontconfig.so.1 (max align=0x200000)
/lib/vmware/lib/libgck-1.so.0/libgck-1.so.0 (max align=0x200000)
/lib/vmware/lib/libgio-2.0.so.0/libgio-2.0.so.0 (max align=0x200000)
/lib/vmware/lib/libXss.so.1/libXss.so.1 (max align=0x200000)
/lib/vmware/lib/libglibmm_generate_extra_defs-2.4.so.1/libglibmm_generate_extra_defs-2.4.so.1 (max align=0x200000)
/lib/vmware/lib/libepoxy.so.0/libepoxy.so.0 (max align=0x200000)
[...]
These two applications are, for sure, interesting targets for attackers. Having their memory layout made easier to predict is very much not intended by their respective creators. Getting these fixed, however, requires the involved companies to switch to a more recent build system for creating their binaries, which brings us to our next section.
Call to Action
If you’re reading this and happen to supply pre-built binaries to the public yourself, please check your build environment to ensure it creates proper x86-64 binaries that won’t accidentally weaken ASLR.
Read on for how to do that with a simple command and a script we're providing today.
Test Your Systems!
Check ld
If all your binaries and libraries are built with a binutils version that’s at least v2.40, you’re good. If it’s an earlier version, it depends.
For versions down to v2.31 it depends on the build time configuration. You can check these by asking ld for the default value of its -z separate-code option like this:
$ ld --help | grep 'separate-code.*default'
-z separate-code Create separate code program header (default)
If the output looks like the above, you’re good as well. If it instead appears as below (like, for example, it used to be on older versions of Alpine Linux), your ELF files have too large of an alignment:
# ld --help | grep 'separate-code.*default'
-z noseparate-code Don't create separate code program header (default)
If the command generates no output at all, you’ve either typoed it 😉 or are using a version of ld that pre-dates that commit. In this case, it’s also generating ELF files with too large of an alignment, leading to ASLR degradation.
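If you want to automate this check, e.g. in a CI job, the same decision can be sketched in a few lines. The function below mirrors the grep pattern used above on text captured from ld --help; its name and the returned labels are made up for this example.

```python
import re

def classify_separate_code(help_text):
    """Classify the output of `ld --help` the same way the grep above does."""
    for line in help_text.splitlines():
        if re.search(r"separate-code.*default", line):
            # The line marked "(default)" tells which variant this ld build uses.
            return "too-large-alignment" if "noseparate-code" in line else "ok"
    # No match: the option does not exist yet, i.e. a pre-v2.31 ld.
    return "predates-option"

good = "  -z separate-code           Create separate code program header (default)"
bad = "  -z noseparate-code         Don't create separate code program header (default)"
print(classify_separate_code(good))  # ok
print(classify_separate_code(bad))   # too-large-alignment
print(classify_separate_code(""))    # predates-option
```

The help text itself could be obtained with something like subprocess.run(["ld", "--help"], capture_output=True, text=True).stdout. Keep in mind that from v2.40 on the alignment is fine regardless of this setting, so check the version first.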
Check Your Binaries
Testing ld is a sufficient test for newly created code, but it isn't for already-existing binaries and libraries.
To check if your system contains problematic binaries or libraries, we wrote a little script. It’s part of the recently-created paxtest GitHub repository which also contains tests for the previously-mentioned operating system behavior related to this issue.
The script gets passed a path or individual files to look at for problematic binaries and, ideally, generates no output at all. If it does, however, it shows the path of the problematic ELF file and the maximum alignment it encountered in the program header’s load segments. Usually a recompile of the programs or libraries in question with a recent toolchain is sufficient to fix the issue.
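For readers who want to understand what such a check boils down to, here is a minimal sketch of the idea in Python. It is not the check_align.sh script itself, just an illustration under the assumption of 64-bit little-endian ELF files; the function names are ours.

```python
import struct

PT_LOAD = 1
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, little-endian
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header

def max_load_align(data):
    """Largest p_align among PT_LOAD segments.

    Returns None if `data` is not a 64-bit little-endian ELF image
    or has no PT_LOAD segment.
    """
    if len(data) < EHDR.size or data[:4] != b"\x7fELF":
        return None
    if data[4] != 2 or data[5] != 1:   # EI_CLASS=ELFCLASS64, EI_DATA=LSB
        return None
    fields = EHDR.unpack_from(data, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    aligns = []
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        if off + PHDR.size > len(data):
            break                      # truncated file, stop parsing
        p_type, *_rest, p_align = PHDR.unpack_from(data, off)
        if p_type == PT_LOAD:
            aligns.append(p_align)
    return max(aligns, default=None)

def is_problematic(data, page_size=0x1000):
    """True if any load segment asks for more than regular page alignment."""
    align = max_load_align(data)
    return align is not None and align > page_size
```

Usage would be along the lines of is_problematic(open(path, "rb").read()) for each file below the directory of interest.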
If check_align.sh reports a file that is statically-linked, it is, strictly speaking, not subject to the ASLR degradation. Non-PIE executables (static or not) have no randomization applied to them to begin with.
In case check_align.sh flags a file that you don’t control, e.g., a binary from your distribution, ask the respective maintainer (nicely, please) to recompile affected binaries. That way, the binaries will get full ASLR treatment.
Aftermath
With supply chain attacks on everyone’s radar, we should remember the importance of the toolchain's involvement as well. It doesn’t even need malicious modifications if the pristine provided sources already help out an attacker.
This whole saga would have gone mostly unnoticed had it not been for Justin Miller’s earlier blog post and our team’s philosophy at OSS of pushing for tests, which made us take a closer look. The toolchain's involvement is especially subtle and powerful, which is something one wouldn’t expect.
Trusting the toolchain we use to build programs and libraries is essential in building a secure system. Implementing measures to handle cases where this trust may get broken helps to actually make such a system trustworthy.
TL;DR
Starting from 2001 and continuing until 6 years ago with version 2.32, binutils' ld linker set too large of an alignment on ELF load segments. With a Linux kernel >= 5.10 or glibc >= 2.35, binaries/libraries that were built with the older toolchain act as timebombs against ASLR, making brute-force attacks easier on 64-bit binaries and reducing randomness to nothing in some cases for 32-bit binaries.
With the kernel/glibc changes having not been reverted, affected binaries with attack surface should be rebuilt with a newer toolchain to regain full ASLR benefits. See Call to Action section above for detection/remediation steps.