Reducing Maintenance Burden by Bending C

By Mathias Krause

August 27, 2024

Grsecurity is quite a big modification to the Linux kernel. Its exact size differs slightly for each kernel version, but it’s somewhere between 13 and 14 MB.

Maintaining such a patch isn’t always easy, especially when porting to new major versions of the Linux kernel. It can involve a lot of monotonous work, particularly small mechanical changes that cause conflicts because of unrelated code churn.

The patch used to be even bigger until recently, when we bit the bullet and addressed a major contributor to this unpleasant aspect of maintenance work via compiler magic and automation.

This blog post tells the story of how we managed to chop over a megabyte of changes off the patch without losing any functionality. We lowered the maintenance burden and simplified the adoption of core features in new code at the same time, simply by making use of modern C language constructs and even extending the language slightly to provide compile-time tests that would otherwise be impossible.

Atomics and Reference Counters

The Linux kernel is a highly concurrent piece of code – it has to be to work efficiently on current hardware which, even on the lower end, is equipped with multiple CPU cores.

To avoid data races, which could lead to all sorts of memory corruption issues, multiple primitives are provided to support mutual exclusion, like spinlocks, mutexes or atomic variables. The latter are often used to form the base for reference counters which are critical to control the lifetime of concurrently accessed objects.

Earlier versions of the Linux kernel freely used atomic variables for statistical counters as well, for example, to provide users with insight into network interface usage. Such counters can – and will – naturally overflow when the limits of the underlying data type are reached, e.g., when enough packets have been received.

While overflowing statistical counters is generally not a security concern – at worst some metrics are off – overflowing reference counters can lead to the premature release of an object which, in turn, paves the way for use-after-free (UAF) bugs which are among the top 10 most commonly exploited memory corruption bugs these days.

As of Linux v4.11, with commit f405df5de317, upstream kernels try to address this long-standing issue through the use of a dedicated type named refcount_t. This new type has over- and underflow semantics that will saturate the counter at a fixed value instead of allowing it to wrap, turning a possible UAF into a memory leak instead.

Reference counters which are identified as such, either via code review of their usage pattern or by commentary explicitly stating so, shall be converted to this new type. Code making use of such variables needs to be changed as well, as – because of the type difference – regular atomic operations can no longer be used and need to be switched to their refcount_*() counterparts.

Even though quite a few conversions have happened over the past years, there is still pushback from maintainers vetoing such changes (e.g., here, here or here). This is not only due to performance concerns, but also because of slightly different semantics: a conversion to the refcount API may introduce correctness issues and, in fact, has done so in the past (e.g. here, here and here).

True Origins

Grsecurity’s approach to the problem is much older1 and works quite differently, not only on the technical side but also from the design perspective. Instead of manually finding and converting reference counters, it assumes all atomic variables and structure members may potentially be used to do some form of accounting and will be instrumented.

It’s a rather drastic approach, but has no false negatives by design. All potential and actual reference counters are naturally covered as long as they are based on atomic counters; exceptions need to be explicitly marked as such.

For annotating exceptions, e.g., the network statistical counters from the previous example, a different type needs to be used, denoted by putting _unchecked in its type name.

Unchecked atomics have the same binary representation as their checked variants, that is, atomic_unchecked_t has the same size and alignment constraints as atomic_t has. Only the operations on it – implementations of the various *atomic*() functions – differ, i.e., not doing the over- and underflow checks which would normally be done for the checked variants.

While using a different type for, e.g., statistical counters is needed to ensure they’re excluded from getting instrumented, it also creates a maintenance cost. Not only does the variable’s declaration and definition need to change, all code making use of the variable also needs to use the unchecked operation primitives to keep the compiler from rightfully complaining about passing an atomic_unchecked_t pointer to functions requiring an atomic_t pointer.

While changing a single instance is an easy undertaking, having to do it for more than a handful becomes tedious and begs for automation. Even more so as recent improvements to our internal static analysis tools flagged many more potential atomics that, after manual review, would need to be converted to unchecked variants. Having to change not only the variable or structure member type but also the code would increase the maintenance burden, so we were seeking a better solution which would do The Right Thing(TM) simply based on the type being used.

On the Lookout for a Solution

The Linux kernel used to be rather conservative when it comes to build requirements, so as not to needlessly exclude older distributions from being able to compile current kernels. That changed slightly in the recent past, when the oldest supported gcc version was bumped to 4.9 during the development of Linux v5.8 and, even more recently, to gcc 5.1 for Linux v5.15.

The bump to gcc 4.9 has one nice implication, as mentioned by Linus in the commit:

I realize that we fairly recently raised it to 4.8, but the fact is, 4.9 is a much better minimum version to target.
[…]
In particular, raising the minimum to 4.9 means that we can now just assume _Generic() exists, which is likely the much better replacement for a lot of very convoluted built-time magic with conditionals on sizeof and/or __builtin_choose_expr() with same_type() etc.

Being able to make use of _Generic() is, indeed, a nice thing to have as it allows writing type-generic code far more easily and readably than before – just what we wanted to do.

C11 to the rescue!

Even though _Generic() is an extension to the C language only available since C11, a minimum gcc version requirement of 4.9 (or 5.1 for all still-supported grsecurity kernels, with v5.15 being the oldest) makes that support a given. And even though Linux v5.15 still uses the gnu89 C dialect for compiling kernel code (i.e., “C90 with GNU extensions”), gcc will still permit using _Generic() in C source files. In fact, it supports it even in its ISO-C90-conforming -ansi mode and will, at most, issue a warning, but only under -pedantic, which is great as the latter isn’t used by the Linux kernel. Using _Generic() to make the compiler choose the right function solely based on the type involved is now only a few macros away.

First, a multiplexer choosing a fitting function based on the passed type is needed:

#define __atomic_op_pick(fpfx,tpfx,op,order,var) (_Generic((var),     \
    const tpfx##_unchecked_t *: fpfx##tpfx##_##op##_unchecked##order, \
    tpfx##_unchecked_t *      : fpfx##tpfx##_##op##_unchecked##order, \
    const tpfx##_t *          : fpfx##tpfx##_##op##order,             \
    tpfx##_t *                : fpfx##tpfx##_##op##order))

The above may look complicated as it uses the C preprocessor’s token pasting feature quite extensively, but that is only because there are so many flavours of atomic operations that need to be handled in a generic manner. The key is the last macro argument var, which is the actual atomic variable that should be passed to the atomic operation. By using _Generic(var), the above code multiplexes on var’s type, which may be either an *_unchecked_t or a regular atomic type.

As some operations (like atomic_read()) only want a const pointer, there are four types to handle and not only the two one would initially think of. But what this expression eventually evaluates to is the name of the function to use for a given operation. One may already see that all the cases only differ by some _unchecked substring being present or not. It’ll become clear once we have all parts of the puzzle and evaluate an exemplary expression.

To ensure __atomic_op_pick() actually compiles, twice the number of atomic functions are needed – first for the regular type, then for the unchecked counterpart. That’s because even though __atomic_op_pick() is handled by the preprocessor, _Generic() is evaluated only during compilation, requiring the whole expression to be valid and its referenced symbols to be known to the compiler.

Fortunately, this is easy as well. Since commit ace9bad4df26, the atomic API functions are generated by scripts below scripts/atomic/. We simply extended these to emit wrappers for _unchecked_t variants as well.

The only missing part is the final syntactical sugar removing the need to explicitly choose one function over the other. With the help of __atomic_op_pick() we can make the compiler choose a fitting function, simply by using yet another macro indirection. We, again, make use of the scripts generating the wrapper functions and end up with code like this (shortened and reformatted for readability):

#define atomic_op_pick(pfx,op,order,var)    __atomic_op_pick(,pfx,op,order,var)

static __always_inline void atomic_add(int i, atomic_t *v)
{
    /* checked atomic add, script generated */
}
static __always_inline void atomic_add_unchecked(int i, atomic_unchecked_t *v)
{
    /* unchecked atomic add, script generated */
}
#define atomic_add(i, v) atomic_op_pick(atomic,add,,v)(i, v)

The last line hooks up all the magic. atomic_add() is now both a function implementing the checked atomic operation, as well as a macro calling either the checked or the unchecked variant, purely based on the type of the atomic argument passed in. This allows preserving all code as-is and requires only changing the type of a variable to get its semantics changed – the maintenance win we wanted to achieve.

Usage Example

On the usage side there’s no need to change the code anymore, as can be seen, for example, in the code for reading SNMP counters via /proc/net/snmp. The only change that’s needed is this:

diff --git a/include/net/snmp.h b/include/net/snmp.h
index 468a67836e2f..990088fbc847 100644
--- a/include/net/snmp.h
+++ b/include/net/snmp.h
@@ -62,7 +62,7 @@ struct icmp_mib {
 #define ICMPMSG_MIB_MAX        __ICMPMSG_MIB_MAX
 struct icmpmsg_mib {
-       atomic_long_t   mibs[ICMPMSG_MIB_MAX];
+       atomic_long_unchecked_t mibs[ICMPMSG_MIB_MAX];
 };

It marks the SNMP MIBs as uninstrumented for over- or underflow checking, making them plain atomic counters.

The usage side, e.g., reading the counters, can use the type-agnostic accessors which do not require further code changes:

static void icmpmsg_put(struct seq_file *seq)
{
    /* ... */
    struct net *net = seq->private;

    for (i = 0; i < ICMPMSG_MIB_MAX; i++) {
        val = atomic_long_read(&net->mib.icmpmsg_statistics->mibs[i]);
        /* ... */
    }
    /* ... */
}

The above atomic_long_read(…) in the for-loop will expand to atomic_long_op_pick(atomic_long,read,,…)(…). atomic_long is the flavour of atomics that uses long as its underlying integer type but is otherwise similar to regular atomics.

It’ll expand further to atomic_op_pick and, in turn, to the already known __atomic_op_pick() which will, given the type of &net->mib.icmpmsg_statistics->mibs[i] being atomic_long_unchecked_t *, evaluate to atomic_long##_##read##_unchecked or, after preprocessing, to the correct accessor function atomic_long_read_unchecked.

Sequel

With the above, changing a checked to an unchecked atomic is now as easy as just changing the type of the underlying member. No further changes are needed; the compiler will choose the correct implementation based on a variable’s type.

These changes allowed us to remove what accounted for ~1MB of no-longer-required changes from the patch, making forward porting easier and less tedious, allowing us to spend more time on implementing new features from the ever-increasing TODO list.

That could have been it, but the idea of making the compiler choose an appropriate wrapper function felt so good that we went looking for further boilerplate changes to get rid of. Get ready for part 2!

CONSTIFY

One of the core hardening features of grsecurity is to prevent runtime memory corruption attempts, like overwriting function pointers in global objects with pointers to shellcode or otherwise unexpected values. To achieve this goal, to the extent it’s worthwhile and maintainable, eligible individual static objects or, more commonly, whole types are annotated and instances moved to read-only storage with the help of a compiler plugin. A type annotated with __do_const will be handled as if all instances had been declared const – fully automatic, no further source code changes needed.

Mainline Linux does similar things, up to a certain extent, e.g., by generating checkpatch.pl warnings when trying to introduce new struct file_operations objects which aren’t const or by annotating global variables with __ro_after_init, allowing these to get modified during system initialization only. The __ro_after_init section attribute causes such annotated objects to be grouped together into a dedicated section that gets write-protected after the kernel finishes initializing.

These schemes fulfill many use cases, but sometimes objects containing function pointers cannot be made read-only because they also contain additional members that need to be modified at runtime, e.g., to support chaining multiple such objects via list_heads. One example is the objects representing the set of supported network protocols, which may be extended by loading a kernel module while the system is running.

To support such objects that should be read-only most of the time but still need to be written to from time to time, another annotation can be used in grsecurity kernels: __mutable_const. It’s a shorthand for __attribute__((mutable_const)) telling the compiler plugin to, again, put such annotated static objects into a section that’ll be write-protected but otherwise allow (in-source) write operations to such objects. To ensure these will succeed at runtime, the write operations need to be wrapped by runtime support functions that temporarily allow the write access for the current CPU. The compiler plugin in grsecurity takes care of this by wrapping these writes with calls to pax_open_kernel() and pax_close_kernel().

This, of course, allows only legitimate write accesses to such objects, i.e., write accesses that have been in the source code originally. Stray write accesses, e.g., from an exploit attempt, will still try to directly modify the object and, because of its write protection, generate an access fault, leading to the detection and prevention of the exploit attempt.

This scheme works great for specialized types like struct drm_driver, struct static[_call]_key, struct nf_sockopt_ops, struct netlink_dump_control, struct notifier_block, struct [bin_]attribute, struct attribute_group, struct tty_ldisc_ops, struct pernet_operations or struct proto. However, it cannot be used for generic container types like struct list_head which aren’t used on their own most of the time, but rather are embedded within other structure types to support easy chaining of associated objects.

The access functions for the corresponding list operations (i.e., list_add(), list_del(), etc) take a formal argument of a struct list_head *. If such a list member is embedded in a constified object, the list operation’s implementation, which is in a different compilation unit, won’t wrap the write access as needed and will trigger a fault at runtime.

To avoid these faults, the list_*() operation calls need to be changed to their pax_list_*() counterparts which do the necessary wrapping to temporarily disable write-protections around the list modifying operation. However, finding and modifying all of them is cumbersome, so we wanted to automate this, much like we did for checked vs. unchecked atomics. Unfortunately, _Generic() is limited to distinguishing types; it cannot be used to multiplex on a type’s annotation, which is purely based on type attributes.

As type attributes aren’t generally visible at the language level, we could not make use of the _Generic() trick we used for atomics. Neither can we make use of any other compiler provided builtin that allows distinction of expressions, as we simply cannot tell a mutable and a regular object apart. Well, we can – within the CONSTIFY plugin, because it has to. But the CONSTIFY plugin doesn’t export any functionality to make that distinction available on the language level, so we had to implement a builtin to do just that. Easy!

Language Bending

Adding a new builtin function to the compiler may sound like a huge undertaking, and if one starts looking, one can easily get the impression it actually is. All builtin functions are rather statically defined in various .def files that get included multiple times to create data structures around them, e.g., builtins.def lists all the common builtin functions and tree-core.h includes it to create enum built_in_function, which gives each a unique identifier. Trying to extend these from within a compiler plugin feels impossible. However, there’s a way that makes it actually pretty easy – at least within GCC, thanks to its broad target support.

The way we did it is to chain onto the resolve_overloaded_builtin hook, which supports implementing target-specific builtins. Even though ours isn’t really target-specific, it’s an easy way to hook the compiler the way we need. Not only can one implement builtin functions this way early in the compilation process, one is also not bound to the constraints of the language: a builtin can easily take a variable number of arguments, with access to their individual types, without the need to pass a format string or similar out-of-band meta information that would be incomplete, at best.

We had already made use of this when implementing the TYPE_CANARY plugin, which provides the necessary infrastructure to implement new builtins even more easily: one simply allocates a new numeric value for the function, as demanded by the internal API, and a declaration so the parser will recognize the new builtin. The implementation is as easy as taking the passed arguments and analyzing them as needed – in our case, looking for the mutable_const attribute on the referenced variable or type declaration. The returned value has to be a code fragment. As we want the builtin to be usable in a constant expression to further decide on the function to be used for a given call, we either return the constant expression 1 if one of the arguments refers to a mutable_const object/type or 0 if it doesn’t.

The CONSTIFY plugin uses a PLUGIN_START_UNIT callback to register the new builtin function and chains the resolve_overloaded_builtin target hook to ensure the builtin’s implementation will be hooked accordingly. The core looks as follows (condensed for readability):

#define BUILT_IN_PLUGINS_START  ... /* gcc version dependent start offset */
/* ... */
#define BUILT_IN_MUTABLE_P      (BUILT_IN_PLUGINS_START + 2)

static tree builtin_mutable_p_decl;

static tree constify_resolve_overloaded_builtin(unsigned int loc, tree decl, void *args)
{
    if (decl == builtin_mutable_p_decl)
        return builtin_mutable_p(loc, decl, args);
    /* ... */
}

static void constify_start_unit(void *gcc_data, void *user_data)
{
    tree fntype;

    /* int __builtin_mutable_p(expr,...); */
    fntype = build_varargs_function_type_list(integer_type_node, NULL_TREE);
    builtin_mutable_p_decl =
        add_builtin_function_ext_scope("__builtin_mutable_p", fntype, BUILT_IN_MUTABLE_P,
                                       BUILT_IN_MD, NULL, NULL);

    orig_resolve_overloaded_builtin = targetm.resolve_overloaded_builtin;
    targetm.resolve_overloaded_builtin = constify_resolve_overloaded_builtin;

    /* ... */
}

int plugin_init(struct plugin_name_args *plugin_info, struct plugin_gcc_version *version)
{
    /* ... */
    register_callback(plugin_info->base_name, PLUGIN_START_UNIT, constify_start_unit, NULL);
    /* ... */
}

plugin_init() is the API-defined entry point for GCC plugins which gets used to register a callback that gets executed before processing the compilation unit. This callback gets executed once and does two things: (1) declare a prototype int __builtin_mutable_p(...) and register it as a builtin function under the name __builtin_mutable_p and (2) hook the targetm.resolve_overloaded_builtin callback to handle calls to __builtin_mutable_p(). The implementation of the new builtin is provided by builtin_mutable_p which does the following:

static bool is_mutable_expr(const_tree expr, const_tree *res)
{
    /* If either expr's type or what 'expr' evaluates to (it can be an
     * arbitrary expression) has the 'mutable_const' attribute, return
     * true, otherwise false.
     */
}

static tree builtin_mutable_p(location_t loc, tree decl, void *_args)
{
    vec<tree, va_gc> *args = static_cast<vec<tree, va_gc> *>(_args);
    unsigned int arg_cnt = args ? args->length() : 0;
    const_tree first_expr = NULL_TREE;
    const_tree first_res = NULL_TREE;
    bool is_mutable = false;

    if (arg_cnt < 1) {
        error_at(loc, "function %qD needs at least one argument", decl);
        return NULL_TREE;
    }

    /*
     * Evaluate all expressions, even if we find a mutable one early, to get a
     * wide coverage of tree types.
     */
    for (unsigned int i = 0; i < arg_cnt; i++) {
        const_tree expr = VEC_index(tree, args, i);
        const_tree res = NULL;

        if (!is_mutable_expr(expr, &res))
            continue;

        is_mutable = true;
        if (first_expr)
            continue;

        first_expr = expr;
        first_res = res;
        gcc_assert(res);
    }

    if (verbose && first_expr) {
        const char *first = arg_cnt > 1 ? "first " : "";
        inform(EXPR_LOCATION(first_expr), "%smutable expression for %qD via %qE",
               first, decl, first_res);
    }

    return is_mutable ? integer_one_node : integer_zero_node;
}

That’s it! builtin_mutable_p() simply tests all of the passed arguments – which can be almost arbitrary expressions – for their mutability status and returns tree nodes for either 1 or 0, depending on finding at least one mutable argument.

Even though the above omits the core testing code of is_mutable_expr(), its implementation just enumerates the various possible tree types and calls itself recursively until a decision can be made. But, overall, this is just plumbing work. Adding a new builtin is, as hopefully has been shown, not that hard after all.

The big upside of being able to enhance the compiler with new builtin functions is that these can then be used to extend the language where needed – to provide insights at the language level that are otherwise impossible to obtain. In our case it’s to probe a function’s arguments for the mutable_const attribute which, as the builtin will be evaluated early in the parser, generates a constant value that can be used to choose the appropriate function to be called.

In the spirit of the atomic wrappers, we implemented a function multiplexer like this:

#define list_op_pick(op,...) \
    __builtin_choose_expr(__is_mutable(__VA_ARGS__), pax_##op, op)(__VA_ARGS__)

static __always_inline void list_add(struct list_head *new, struct list_head *head)
{
    __list_add(new, head, head->next);
}

static __always_inline void pax_list_add(struct list_head *new, struct list_head *head)
{
    __pax_list_add(new, head, head->next);
}

#define list_add(n,h)   list_op_pick(list_add,n,h)

Now list_add() is a macro that makes use of list_op_pick() to choose which function to make use of – pax_list_add() for mutable_const objects or the regular list_add() for all other cases.

Switching between them is implemented with the help of __builtin_choose_expr(), an existing compiler builtin that evaluates to either its second or third argument, depending on the truth value of its first argument. In this case that’s a call to __is_mutable(), a simple wrapper around the new builtin:

#define __is_mutable(exp, ...) __builtin_mutable_p(exp, ## __VA_ARGS__)

The reason why it gets passed a variable amount of arguments is simply to ensure that the pax_*() wrappers will be used, even if only one of the arguments (either the list head or the list element) is provably constified.2

Sequel, Take Two

We performed the above type multiplexing not only for struct list_head but also struct llist_node and struct hlist_node – basically all container types that may be constified either automatically or through manual annotation.

Doing so allowed us to chop a few more kilobytes off the patch. It’s not as impactful as the atomics change, but it eases maintenance for us, and that was the main goal.

Key Takeaways

  • Modern C features can help simplify code a lot. GNU extensions even more so.
  • When hitting the borders of the language, it’s not too difficult to extend it to provide the required knobs where needed.



Footnotes:
  1. PaX, and in turn grsecurity, added support for catching overflows of atomic variables as early as Linux v2.6.26, specifically in pax-linux-2.6.26.2-test12.patch, back in 2008.↩︎

  2. Sometimes the mutability property gets lost by passing pointers to mutable objects through intermediate function calls. Other times mutable objects get chained to a non-constified list. But most of the time – in all in-tree cases, at least – at least one of the arguments is provably constified, which is a sufficient predicate to trigger proper instrumentation.↩︎