Back Original

Checked-size array parameters in C

[LWN subscriber-only content]

Welcome to LWN.net

The following subscription-only content has been made available to you by an LWN subscriber. Thousands of subscribers depend on LWN for the best news from the Linux and free software communities. If you enjoy this article, please consider subscribing to LWN. Thank you for visiting LWN.net!

There are many possible programmer mistakes that are not caught by the minimal checks specified by the C language; among those is passing an array of the wrong size to a function. A recent attempt to add some safety around array parameters within the crypto layer involved the use of some clever tricks, but it turns out that clever tricks are unnecessary in this case. There is an obscure C feature that can cause this checking to happen, and it is already in use in a few places within the kernel.

The discussion started when Ard Biesheuvel sought to improve the safety of the poetically named function xchacha20poly1305_encrypt():

    void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
			           const u8 *ad, const size_t ad_len,
			       	   const u8 nonce[XCHACHA20POLY1305_NONCE_SIZE],
			       	   const u8 key[CHACHA20POLY1305_KEY_SIZE]);

A potential problem with this function is that it takes as parameters several pointers to arrays of type u8. As Biesheuvel pointed out, the size of the nonce and key arrays is not checked by the compiler, even though it is clearly specified in the function prototype. That makes it easy to, for example, give the parameters in the wrong order. The resulting vulnerabilities are generally not the outcome developers have in mind when they write cryptographic code.

Biesheuvel suggested that it was possible to write the prototype this way instead (differences shown in bold):

    void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
			           const u8 *ad, const size_t ad_len,
			       	   const u8 (*nonce)[XCHACHA20POLY1305_NONCE_SIZE],
			       	   const u8 (*key)[CHACHA20POLY1305_KEY_SIZE]);

The types of the last two arguments have changed; there is a new level of pointer indirection, with the argument being a pointer to an array of a given size. Callers must change their calls by adding an additional & operator to obtain the desired pointer type, but the address that is passed is the same. In this case, though, the compiler will check the sizes of the array passed, and will now catch a reordering of the arguments to the function.

Jason Donenfeld was interested by the idea, but he knew of an arguably more straightforward way to address this problem. It seems that, buried deep within the C standard, is a strange usage of the static keyword, making it possible to write the prototype as:

    void xchacha20poly1305_encrypt(u8 *dst, const u8 *src, const size_t src_len,
			           const u8 *ad, const size_t ad_len,
			       	   const u8 nonce[static XCHACHA20POLY1305_NONCE_SIZE],
			       	   const u8 key[static CHACHA20POLY1305_KEY_SIZE]);

This, too, will cause the compiler to check the sizes of the arrays, and it does not require changes on the caller side. Unlike the pointer trick, which requires an exact match on the array size, use of static will only generate a warning if the passed-in array is too small. So it will not catch all mistakes, though it is sufficient to prevent memory-safety and swapped-argument problems.

Eric Biggers pointed out that GCC can often generate "array too small" warnings even without static, but that the kernel currently disables those warnings; that would suppress them when static is used as well. (The warning was disabled in 6.8 due to false positives.) But he thought that adding static was worthwhile to get the warnings in Clang — if Linus Torvalds would be willing to accept use of "this relatively obscure feature of C".

Torvalds, as it turns out, has no objection to this usage; he likes the feature, if not the way it was designed:

The main issue with the whole 'static' thing is just that the syntax is such a horrible hack, where people obviously picked an existing keyword that made absolutely no sense, but was also guaranteed to have no backwards compatibility issues.

He pointed out that there are a number of places in the kernel that are already using it; for a simple example, see getconsxy() in the virtual-terminal driver. He suggested perhaps hiding it with a macro like min_array_size() just to make the usage more obvious, but didn't seem convinced that it was necessary. Donenfeld followed up with a patch to that effect a few days later, but then pivoted to an at_least marker instead.

A lot of work has gone into the kernel to make its use of C as safe as possible, but that does not mean that the low-hanging fruit has all been picked. The language has features, such as static when used to define formal array parameters, that can improve safety, but which are not generally known and not often used. In this particular case, though, it would not be surprising to see this "horrible hack" come into wider use in the near future.

Index entries for this article
KernelC language safety
KernelReleases/6.19