The team behind Tyr started 2025 with little to show in our quest to
produce a Rust GPU driver for Arm Mali hardware, but by the end of the
year, we were able to play SuperTuxKart (a 3D open-source racing
game) at the Linux Plumbers Conference (LPC). Our prototype was a joint
effort between Arm, Collabora, and Google; it ran well for the duration
of the event, and the performance was more than adequate for players.
Thankfully, we picked up steam at precisely the right moment: Dave Airlie
just announced at the Maintainers Summit that the DRM subsystem is only
"about a year away" from disallowing new drivers written in C
and requiring the use of Rust. Now it is time to lay out a
possible roadmap for 2026 in order to upstream all of this work.
Miguel Ojeda's talk at LPC this year summarized where Rust is being used in the Linux kernel, with drivers like the anonymous shared memory subsystem for Android (ashmem) quickly being rolled out to millions of users. Given Mali's extensive market share in the phone market, supporting this segment is a natural aspiration for Tyr, followed by other embedded platforms where Mali is also present. In parallel, we must not lose track of upstream, as the objective is to evolve together with the Nova Rust GPU driver and ensure that the ecosystem will be useful for any new drivers that might come in the future. The prototype was meant to prove that a Rust driver for Arm Mali could come to fruition with acceptable performance, but now we should iterate on the code and refactor it as needed. This will allow us to learn from our mistakes and settle on a design that is appropriate for an upstream driver.
A version of the Tyr driver was merged for the 6.18 kernel release, but it is not capable of much, as a few key Rust abstractions are missing. The downstream branch (the parts of Tyr not yet in the mainline kernel) is where we house our latest prototype; it is working well enough to run desktop environments and games, even if there are still power-consumption and GPU-recovery problems that need to be fixed. The prototype will serve the purpose of guiding our upstream efforts and letting us experiment with different designs.
A kernel-mode GPU driver such as Tyr is a small component backing a much larger user-mode driver that implements a graphics API like Vulkan or OpenGL. The user-mode driver translates hardware-independent API calls into GPU-specific commands that can be used by the rasterization process. The kernel's responsibility centers on sharing hardware resources between applications, enforcing isolation and fairness, and keeping the hardware operational. This includes providing the user-mode driver with GPU memory, letting it know when submitted work finishes, and giving user space a way to describe dependency chains between jobs. Our talk (YouTube video) at LPC 2025 goes over this in detail.
Having a working prototype does not mean it is ready for real-world use, however, and a walkthrough of what is missing reveals why. Mali GPUs are usually found on mobile devices where power is at a premium. Conserving energy and managing the thermal characteristics of the device are paramount to user experience, and Tyr does not have any power-management or frequency-scaling code at the moment. In fact, Rust abstractions to support these features are not available at all.
Something else worth considering is what happens if the GPU hangs. It is imperative that the system remains working to the extent possible, or users might lose all of their work. Owing to our "prototype" state, there is no GPU-recovery code right now. Power management and GPU recovery are both hard requirements for deployability. One simply cannot deploy a driver that gobbles all of the battery in the system, making it hot and unpleasant in the process, or that crashes and takes the user's work with it.
On top of that, Vulkan must be correctly implementable on top of Tyr, or we may fail to achieve drop-in compatibility with our Vulkan driver (PanVK). This requires passing the Vulkan Conformance Test Suite (CTS) when using Tyr instead of the C driver. At that point, we would be confident enough to add support for more GPU models beyond the currently supported Mali-G610. Finally, we will turn our attention to benchmarking to ensure that Tyr can match the C driver's performance while benefiting from Rust's safety guarantees. We have demonstrated running a complex game with acceptable performance, so the results so far are encouraging.
Some required Rust infrastructure is still work-in-progress. This includes Lyude Paul's work on the graphics execution manager (GEM) shmem objects, needed to allocate memory for systems without discrete video RAM. This is notably the case for Tyr, as the GPU is packaged in a larger system-on-chip and must share system memory. Additionally, there are still open questions, like how to share non-overlapping regions of a GPU buffer without locks, preferably encoded in the type system and checked at compile time.
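To make the compile-time flavor of that question concrete, here is a minimal user-space sketch, not kernel code. It assumes a plain byte slice can stand in for a GPU buffer, and the names Region, split_buffer, and fill_in_parallel are hypothetical. It relies only on the standard library's split_at_mut, which hands out two mutable views that the borrow checker knows cannot overlap, so two threads can write to them with no lock.

```rust
/// Hypothetical exclusive view over one region of a buffer.
/// A plain byte slice stands in for GPU-visible memory here.
pub struct Region<'a> {
    bytes: &'a mut [u8],
}

impl<'a> Region<'a> {
    /// Fill the whole region with a single byte value.
    pub fn fill(&mut self, value: u8) {
        self.bytes.fill(value);
    }
}

/// Split one buffer into two disjoint regions at `mid`.
/// Returns `None` if `mid` is out of bounds.
pub fn split_buffer(buf: &mut [u8], mid: usize) -> Option<(Region<'_>, Region<'_>)> {
    if mid > buf.len() {
        return None;
    }
    // `split_at_mut` yields two non-overlapping mutable slices.
    let (lo, hi) = buf.split_at_mut(mid);
    Some((Region { bytes: lo }, Region { bytes: hi }))
}

/// Write to both regions from two threads without taking any lock.
/// Returns `false` if the split point is invalid.
pub fn fill_in_parallel(buf: &mut [u8], mid: usize) -> bool {
    let Some((mut lo, mut hi)) = split_buffer(buf, mid) else {
        return false;
    };
    // Scoped threads are joined before `scope` returns, so the borrows
    // of `buf` end here and the caller can use the buffer again.
    std::thread::scope(|s| {
        s.spawn(move || lo.fill(0xAA));
        s.spawn(move || hi.fill(0x55));
    });
    true
}
```

Calling fill_in_parallel on an eight-byte buffer with a split point of 3 fills the first three bytes with 0xAA and the rest with 0x55. A real GPU buffer raises further questions that this sketch ignores, such as regions chosen at run time and memory that the device itself may be writing.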
On top of allocating GPU memory, modern kernel drivers must let the
user-mode driver manage its own view of the GPU address space. In the DRM
ecosystem, this is delegated to GPUVM,
which contains the common code to manage those address spaces on
hardware that offers memory-isolation capabilities similar to modern CPUs.
The GPU firmware also expects control over the placement of some
sections in memory, so it will not work until this capability is
available. Alice Ryhl is working on the Rust abstractions for
GPUVM as well as the io-pgtable abstractions
that are needed to manipulate the IOMMU page tables used to
enforce memory isolation. These are both based on the
previous work of
Asahi Lina, who pioneered the first Rust abstractions for the DRM
subsystem.
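As a rough picture of what managing a view of the address space involves, the following user-space sketch tracks non-overlapping virtual-address ranges in a sorted map. It is an illustration only: the AddressSpace type and its methods are hypothetical, it does not touch page tables, and it simply rejects overlapping requests, which is a simplification compared to what a real address-space manager has to handle.

```rust
use std::collections::BTreeMap;

/// Hypothetical toy model of a GPU virtual address space.
#[derive(Default)]
pub struct AddressSpace {
    /// Start address -> length in bytes; ranges never overlap.
    mappings: BTreeMap<u64, u64>,
}

impl AddressSpace {
    /// Map the range [start, start + len). Fails on empty, overflowing,
    /// or overlapping ranges.
    pub fn map(&mut self, start: u64, len: u64) -> Result<(), &'static str> {
        if len == 0 {
            return Err("empty range");
        }
        let end = start.checked_add(len).ok_or("range overflows")?;
        // Existing ranges are disjoint and sorted, so the last one that
        // starts below `end` is the only candidate for an overlap.
        if let Some((&s, &l)) = self.mappings.range(..end).next_back() {
            if s + l > start {
                return Err("range overlaps an existing mapping");
            }
        }
        self.mappings.insert(start, len);
        Ok(())
    }

    /// Remove the mapping that begins at `start`, returning its length.
    pub fn unmap(&mut self, start: u64) -> Option<u64> {
        self.mappings.remove(&start)
    }

    /// Find the (start, len) of the mapping containing `addr`, if any.
    pub fn lookup(&self, addr: u64) -> Option<(u64, u64)> {
        let (&s, &l) = self.mappings.range(..=addr).next_back()?;
        (addr < s + l).then_some((s, l))
    }
}
```

In this model, the firmware's placement requirement corresponds to calling map with a fixed start address before anything else claims that range.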
Another unsolved issue is DRM device initialization. The current code
requires an initializer for the driver's private data in order to return
a drm::Device
instance, but some drivers need the drm::Device to build
the private data in the first place, which leads to an impossible-to-satisfy
cycle of dependencies. This is also the
case for Tyr: allocating GPU memory through the GEM shmem API
requires a drm::Device, but some fields in Tyr's private
data need to store GEM objects — for example, to parse and boot the
firmware.
Lyude Paul is working on this by introducing a drm::DeviceCtx
that encodes the device state in the type system.
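The general idea of encoding device state in the type system can be sketched with the typestate pattern. The code below is a user-space illustration under stated assumptions, not the drm::DeviceCtx design: Device, Uninit, Registered, GemObject, PrivateData, and bring_up are all hypothetical names. The point is that allocation is available on a device that has no private data yet, which breaks the cycle, while access to the private data is only offered once the device has been registered.

```rust
/// Hypothetical state: the device exists but has no private data yet.
pub struct Uninit;

/// Hypothetical state: the device is registered and owns its private data.
pub struct Registered {
    data: PrivateData,
}

/// Stand-in for a GEM object; it only records its size.
pub struct GemObject {
    pub size: usize,
}

/// Stand-in for driver-private data that must own a GEM object.
pub struct PrivateData {
    pub firmware: GemObject,
}

/// Hypothetical device whose state is tracked by the type parameter `S`.
pub struct Device<S> {
    state: S,
}

impl Device<Uninit> {
    pub fn new() -> Self {
        Device { state: Uninit }
    }

    /// Allocation is already possible before the private data exists.
    pub fn alloc_gem(&self, size: usize) -> GemObject {
        GemObject { size }
    }

    /// Consume the uninitialized device and attach the private data.
    pub fn register(self, data: PrivateData) -> Device<Registered> {
        Device { state: Registered { data } }
    }
}

impl Device<Registered> {
    /// Only a registered device exposes the private data.
    pub fn firmware_size(&self) -> usize {
        self.state.data.firmware.size
    }
}

/// Build the private data using the device, then finish initialization.
pub fn bring_up(firmware_size: usize) -> Device<Registered> {
    let dev = Device::<Uninit>::new();
    let firmware = dev.alloc_gem(firmware_size);
    dev.register(PrivateData { firmware })
}
```

Calling firmware_size on a Device<Uninit> does not compile, so the ordering mistake is caught at build time rather than at run time.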
The situation remains the same as when the first Tyr patches were
submitted: most of the roadmap is blocked on GEM shmem,
GPUVM, io-pgtable and the device
initialization issue. There is room to integrate some work by the Nova team, as
well: the register! macro and bounded integers. Once we can handle those items,
we expect to be able to boot the GPU firmware quickly and then progress
unhindered until it is time to discuss job submission.
Another area needing consideration is the paths where the driver
makes forward progress on completing fences,
which are synchronization primitives that GPU drivers signal once jobs finish
executing. These paths must be carefully annotated or the system may
deadlock, and the driver must ensure that only safe locks are taken in the
signaling path. Additionally, DMA fences must
always signal in finite time, or someone elsewhere in the system may
block forever. In these paths, allocating memory using anything other than
GFP_ATOMIC must be disallowed, or the shrinker may kick in
under memory pressure and wait on the very job that triggered it. All of
this is covered in the documentation.
We conveniently ignore this in the prototype, meaning it can randomly
deadlock under memory pressure. Addressing this is straightforward: it
is just a matter of carefully vetting key parts of the driver. Doing so
elegantly, however, and perhaps in a way that takes advantage of Rust's type
system, is something that remains to be discussed.
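Purely as food for thought, one conceivable shape for such a type-system approach is a context token that decides which kind of allocation is possible. This is a speculative user-space sketch, not a proposal that has been discussed upstream: AllocMode, AllocCtx, ProcessCtx, SignalingCtx, and alloc are hypothetical names, and the toy allocator only records which mode would have been used.

```rust
/// Allocation behavior, loosely modeled on the kernel's GFP flags.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocMode {
    /// May sleep and trigger reclaim (think GFP_KERNEL).
    Blocking,
    /// Never sleeps (think GFP_ATOMIC).
    Atomic,
}

/// Hypothetical trait tying an execution context to an allocation mode.
pub trait AllocCtx {
    const MODE: AllocMode;
}

/// Hypothetical token for ordinary process context.
pub struct ProcessCtx {
    _private: (),
}

/// Hypothetical token that exists only inside the fence-signaling path.
/// The private field keeps other modules from constructing one directly.
pub struct SignalingCtx {
    _private: (),
}

impl AllocCtx for ProcessCtx {
    const MODE: AllocMode = AllocMode::Blocking;
}

impl AllocCtx for SignalingCtx {
    const MODE: AllocMode = AllocMode::Atomic;
}

impl ProcessCtx {
    pub fn new() -> Self {
        ProcessCtx { _private: () }
    }

    /// Run `f` as if it were the signaling path. The closure only
    /// receives a `SignalingCtx`, so it can only allocate atomically.
    pub fn enter_signaling<R>(&self, f: impl FnOnce(&SignalingCtx) -> R) -> R {
        f(&SignalingCtx { _private: () })
    }
}

/// Toy allocator: the mode is derived from the context type, so the
/// caller cannot ask for a blocking allocation in the wrong place.
pub fn alloc<C: AllocCtx>(_ctx: &C, len: usize) -> (Vec<u8>, AllocMode) {
    (vec![0; len], C::MODE)
}
```

Code that receives only a SignalingCtx has no way to name the blocking mode, which is the property the vetting described above would otherwise have to establish by hand.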
We have not touched upon what is next for Linux GPU drivers as a whole: reworking the job-submission logic in Rust. The current design assumes that drm_gpu_scheduler is used, but this has become a hindrance for some drivers in an age where GPU firmware can schedule jobs itself, and the scheduler has been plagued by hard-to-solve lifetime problems. Quite some time was spent at the X.Org Developer's Conference in 2025 discussing how to fix it.
The current consensus for Rust is to write a new component that
merely ensures that the dependencies for a given job are satisfied
before the job is eligible to be placed in the GPU's ring buffer, at
which point the firmware scheduler takes over. This seems to be where
GPU hardware is going, as most vendors have switched to
firmware-assisted scheduling in recent years. As this component will not
schedule jobs, it will probably be called JobQueue instead.
This correctly conveys the meaning of a queue into which new work is
deposited and from which it is removed once the dependencies are met and
a job is ready to run. Philipp Stanner has been spearheading this work.
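The dependency-gating behavior described above can be illustrated with a small user-space sketch. This is not the JobQueue being designed upstream: the types, the use of integer fence IDs, and the method names are assumptions made for the example, and it ignores ordering between jobs, error handling, and the hand-off to a ring buffer.

```rust
use std::collections::HashSet;

/// Hypothetical fence identifier; real fences are objects, not integers.
pub type FenceId = u64;

/// A job and the fences that must signal before it may run.
pub struct Job {
    pub id: u64,
    pub deps: HashSet<FenceId>,
}

/// Toy queue that holds jobs until their dependencies are met.
#[derive(Default)]
pub struct JobQueue {
    /// Jobs still waiting, in submission order.
    pending: Vec<Job>,
}

impl JobQueue {
    /// Deposit a new job along with the fences it depends on.
    pub fn push(&mut self, id: u64, deps: &[FenceId]) {
        let deps = deps.iter().copied().collect();
        self.pending.push(Job { id, deps });
    }

    /// Record that `fence` signaled, then return the IDs of all jobs
    /// that are now ready, in submission order.
    pub fn signal(&mut self, fence: FenceId) -> Vec<u64> {
        for job in &mut self.pending {
            job.deps.remove(&fence);
        }
        self.take_ready()
    }

    /// Remove and return every job with no outstanding dependencies.
    /// In a driver, this is where the job would go to the ring buffer.
    pub fn take_ready(&mut self) -> Vec<u64> {
        let mut ready = Vec::new();
        self.pending.retain(|job| {
            if job.deps.is_empty() {
                ready.push(job.id);
                false
            } else {
                true
            }
        });
        ready
    }
}
```

Note that nothing here decides which ready job runs first on the hardware; in the design described above, that decision belongs to the firmware scheduler.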
The plan is to also expose an API for C drivers using a technique I have described here in the past. This will possibly be the first Rust kernel component usable from C drivers, another milestone for Rust in the kernel, and a hallmark of seamless interoperability between C and Rust.
One way that Tyr can fit into this overall vision is by serving as a
testbed for the new design. If the old drm_gpu_scheduler
can be replaced with the JobQueue successfully in the
prototype, it will help attest to its suitability for other, more complex
drivers like Nova. Expect this discussion to continue for a while.
In all, Tyr has made a lot of progress this past year. Hopefully, it will continue to do so through 2026 and beyond.