Linux prepares eBPF to create task schedulers


We are a few days after the release of the stable version of Linux 6.10, a version that will include a series of quite interesting changes, as well as great improvements in terms of device support, functions and more.

We will cover that release in due course. This article is about the next expected version, "Linux 6.11", for which some changes have already been announced. I mention one of them here and, given enough time, I'd love to address the others in another post.

Now, to the point of the article: Linus Torvalds has announced his willingness to include in the Linux 6.11 kernel a set of patches that implement the “sched_ext” (SCX) mechanism.

This mechanism is intended to allow CPU schedulers to be written with eBPF and run within the Linux kernel. Here's a summary of how it will work:

  • eBPF and CPU schedulers: With eBPF, CPU schedulers can be dynamically loaded and executed within the Linux kernel. Just-In-Time (JIT) compilation translates the eBPF bytecode into machine instructions for execution.
  • SCHED_EXT class: This is a new scheduling class whose priority falls between the SCHED_IDLE and SCHED_NORMAL classes. BPF schedulers attached to SCHED_EXT can handle tasks that have a lower priority than real-time tasks, without affecting tasks already attached to the normal scheduler, SCHED_NORMAL.
  • Operation: BPF schedulers analyze the queues of tasks waiting to run and select which task to assign when a CPU core becomes free. If no BPF scheduler is active on SCHED_EXT, tasks are handled by the SCHED_NORMAL scheduler.
  • Benefits: The sched_ext mechanism makes it easier to experiment dynamically with different scheduling techniques and strategies. This allows you to quickly create working prototypes of schedulers and replace them on the fly in production environments. For example, a scheduler can be tuned to fit the specific characteristics of an application and can change its scheduling strategy based on system status and other factors.
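The behavior described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `ToyKernel`, `newest_first`, and every method name are hypothetical and are not kernel or sched_ext APIs, and real BPF schedulers are compiled to eBPF rather than written in Python. It models three points from the list: the class ordering, the fallback to SCHED_NORMAL when no custom scheduler is loaded, and swapping a policy at run time.

```python
from collections import deque

# Toy priority order, highest first. Per the article, SCHED_EXT ranks
# below SCHED_NORMAL and above SCHED_IDLE.
CLASS_ORDER = ("SCHED_NORMAL", "SCHED_EXT", "SCHED_IDLE")


def newest_first(queue):
    """Hypothetical custom policy: choose the most recently queued task."""
    return queue[-1]


class ToyKernel:
    """User-space model only; none of these names are kernel APIs."""

    def __init__(self):
        self.queues = {name: deque() for name in CLASS_ORDER}
        self.ext_policy = None  # stands in for a loaded BPF scheduler

    def load_scheduler(self, policy):
        self.ext_policy = policy

    def unload_scheduler(self):
        # Without a custom scheduler, waiting SCHED_EXT tasks fall back
        # to the normal class.
        self.ext_policy = None
        self.queues["SCHED_NORMAL"].extend(self.queues["SCHED_EXT"])
        self.queues["SCHED_EXT"].clear()

    def enqueue(self, task, sched_class):
        if sched_class == "SCHED_EXT" and self.ext_policy is None:
            sched_class = "SCHED_NORMAL"
        self.queues[sched_class].append(task)

    def cpu_freed(self):
        """Return the task to run next, or None if every queue is empty."""
        for name in CLASS_ORDER:
            queue = self.queues[name]
            if not queue:
                continue
            if name == "SCHED_EXT":
                task = self.ext_policy(queue)  # the custom policy decides
                queue.remove(task)
                return task
            return queue.popleft()  # built-in classes: plain FIFO here
        return None


kernel = ToyKernel()
kernel.enqueue("a", "SCHED_EXT")  # no scheduler loaded: handled as normal
kernel.load_scheduler(newest_first)
kernel.enqueue("b", "SCHED_EXT")
kernel.enqueue("c", "SCHED_EXT")
order = [kernel.cpu_freed() for _ in range(4)]
print(order)
```

In this model task "a" runs first because it was queued before any custom scheduler was loaded, and "c" then runs before "b" because the hypothetical `newest_first` policy is in charge of the SCHED_EXT queue. The FIFO behavior of the built-in classes is a simplification and does not reflect how the kernel's normal scheduler actually works.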

It is worth mentioning that “sched_ext” was first proposed to kernel developers in 2022, and six revisions of the patches followed. Although it is not yet part of the mainline kernel, several distributions, such as Ubuntu, Arch Linux, Fedora and NixOS, offer “sched_ext” through additional packages. Canonical is considering including “sched_ext” components in Ubuntu 24.10, and Valve is working on integrating it into the Steam Deck. At Meta, a scheduler based on “sched_ext” is already used in production infrastructure.

In addition, roughly a dozen schedulers based on “sched_ext” currently exist, each with task scheduling logic defined in user space and loaded into the kernel using BPF programs:

  1. scx_layered: A hybrid scheduler that divides tasks into layers, each with its own scheduling strategy. Allows you to assign certain tasks to specific layers with guaranteed CPU resources or increase the priority of individual applications. Developed by Meta, its userspace logic is written in Rust.
  2. scx_rustland: Optimized to prioritize interactive tasks over CPU-intensive ones. For example, it improves FPS in the Terraria game during concurrent kernel compilation compared to the standard EEVDF scheduler. Developed by a Canonical employee, with logic in Rust.
  3. scx_lavd: Implements the LAVD (Latency-criticality Aware Virtual Deadline) algorithm, which reduces latency in games and interactive tasks by taking into account how critical low latency is for each task and how much progress the process has made. Developed by Igalia and Valve, with logic in Rust.
  4. scx_rusty, scx_rlfifo, scx_mitosis: Schedulers that, respectively, balance task groups based on load, implement a simple FIFO scheduler, and bind task groups to CPU cores. All with Rust components.
  5. scx_central, scx_flatcg, scx_nest, scx_pair, scx_qmap, scx_simple, scx_userland: Example schedulers with C components, demonstrating the various capabilities of “sched_ext”.
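To make the idea of layers with their own strategies more concrete, here is a toy model in Python. It is not the code of scx_layered or of any scheduler listed above: `Task`, `Layer`, `LayeredScheduler`, the `runtime_ms` estimate, and the strict priority between layers are all assumptions made for illustration, and the sketch leaves out guaranteed CPU resources entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Task:
    name: str
    interactive: bool
    runtime_ms: int  # hypothetical estimate of the work still to do


def fifo(tasks: List[Task]) -> Task:
    """Oldest queued task first."""
    return tasks[0]


def shortest_first(tasks: List[Task]) -> Task:
    """Task with the least estimated work first."""
    return min(tasks, key=lambda t: t.runtime_ms)


@dataclass
class Layer:
    name: str
    matches: Callable[[Task], bool]     # which tasks belong to this layer
    pick: Callable[[List[Task]], Task]  # this layer's own strategy
    tasks: List[Task] = field(default_factory=list)


class LayeredScheduler:
    """Layers are listed from highest to lowest priority (an assumption)."""

    def __init__(self, layers: List[Layer]) -> None:
        self.layers = layers

    def enqueue(self, task: Task) -> str:
        # A task joins the first layer whose rule accepts it.
        for layer in self.layers:
            if layer.matches(task):
                layer.tasks.append(task)
                return layer.name
        raise ValueError(f"no layer accepts task {task.name!r}")

    def pick_next(self) -> Optional[Task]:
        # The first non-empty layer chooses, using its own strategy.
        for layer in self.layers:
            if layer.tasks:
                task = layer.pick(layer.tasks)
                layer.tasks.remove(task)
                return task
        return None


sched = LayeredScheduler([
    Layer("interactive", lambda t: t.interactive, shortest_first),
    Layer("batch", lambda t: True, fifo),
])
for t in (Task("compile", False, 5000), Task("game", True, 16),
          Task("editor", True, 4), Task("backup", False, 100)):
    sched.enqueue(t)

run_order = []
while (nxt := sched.pick_next()) is not None:
    run_order.append(nxt.name)
print(run_order)
```

In this model the interactive tasks run before the batch ones even though "compile" was queued first, which loosely mirrors the goal of favoring interactive work over CPU-intensive work. A real scheduler would also have to avoid starving the lower layer, which this sketch does not attempt.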

Finally, it's worth adding that Google is experimenting with using its own framework, ghOSt, to influence task scheduler decisions using BPF programs, and has begun migrating ghOSt to sched_ext. Additionally, Google is developing a port of “sched_ext” for ChromeOS.

Source: https://lkml.org