Real-Time Linux in 2026: PREEMPT_RT Basics, Tuning, and How to Prove Latency

Practical guide to PREEMPT_RT on embedded Linux covering kernel configuration, IRQ threading, latency measurement with cyclictest, and tuning techniques for deterministic response times.

PREEMPT_RT transforms the Linux kernel into a viable platform for soft and firm real-time applications. After roughly two decades as an out-of-tree patch set, PREEMPT_RT was fully merged into mainline Linux with kernel 6.12. This guide covers what PREEMPT_RT actually does, how to configure and tune it, and—critically—how to prove that your system meets its latency requirements.

What PREEMPT_RT changes

Standard Linux is optimized for throughput. The kernel disables preemption during critical sections, holds spinlocks with interrupts disabled, and batches work for efficiency. This is excellent for servers but produces unpredictable worst-case latencies ranging from hundreds of microseconds to several milliseconds.

PREEMPT_RT makes the following changes:

  • Spinlocks become sleeping locks: Most kernel spinlocks are replaced with rt-mutexes that can sleep, allowing higher-priority threads to preempt lower-priority ones even inside kernel code paths.
  • Interrupt handlers run as threads: Hardware interrupt handlers are converted to kernel threads with configurable priorities. This means a high-priority real-time task can preempt interrupt handling.
  • Priority inheritance: rt-mutexes implement priority inheritance to prevent priority inversion—a low-priority task holding a lock needed by a high-priority task temporarily inherits the high priority.
  • High-resolution timers: Timer resolution is improved to the hardware timer's native resolution (typically 1 µs or better).

The net effect: worst-case scheduling latency drops from milliseconds to tens of microseconds on well-tuned systems.

Kernel configuration

Start with your target platform's defconfig and add:

CONFIG_PREEMPT_RT=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_NO_HZ_FULL=y
# CONFIG_CPU_FREQ is not set    (or keep it and use the performance governor)
# CONFIG_CPU_IDLE is not set    (or configure shallow idle states)

Disabling CPU frequency scaling and deep idle states eliminates two major sources of latency jitter. If power consumption matters, use the performance cpufreq governor instead of disabling it entirely, and limit C-states to C1.

IRQ thread priorities

With PREEMPT_RT, each hardware interrupt gets a kernel thread. You can see them with ps:

ps -eo pid,cls,rtprio,comm | grep -i irq

Set priorities based on your application's requirements. A typical pattern:

Priority  Thread                 Rationale
--------  ---------------------  -------------------------------
99        Watchdog               Must never be starved
90        Real-time application  Primary real-time task
80        Timer IRQ              Drives scheduling decisions
50        Network IRQ            Important but not time-critical
30        Storage IRQ            Background I/O

Use chrt to set thread priorities:

chrt -f -p 80 $(pgrep irq/27-timer)

Or configure them in udev rules or systemd unit files for persistence.

Measuring latency with cyclictest

cyclictest is the standard tool for measuring scheduling latency on PREEMPT_RT systems. It creates high-priority threads that sleep for a defined interval and measures how long they actually slept (the difference is the scheduling latency).

# Basic latency test (10 minutes, 1000 µs interval, RT priority 90)
cyclictest -m -Sp90 -i1000 -h400 -q -D 10m

# Output includes:
# T: 0 Min: 3 Act: 8 Avg: 6 Max: 42

The number that matters is Max. The maximum latency over the test duration represents the worst case observed. For a meaningful measurement, run cyclictest for at least 24 hours under load.

Stress during measurement

Latency measured on an idle system is meaningless. Run stress workloads simultaneously:

# CPU stress
stress-ng --cpu $(nproc) --timeout 86400 &

# I/O stress
stress-ng --hdd 4 --timeout 86400 &

# Network stress (if applicable)
iperf3 -s &  # on target
iperf3 -c target-ip -t 86400 &  # from host

# Memory pressure
stress-ng --vm 2 --vm-bytes 80% --timeout 86400 &

Run cyclictest with all stressors active. The maximum latency under combined stress is your actual worst-case bound.

Common latency sources and fixes

CPU frequency scaling

Symptom: Periodic latency spikes of 100–500 µs
Fix: Set the performance governor or disable cpufreq entirely

echo performance | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

SMI (System Management Interrupts)

Symptom: Unpredictable latency spikes of 50–300 µs that appear even with PREEMPT_RT
Fix: SMIs are handled by the BIOS/firmware and are invisible to the kernel. On x86, disable them in BIOS if possible. On ARM SoCs, SMIs are less common but TrustZone calls can have similar effects.

USB and GPU drivers

Symptom: Latency spikes correlated with USB activity or display updates
Fix: Move USB and GPU IRQ threads to low priorities. Consider disabling USB autosuspend.

Memory allocation

Symptom: Occasional spikes of several milliseconds
Fix: Pre-allocate all memory before entering the real-time loop. Use mlockall(MCL_CURRENT | MCL_FUTURE) to prevent page faults. Avoid malloc in the real-time path.

#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    /* Lock current and future pages into RAM so the RT path never
     * takes a page fault. Check the result: this can fail due to
     * RLIMIT_MEMLOCK or missing privileges. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return 1;
    }
    /* Pre-allocate and touch all buffers here, then enter the
     * real-time loop. */
    return 0;
}

CPU isolation

For the tightest latency bounds, isolate one or more CPU cores from the kernel's general-purpose scheduler:

# Kernel command line
isolcpus=2,3 nohz_full=2,3 rcu_nocbs=2,3

This removes CPUs 2 and 3 from the general scheduler. Pin your real-time application to these cores using taskset or pthread_setaffinity_np. The isolated cores handle only your real-time threads and their explicitly assigned IRQs.

How to prove latency

Measuring worst-case latency is necessary but not sufficient for safety-critical systems. To "prove" latency:

  1. Define the requirement: "The control loop must complete within 100 µs of the trigger event, 100% of the time"
  2. Measure under worst-case conditions: Run cyclictest for 72+ hours with all stress workloads active
  3. Add margin: If your measured worst case is 42 µs, your proven bound is perhaps 60 µs (with engineering margin). Do not claim 42 µs as the guarantee.
  4. Document the test conditions: Hardware, kernel version, RT config, stress workloads, test duration, ambient temperature
  5. Repeat after any kernel or hardware change

For safety-certified systems (IEC 62304, DO-178C), formal verification of the kernel's worst-case execution time may be required. PREEMPT_RT Linux can support SIL 2 with appropriate evidence packages.

When PREEMPT_RT is not enough

If your application requires sub-10 µs guaranteed latency or hard real-time certification, consider:

  • A dedicated RTOS co-processor (see an RTOS vs. embedded Linux comparison)
  • Xenomai or RTAI for hard real-time with Linux as a secondary OS
  • FPGA-based control loops for sub-microsecond determinism

PREEMPT_RT Linux is excellent for latency requirements in the 20–200 µs range under well-controlled conditions.

Integration with ProteanOS

ProteanOS kernel configurations include PREEMPT_RT support for applicable targets. The installation guide covers building RT-enabled kernels, and the board bring-up documentation includes real-time validation as an optional milestone for latency-sensitive platforms.

Summary

PREEMPT_RT in 2026 is a mature, mainline technology. Configure it, isolate cores, lock memory, measure under stress, and document your results. The gap between PREEMPT_RT and a dedicated RTOS has narrowed significantly, making Linux viable for an increasingly wide range of real-time applications.