PythonBPF

Inspiration

We were tired of the verbosity of BCC (multiline strings of C code inside Python aren’t exactly pretty) and frustrated by the overhead of writing bpftrace programs. Learning a domain-specific language just to perform a narrow task never felt ideal.

We were also surprised that there was no truly pure-Python framework for writing eBPF programs for prototyping, nor a Python library that could load compiled eBPF object files. Everything serious was always in C or Rust.

We believe eBPF should be more approachable and extensible. Python offers unmatched ergonomics and data-processing capability, making it perfect for handling the large data volumes BPF programs generate. That became the inspiration behind this project.

What it does

Python-BPF is an LLVM IR generator for eBPF programs written entirely in Python. It uses llvmlite to generate LLVM IR and compiles that IR into BPF object files. Because the input language is a restricted subset of Python, programs are parsed with Python’s own parser and no custom parser is needed.

Below is an example:

from pythonbpf import bpf, map, section, bpfglobal, compile
from pythonbpf.helper import ktime
from pythonbpf.maps import HashMap
from ctypes import c_void_p, c_int32, c_uint64

@bpf
@map
def last() -> HashMap:
    return HashMap(key=c_uint64, value=c_uint64, max_entries=3)

@bpf
@section("blk_start_request")
def trace_start(ctx: c_void_p) -> c_int32:
    ts = ktime()
    print(f"req started {ts}")
    return 0

@bpf
@bpfglobal
def LICENSE() -> str:
    return "GPL"

compile()

Key Features

  • Support for many helpers (tracked in issue #63), kernel structs from vmlinux, user-defined structs, globals, and multiple programs in a single file.
  • Fully supported inside Python notebooks.
  • Pylibbpf integration:
    • Python bindings to libbpf.
    • Support for PerfEventArray, ring buffers, and other map types.
  • Strong developer experience: clean Python code, full LSP support, no mixed-language files, excellent compatibility with AI-assisted tooling.

How we built it

Python-BPF is built as a Python-first compiler frontend that translates a subset of Python into eBPF-compatible LLVM IR. The system relies entirely on Python’s standard ast module: decorated functions (@bpf, @map, @section, @bpfglobal) are extracted and treated as independent BPF compilation units. This design avoids custom parsing and keeps the input language predictable for the verifier.
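As a rough illustration of the extraction step, the sketch below uses only the standard ast module to find functions decorated with @bpf and read their @section string. The function names collect_bpf_units and decorator_name are hypothetical and are not taken from the Python-BPF codebase; the real frontend may be structured quite differently.

```python
import ast

# Example input. ast.parse only parses this text; nothing here is imported or run.
SOURCE = '''
from pythonbpf import bpf, section, bpfglobal

@bpf
@section("blk_start_request")
def trace_start(ctx):
    return 0

@bpf
@bpfglobal
def LICENSE() -> str:
    return "GPL"

def helper_not_compiled():
    return 1
'''


def decorator_name(node):
    """Return the bare name of a decorator such as @bpf or @section("...")."""
    target = node.func if isinstance(node, ast.Call) else node
    if isinstance(target, ast.Name):
        return target.id
    if isinstance(target, ast.Attribute):
        return target.attr
    return ""


def collect_bpf_units(source):
    """Map each @bpf-decorated top-level function to its decorators and section."""
    units = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        names = [decorator_name(d) for d in node.decorator_list]
        if "bpf" not in names:
            continue  # ordinary Python function, not a compilation unit
        section = None
        for d in node.decorator_list:
            if isinstance(d, ast.Call) and decorator_name(d) == "section" and d.args:
                arg = d.args[0]
                if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                    section = arg.value
        units[node.name] = {"decorators": names, "section": section, "node": node}
    return units


units = collect_bpf_units(SOURCE)
for name, info in units.items():
    print(name, info["decorators"], info["section"])
```

Each entry keeps its ast.FunctionDef node, so a later stage could lower every unit independently.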

A dedicated IR emitter walks the AST and generates LLVM IR using llvmlite. This stage enforces eBPF constraints: explicit integer widths, restricted pointer usage, deterministic stack allocation at function entry, and exceptions for different eBPF program types. Helper invocations are lowered to LLVM calls with precise argument coercion, ensuring compatibility with the eBPF backend.
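To make the "explicit integer widths" constraint concrete, here is a deliberately tiny sketch that lowers one annotated function to textual LLVM IR. It writes the IR string by hand instead of using llvmlite so that it stays dependency-free; the LLVM_TYPES table and the lower_function helper are assumptions made for this example, not the project's emitter, and only a function with an integer return type and a `return <int literal>` body is handled.

```python
import ast

# Assumed mapping from ctypes-style annotation names to LLVM IR types.
# LLVM integers carry a width but no signedness, so c_uint64 and c_int64
# both become i64.
LLVM_TYPES = {
    "c_int32": "i32",
    "c_int64": "i64",
    "c_uint64": "i64",
    "c_void_p": "ptr",
}


def llvm_type(annotation):
    """Translate a bare-name annotation such as c_int32 into an LLVM type."""
    if not isinstance(annotation, ast.Name) or annotation.id not in LLVM_TYPES:
        raise TypeError("every parameter and return value needs a known width")
    return LLVM_TYPES[annotation.id]


def lower_function(source, section):
    """Lower a function of the form `return <int literal>` to textual LLVM IR."""
    fn = ast.parse(source).body[0]
    if not isinstance(fn, ast.FunctionDef):
        raise TypeError("expected a function definition")
    ret_ty = llvm_type(fn.returns)
    params = ", ".join(f"{llvm_type(a.annotation)} %{a.arg}" for a in fn.args.args)
    stmt = fn.body[0]
    is_int_return = (
        isinstance(stmt, ast.Return)
        and isinstance(stmt.value, ast.Constant)
        and type(stmt.value.value) is int
    )
    if not is_int_return:
        raise NotImplementedError("this sketch only lowers `return <int literal>`")
    return (
        f'define {ret_ty} @{fn.name}({params}) section "{section}" {{\n'
        "entry:\n"
        f"  ret {ret_ty} {stmt.value.value}\n"
        "}\n"
    )


SOURCE = "def trace_start(ctx: c_void_p) -> c_int32:\n    return 0\n"
ir_text = lower_function(SOURCE, "blk_start_request")
print(ir_text)
```

Rejecting unannotated parameters up front mirrors the idea that every value must have a known width before any IR is emitted.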

The generated IR is then compiled through the standard LLVM eBPF backend (llc -march=bpf), which emits a .o file containing BPF bytecode. Reusing LLVM’s backend avoids re-implementing codegen logic.
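The llc step can be driven from Python with the standard subprocess module. The sketch below is an assumption about how such a wrapper might look, not the project's actual build code; the helper names and file paths are placeholders, and -filetype=obj is added so that llc writes an object file instead of assembly.

```python
import shutil
import subprocess


def llc_command(ir_path, obj_path):
    """Build the llc invocation that turns LLVM IR into a BPF object file."""
    return ["llc", "-march=bpf", "-filetype=obj", ir_path, "-o", obj_path]


def compile_ir(ir_path, obj_path):
    """Run llc if it is on PATH; return False when the tool is missing."""
    if shutil.which("llc") is None:
        return False
    # check=True raises CalledProcessError if llc rejects the IR.
    subprocess.run(llc_command(ir_path, obj_path), check=True)
    return True


print(" ".join(llc_command("prog.ll", "prog.o")))
```

Passing the command as a list avoids shell quoting issues with paths that contain spaces.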

For loading and runtime interaction, Python-BPF uses Python bindings to libbpf. Maps created in the IR are surfaced as Python objects, allowing lookups, updates, and event consumption using familiar Python semantics. This tight integration makes the entire workflow, from source to verified kernel program, completely Python-driven.

Challenges we ran into

Building a compiler for eBPF from pure Python turned out to be far more complex than expected. The biggest challenge was the lack of authoritative documentation for how LLVM IR should be structured to satisfy the BPF verifier. Most of our progress came from repeatedly generating IR, loading it, watching the verifier reject it, and reverse-engineering the patterns that worked.

We also ran into fundamental limitations of llvmlite, which exposes only a small portion of LLVM’s capabilities. Features we needed, such as proper debug metadata and certain IR constructs, were missing. Several parts of the IR pipeline are built on hacks because llvmlite simply doesn’t expose the primitives we require. We’ve opened issues upstream for many of these.

Another major difficulty was reliably generating verifier-safe stack usage. eBPF requires all stack allocations to be static and declared at function entry, but Python’s AST gives no explicit notion of lifetimes or scopes. We had to write our own prepass to detect every variable that might ever need stack storage, across every control-flow path.

Handling different program types (like XDP and soon sched-ext) also requires a growing set of special-case rules. XDP in particular has strict pointer provenance requirements and verifier exceptions that forced us to introduce custom IR lowering logic.

We also had to design a type system that lets the user think less about the fine-grained type of each variable and handles those details behind the scenes. This proved challenging, especially with vmlinux.h.
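The stack prepass mentioned above can be approximated with a short AST walk. This is only a sketch under simplifying assumptions: collect_stack_slots and entry_allocas are hypothetical names, every local is treated as a 64-bit integer, and nested functions, tuple unpacking subtleties, and type inference are ignored.

```python
import ast

SOURCE = '''
def trace(ctx):
    ts = ktime()
    if ts > 100:
        delta = ts - 100
    else:
        for i in range(4):
            acc = i
    return 0
'''


def collect_stack_slots(fn):
    """Return the sorted names of all locals assigned anywhere in fn."""
    params = {a.arg for a in fn.args.args}
    slots = set()
    # ast.walk visits every nested statement, so both branches and the loop
    # body are covered regardless of which path would run.
    for node in ast.walk(fn):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if node.id not in params:
                slots.add(node.id)
    return sorted(slots)


def entry_allocas(slots):
    """Emit one alloca line per slot; assumes every local is an i64."""
    return [f"  %{name} = alloca i64" for name in slots]


fn = ast.parse(SOURCE).body[0]
slots = collect_stack_slots(fn)
print(slots)
print("\n".join(entry_allocas(slots)))
```

Collecting names before emitting any code is what allows all allocations to be placed in the entry block, even for variables that first appear deep inside a branch.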

Accomplishments that we're proud of

  • Accepted to present PythonBPF at Linux Plumbers Conference 2025 in the eBPF track.
  • Presented early prototypes at the Innovations in Compiler Technology Conference 2025 at IISc Bangalore.

What we learned

  • We learned the hard way that compiler documentation for eBPF (especially the IR generation part) is not as readily available as it is for other languages. We had to reverse-engineer a lot of details by generating LLVM IR and experimenting with it.
  • Generating debug info was one of the hardest software engineering challenges we came across because of its poor support in llvmlite.
  • We also have daily fights with the verifier to make our IR work.
  • We found a bug in the debug info generation of the LLVM backend for eBPF and sent a small PR to fix it.
  • We have also opened multiple issues on llvmlite requesting the features we need to keep building PythonBPF (currently worked around with hacks).

What's next for Python-BPF

We’d like to onboard contributors as building a full compiler and ecosystem is too much for two people.
Planned improvements:

  • Full support for unions and bitfields present in kernel structs.
  • Extended helper coverage for sched-ext.
  • Robust XDP support, which requires implementing numerous verifier exception rules and custom variable-tracking logic.
  • Support for kfuncs and USDT probes.
  • Porting popular projects such as bpftop to PythonBPF once the required features are supported (bpf_iter in this case).

P.S.
You can run all the examples shown in the demo and in the images by cloning PythonBPF and looking in the BCC-Examples and examples directories. The "Try It Out" section of the README explains how to run them.
