stable-mir

For a video version of these notes, see https://www.youtube.com/watch?v=lfi2pCOaGGk&t=927s

Disclaimer

MIR - Rust’s mid-level IR

Simplified, control-flow-oriented representation.
Closer to machine code than HIR.
- This is where borrow checking, optimizations such as ConstProp, CopyProp, and DSE (dead store elimination), and monomorphization happen.

Right now, MIR is lowered to LLVM IR
- or to CLIF IR if we’re using the Cranelift backend
But what if we could intercept MIR and do cool stuff with it?
- advanced analyses such as formal verification (the Kani team at AWS is driving this)
- support for new hardware with different program execution models, i.e. writing regular Rust that runs on accelerators (TPU, GPU, etc.)

stable-mir

That’s where stable-mir comes in
MIR is rustc’s internal IR, i.e. it is not meant to be stable and can (more like will) change between compiler versions.

“The goal of the Stable MIR project is to provide a stable interface to the Rust compiler that allow tool developers to develop sophisticated analysis with a reduced maintenance cost without compromising the compiler development speed.”

stable-mir design

Two crates were added to the Rust compiler:

stable_mir has been renamed to rustc_public
rustc_smir has been renamed to rustc_public_bridge
rustc_public is the user-facing public API. There’s a proposal to have two versions of it:
- One is to be published on crates.io and will be the base of any minor update. This crate will be compatible with multiple versions of the compiler, using conditional compilation based on the compiler version.
- The other will be developed as part of rustc and kept up to date with the compiler. It will serve as the basis for the next major release of rustc_public and has no compatibility or stability guarantees.
rustc_public_bridge is developed as part of the rustc library and implements the interface between the public APIs and the compiler’s internal APIs.

rustc_public impl

1. Driver Integration via Macros

#[macro_export]
macro_rules! run {
    ($args:expr, $callback_fn:ident) => {
        $crate::run_driver!($args, || $callback_fn())
    };
}

The run! macro creates a Callbacks implementation that hooks into rustc's compilation pipeline at the after_analysis phase - after MIR generation but before codegen.

cd demo && cargo expand main 2>&1

The expansion shows that run!(&rustc_args, start_demo) expands to the following:

Defines a RustcPublic struct - Holds the callback and result
Implements Callbacks trait - Hooks into rustc's compilation pipeline via after_analysis
Calls run_compiler - Invokes rustc with the provided arguments
Executes your callback - Runs start_demo() after analysis is complete
Returns the result - Wrapped in Result<C, CompilerError<B>>

The macro instantiates the struct and runs the driver at the end:

RustcPublic::new(|| start_demo()).run(&rustc_args)

This creates the driver, passes the callback, and runs the compiler with the arguments.

2. Thread-Local Context Management

scoped_tls::scoped_thread_local!(static TLV: Cell<*const ()>);

pub(crate) fn run<F, T>(interface: &dyn CompilerInterface, f: F) -> Result<T, Error>
where
    F: FnOnce() -> T,
{
    if TLV.is_set() {
        Err(Error::from("rustc_public already running"))
    } else {
        let ptr: *const () = (&raw const interface) as _;
        TLV.set(&Cell::new(ptr), || Ok(f()))
    }
}

Uses thread-local storage to maintain compiler context during analysis, preventing nested invocations.
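The guard against nested invocations can be sketched with the standard library alone. This is a minimal model, not the rustc_public code: it uses std's thread_local! in place of scoped_tls, and run_scoped, with_name, CONTEXT and Reset are hypothetical names chosen for illustration.

```rust
use std::cell::Cell;
use std::ptr;

thread_local! {
    // Null means "no context installed". The real code uses scoped_tls instead.
    static CONTEXT: Cell<*const ()> = Cell::new(ptr::null());
}

/// Restores the previous pointer when the scope ends, even on panic.
struct Reset(*const ());

impl Drop for Reset {
    fn drop(&mut self) {
        CONTEXT.with(|c| c.set(self.0));
    }
}

/// Installs `ctx` for the duration of `f` and refuses to nest.
fn run_scoped<T>(ctx: &String, f: impl FnOnce() -> T) -> Result<T, String> {
    let previous = CONTEXT.with(|c| c.get());
    if !previous.is_null() {
        return Err("context already running".to_string());
    }
    CONTEXT.with(|c| c.set(ctx as *const String as *const ()));
    let _reset = Reset(previous);
    Ok(f())
}

/// Reads the installed context. This is sound only because `run_scoped`
/// keeps `ctx` borrowed for the whole scope and always installs a `String`.
fn with_name<R>(f: impl FnOnce(&str) -> R) -> R {
    CONTEXT.with(|c| {
        let ptr = c.get();
        assert!(!ptr.is_null(), "no context installed");
        let name: &String = unsafe { &*(ptr as *const String) };
        f(name.as_str())
    })
}
```

Calling run_scoped inside another run_scoped returns the error instead of overwriting the pointer, which mirrors the TLV.is_set() check above.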

3. Stable/Unstable Translation Bridge

pub fn run<F, T>(tcx: TyCtxt<'_>, f: F) -> Result<T, Error>
where
    F: FnOnce() -> T,
{
    let compiler_cx = RefCell::new(CompilerCtxt::new(tcx));
    let container = Container { tables: RefCell::new(Tables::default()), cx: compiler_cx };

    crate::compiler_interface::run(&container, || init(&container, f))
}

The bridge maintains:

Tables: Map between stable IDs and internal rustc representations
CompilerCtxt: Wrapper around TyCtxt for safe access to compiler internals
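To make the role of the tables concrete, here is a small sketch of a two-way map between internal values and stable handles. It is an assumption-laden toy: InternalDefId and StableDefId are hypothetical stand-ins, and the real Tables type in rustc_public_bridge is generic and tracks many kinds of items.

```rust
use std::collections::HashMap;

/// Stand-in for a compiler-internal identifier (hypothetical).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct InternalDefId(String);

/// Stand-in for the opaque handle handed to tools (hypothetical).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct StableDefId(usize);

#[derive(Default)]
struct Tables {
    by_index: Vec<InternalDefId>,
    by_value: HashMap<InternalDefId, StableDefId>,
}

impl Tables {
    /// Internal to stable: reuse the handle if this value was seen before.
    fn stable(&mut self, internal: &InternalDefId) -> StableDefId {
        if let Some(id) = self.by_value.get(internal) {
            return *id;
        }
        let id = StableDefId(self.by_index.len());
        self.by_index.push(internal.clone());
        self.by_value.insert(internal.clone(), id);
        id
    }

    /// Stable to internal: the handle is just an index into the table.
    fn internal(&self, id: StableDefId) -> &InternalDefId {
        &self.by_index[id.0]
    }
}
```

Because the stable handle is only an index, it carries no compiler lifetime and stays meaningful for as long as the table that issued it is alive.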

4. Visitor Pattern for MIR Analysis

cd rustc_public
cargo expand mir::visit
//! For every mir item, the trait has a `visit_<item>` and a `super_<item>` method.
//! - `visit_<item>`, by default, calls `super_<item>`
//! - `super_<item>`, by default, destructures the `<item>` and calls `visit_<sub_item>`

Provides a structured way to traverse and analyze MIR, similar to rustc's internal visitors.
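The visit_/super_ split is easiest to see on a toy IR. The sketch below assumes a made-up three-type MIR (Body, Statement, Operand) rather than the rustc_public types, and MoveCounter and count_moves are illustrative names. Overriding one visit_ method is enough to hook a single kind of item while the default super_ methods keep the traversal going.

```rust
/// Toy stand-ins for MIR items. These are not the rustc_public types.
enum Operand {
    Copy(usize),
    Move(usize),
    Constant(i64),
}

enum Statement {
    Assign { dest: usize, value: Operand },
    Nop,
}

struct Body {
    statements: Vec<Statement>,
}

trait Visitor {
    // `visit_<item>` defaults to calling `super_<item>`.
    fn visit_body(&mut self, body: &Body) {
        self.super_body(body)
    }
    fn visit_statement(&mut self, stmt: &Statement) {
        self.super_statement(stmt)
    }
    fn visit_operand(&mut self, _operand: &Operand) {}

    // `super_<item>` destructures the item and visits its sub-items.
    fn super_body(&mut self, body: &Body) {
        for stmt in &body.statements {
            self.visit_statement(stmt);
        }
    }
    fn super_statement(&mut self, stmt: &Statement) {
        match stmt {
            Statement::Assign { value, .. } => self.visit_operand(value),
            Statement::Nop => {}
        }
    }
}

/// Counts `move` operands by overriding a single hook.
#[derive(Default)]
struct MoveCounter {
    moves: usize,
}

impl Visitor for MoveCounter {
    fn visit_operand(&mut self, operand: &Operand) {
        if let Operand::Move(_) = operand {
            self.moves += 1;
        }
    }
}

fn count_moves(body: &Body) -> usize {
    let mut counter = MoveCounter::default();
    counter.visit_body(body);
    counter.moves
}
```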

Callback Execution Flow

Compilation Phase: rustc compiles the target crate and generates MIR
Hook Activation: after_analysis callback is triggered
Context Setup: Bridge establishes stable/unstable translation tables
User Callback: Your analysis function runs with access to stable APIs
Cleanup: Context is torn down, compilation continues or stops

This design ensures that external tools get a stable, safe interface to rustc's powerful analysis capabilities without directly depending on unstable rustc internals.

Internals:

What does MIR look like

Types in the MIR

Types appear after the colon (:) in variable declarations and expressions:

() - unit type (the return type of main)
i32 - 32-bit signed integer
(i32, bool) - tuple type for overflow checking results
&i32 - immutable reference to i32
(&i32,) - single-element tuple containing a reference
std::fmt::Arguments<'_> - formatting arguments with lifetime
[core::fmt::rt::Argument<'_>; 1] - array of 1 element
&[&str; 2] - reference to array of 2 string slices
&[core::fmt::rt::Argument<'_>; 1] - reference to array

All locals (_0 through _12) have explicit types declared[3].

Operations in the MIR

Operations are the computational actions performed, categorized as Statements and Terminators:

Statements (within basic blocks)

Assignments: _2 = 42_i32; - assigns constant to local
_3 = CheckedAdd(_2, 1_i32); - CheckedAdd operation that returns (result, overflow_flag) tuple
_1 = move (_3.0: i32); - Move operation extracting tuple field
_7 = &_1; - Borrow operation creating reference
_6 = (move _7); - Aggregate operation constructing tuple
_12 = CopyForDeref((_6.0: &i32)); - CopyForDeref operation for tuple field access
_8 = [move _9]; - Array aggregate construction

Terminators (end basic blocks with control flow)

assert(!move (_3.1: bool), ...) - Assert terminator checking overflow flag with success/unwind branches[5]
_9 = core::fmt::rt::Argument::<'_>::new_display::<i32>(_12) -> [return: bb2, unwind unreachable]; - Call terminator with return destination
return; - Return terminator ending function execution
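The CheckedAdd statement and the assert terminator together implement the overflow check on x + 1. The same three steps can be written in ordinary Rust; add_one_like_mir is a hypothetical helper, and the comments quote the MIR lines listed above.

```rust
/// Mirrors the checked `x + 1` sequence from the MIR listing.
fn add_one_like_mir(x: i32) -> i32 {
    // _3 = CheckedAdd(_2, 1_i32): produce the (result, overflow_flag) pair.
    let checked: (i32, bool) = x.overflowing_add(1);
    // assert(!move (_3.1: bool), ...): stop if the overflow flag is set.
    assert!(!checked.1, "attempt to add with overflow");
    // _1 = move (_3.0: i32): keep only the result field.
    checked.0
}
```

With x = 42 the flag is false and the function returns 43. With i32::MAX the flag is true and the assertion fails, which corresponds to the unwind edge of the terminator.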

Rvalues (right-hand side expressions)

Constants: 42_i32, 1_i32
Binary operations: CheckedAdd (other examples would include Sub, Mul, etc.)
References: &_1
Aggregates: tuples (move _7), arrays [move _9]
Projections: (_3.0: i32), (_3.1: bool), (_6.0: &i32) - tuple field accesses

Attributes and Metadata

These provide additional context but don't execute operations:

Debug Information

debug x => 42_i32;
debug y => _1;
debug args => _6;
debug args => _8;

These map source-level variable names (x, y, args) to MIR locals or values, enabling debuggers to show meaningful variable names[3][1].

Source Information

{alloc4<imm>: &[&str; 2]} - allocation with immutability attribute
Type annotations: (_3.0: i32) includes type information for clarity
Unwind attributes: unwind unreachable indicates panic is not expected to be caught

Control Flow Annotations

[success: bb1, unwind unreachable] - branch targets for assert
[return: bb2, unwind unreachable] - call return destinations

Structure Summary

Basic Blocks (bb0 through bb4)

Each basic block is a region containing:

Zero or more statements (operations without control flow)
Exactly one terminator (control flow operation)
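A rough model of that shape, assuming a toy Terminator enum with only the non-unwinding edges (the names successors and reachable are illustrative, not rustc_public APIs):

```rust
/// Toy terminators, keeping only the normal (non-unwind) edge of each.
enum Terminator {
    Goto { target: usize },
    Assert { success: usize },
    Call { return_to: usize },
    Return,
}

/// Zero or more statements followed by exactly one terminator.
struct BasicBlock {
    statements: Vec<&'static str>,
    terminator: Terminator,
}

/// Blocks that control can flow to when this block finishes.
fn successors(block: &BasicBlock) -> Vec<usize> {
    match block.terminator {
        Terminator::Goto { target } => vec![target],
        Terminator::Assert { success } => vec![success],
        Terminator::Call { return_to } => vec![return_to],
        Terminator::Return => vec![],
    }
}

/// Block indices reachable from bb0, in the order they are first visited.
fn reachable(blocks: &[BasicBlock]) -> Vec<usize> {
    let mut seen = vec![false; blocks.len()];
    let mut order = Vec::new();
    let mut worklist = vec![0];
    while let Some(bb) = worklist.pop() {
        if bb >= blocks.len() || seen[bb] {
            continue;
        }
        seen[bb] = true;
        order.push(bb);
        worklist.extend(successors(&blocks[bb]));
    }
    order
}
```

Only terminators create edges, so the control-flow graph can be walked without looking at the statements at all.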

Locals (_0 through _12)

Variable declarations at the top serve as SSA-like values, though MIR technically allows reassignment (more like registers than pure SSA)[3].

The demo example in rustc_public

When the run! macro is called, it triggers a chain of function calls that sets up the Rust compiler, runs analysis, and executes our callback with access to compiler internals.

1. Entry Point: The `run!` Macro

#[macro_export]
macro_rules! run {
    ($args:expr, $callback_fn:ident) => {
        $crate::run_driver!($args, || $callback_fn())
    };
    ($args:expr, $callback:expr) => {
        $crate::run_driver!($args, $callback)
    };
}

What it does: Simply delegates to the run_driver! macro, wrapping bare function identifiers in closures.

2. The `run_driver!` Macro - Core Driver Setup

macro_rules! run_driver {
    ($args:expr, $callback:expr $(, $with_tcx:ident)?) => {{
        pub struct RustcPublic<B = (), C = (), F = fn(...) -> ControlFlow<B, C>>
        where
            B: Send,
            C: Send,
            F: FnOnce(...) -> ControlFlow<B, C> + Send,
        {
            callback: Option<F>,
            result: Option<ControlFlow<B, C>>,
        }
        ...

Key Type: `RustcPublic<B, C, F>`

Type Parameters:

B: Break value type (when callback returns ControlFlow::Break(B))
C: Continue value type (when callback returns ControlFlow::Continue(C))
F: The callback function type

Fields:

callback: Option<F> - Stores the user's callback (taken once during execution)
result: Option<ControlFlow<B, C>> - Stores the callback's return value
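The two Option fields implement a "run once, collect later" pattern. A simplified sketch, assuming a callback with no arguments and a hypothetical DriverError enum standing in for CompilerError<B> (its variant names are illustrative):

```rust
use std::ops::ControlFlow;

/// Illustrative error type standing in for `CompilerError<B>`.
#[derive(Debug, PartialEq)]
enum DriverError<B> {
    Interrupted(B),
    Skipped,
}

struct Driver<B, C, F>
where
    F: FnOnce() -> ControlFlow<B, C>,
{
    callback: Option<F>,
    result: Option<ControlFlow<B, C>>,
}

impl<B, C, F> Driver<B, C, F>
where
    F: FnOnce() -> ControlFlow<B, C>,
{
    fn new(callback: F) -> Self {
        Driver { callback: Some(callback), result: None }
    }

    /// Stand-in for the compiler hook: `take()` ensures the `FnOnce`
    /// callback runs at most once, even if the hook fires again.
    fn after_analysis(&mut self) {
        if let Some(callback) = self.callback.take() {
            self.result = Some(callback());
        }
    }

    /// Converts the stored `ControlFlow` into a `Result`.
    fn finish(self) -> Result<C, DriverError<B>> {
        match self.result {
            Some(ControlFlow::Continue(value)) => Ok(value),
            Some(ControlFlow::Break(value)) => Err(DriverError::Interrupted(value)),
            None => Err(DriverError::Skipped),
        }
    }
}
```

Storing the callback as Option<F> is what lets an FnOnce be called from a &mut self method: take() moves it out and leaves None behind.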

3. RustcPublic::run() Method

pub fn run(&mut self, args: &[String]) -> Result<C, CompilerError<B>> {
    let compiler_result = rustc_driver::catch_fatal_errors(|| -> interface::Result::<()> {
        run_compiler(&args, self);
        Ok(())
    });
    ...
}

What it does:

Calls rustc_driver::run_compiler() (from the actual Rust compiler)
Passes self (which implements the Callbacks trait)
The compiler will call back into after_analysis() at the right time

4. The Callbacks Trait Implementation

impl<B, C, F> Callbacks for RustcPublic<B, C, F>
where
    B: Send,
    C: Send,
    F: FnOnce(...) -> ControlFlow<B, C> + Send,
{
    fn after_analysis<'tcx>(
        &mut self,
        _compiler: &interface::Compiler,
        tcx: TyCtxt<'tcx>,
    ) -> Compilation {
        if let Some(callback) = self.callback.take() {
            rustc_internal::run(tcx, || {
                self.result = Some(callback(...));
            })
            .unwrap();
            ...
        }
    }
}

What it does:

This is called by rustc after type checking and analysis but before code generation
Receives TyCtxt<'tcx> - the compiler's type context with lifetime 'tcx
Calls rustc_internal::run() to set up the bridge

5. rustc_internal::run() - Bridge Setup

pub fn run<F, T>(tcx: TyCtxt<'_>, f: F) -> Result<T, Error>
where
    F: FnOnce() -> T,
{
    let compiler_cx = RefCell::new(CompilerCtxt::new(tcx));
    let container = Container { 
        tables: RefCell::new(Tables::default()), 
        cx: compiler_cx 
    };

    crate::compiler_interface::run(&container, || init(&container, f))
}

Key Types Created Here:

CompilerCtxt<'tcx> (from rustc_public_bridge)

Wraps the TyCtxt<'tcx> from rustc
Provides methods to query compiler information
Lifetime 'tcx ties it to the compiler's type context

Tables<'tcx, B: Bridge> (from rustc_public_bridge)

Bidirectional mapping between rustc internal types and stable API types
Caches conversions to avoid redundant work
Generic over B: Bridge trait

Container<'tcx, B: Bridge> (from rustc_public_bridge)

pub struct Container<'tcx, B: Bridge> {
    pub tables: RefCell<Tables<'tcx, B>>,
    pub cx: RefCell<CompilerCtxt<'tcx, B>>,
}

Why RefCell?

Allows interior mutability
Multiple parts of code need mutable access to tables/context
Checked at runtime (will panic if borrowed incorrectly)
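The runtime check is easy to observe on a cut-down container. In this sketch the field types are simplified to Vec<String> and String, and record and conflicting_borrow_is_rejected are hypothetical helpers; try_borrow_mut is used so the conflict shows up as an Err rather than a panic.

```rust
use std::cell::RefCell;

/// Simplified container: same RefCell layout, toy payloads.
struct Container {
    tables: RefCell<Vec<String>>,
    cx: RefCell<String>,
}

/// Mutates `tables` through a shared reference, like `with_container` does.
fn record(container: &Container, item: &str) -> usize {
    let mut tables = container.tables.borrow_mut();
    let cx = container.cx.borrow();
    tables.push(format!("{}::{}", cx, item));
    tables.len()
}

/// A second mutable borrow while the first is alive is refused at runtime.
fn conflicting_borrow_is_rejected(container: &Container) -> bool {
    let _held = container.tables.borrow_mut();
    container.tables.try_borrow_mut().is_err()
}
```

Plain borrow_mut() in the second position would panic instead, which is the failure mode the list above refers to.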

6. Two nested thread-local scopes:

demo/src/main.rs
  main()
    └─► run!(&rustc_args, start_demo)                     [macro expands]
         └─► run_driver!(...)                             [creates RustcPublic callback wrapper]
              └─► rustc_driver::run_compiler()            [rustc compiles & analyzes code]
                   └─► after_analysis(tcx)                [callback hook after analysis]
                        └─► rustc_internal::run(tcx, || callback())
                             │
                             ├─ Creates: Container { tables, compiler_cx }
                             │
                             └─► compiler_interface::run(&container, || init(&container, f))
                                  │                                         │
                                  ├─ OUTER: Sets CompilerInterface TLV      │
                                  │                                         │
                                  └─────────────────────────────────────────┤
                                                                            │
                                                                            ├─ INNER: Sets Container TLV
                                                                            │
                                                                            └─► f() → start_demo()

What Happens at rustc_internal::run

pub fn run<F, T>(tcx: TyCtxt<'_>, f: F) -> Result<T, Error> {
    let compiler_cx = RefCell::new(CompilerCtxt::new(tcx));
    let container = Container { tables: RefCell::new(Tables::default()), cx: compiler_cx };
    
    crate::compiler_interface::run(&container, || init(&container, f))
    //                              ^^^^^^^^^^      ^^^^^^^^^^^^^^^^^^^^
    //                              OUTER SCOPE     INNER SCOPE
}

Two nested thread-local scopes:

OUTER: compiler_interface::run(&container, ...)
- Sets TLV = pointer to CompilerInterface
- Enables high-level API queries
INNER: init(&container, f)
- Sets TLV = pointer to Container (tables + compiler context)
- Enables translation between stable ↔ internal types
Finally: User Callback Executes
- start_demo() runs with both thread-locals set
- Can call rustc_public::local_crate(), all_local_items(), etc.
- These APIs use the thread-locals to access compiler state

The two-layer thread-local setup happens in this single line:

compiler_interface::run(&container, || init(&container, f))
//                                      ^^^^^^^^^^^^^^^^^^^^
//                                      Inner scope wraps user callback

Both scopes need the same &container, but they set different thread-local variables to make different parts of the system work!

7. Accessing the Context: `with_container()` and `with()`

`rustc_internal::with_container()`

pub(crate) fn with_container<R, B: Bridge>(
    f: impl for<'tcx> FnOnce(&mut Tables<'tcx, B>, &CompilerCtxt<'tcx, B>) -> R,
) -> R {
    assert!(TLV.is_set());
    TLV.with(|tlv| {
        let ptr = tlv.get();
        assert!(!ptr.is_null());
        let container = ptr as *const Container<'_, B>;
        let mut tables = unsafe { (*container).tables.borrow_mut() };
        let cx = unsafe { (*container).cx.borrow() };
        f(&mut *tables, &*cx)
    })
}

What it does:

Retrieves the Container from thread-local storage
Borrows tables mutably and cx immutably
Calls the provided closure with both

`compiler_interface::with()`

pub(crate) fn with<R>(f: impl FnOnce(&dyn CompilerInterface) -> R) -> R {
    assert!(TLV.is_set());
    TLV.with(|tlv| {
        let ptr = tlv.get();
        assert!(!ptr.is_null());
        f(unsafe { *(ptr as *const &dyn CompilerInterface) })
    })
}

What it does:

Retrieves the CompilerInterface trait object from thread-local storage
Calls the provided closure with it
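The cast to *const &dyn CompilerInterface deserves a note. The TLV holds a thin *const (), but a &dyn reference is a fat pointer (data plus vtable), so the code stores the address of the reference itself and dereferences it once on the way out. A standalone sketch of that round trip, assuming a one-method trait and std's thread_local! in place of scoped_tls:

```rust
use std::cell::Cell;
use std::mem::size_of;

trait CompilerInterface {
    fn crate_name(&self) -> String;
}

struct Container {
    name: String,
}

impl CompilerInterface for Container {
    fn crate_name(&self) -> String {
        self.name.clone()
    }
}

thread_local! {
    static TLV: Cell<*const ()> = Cell::new(std::ptr::null());
}

fn run<T>(interface: &dyn CompilerInterface, f: impl FnOnce() -> T) -> T {
    // The address of the fat reference is a thin pointer, so it fits in the TLV.
    let ptr = &interface as *const &dyn CompilerInterface as *const ();
    TLV.with(|tlv| tlv.set(ptr));
    let out = f();
    // Unlike scoped_tls, this sketch does not reset the slot if `f` panics.
    TLV.with(|tlv| tlv.set(std::ptr::null()));
    out
}

fn with<R>(f: impl FnOnce(&dyn CompilerInterface) -> R) -> R {
    TLV.with(|tlv| {
        let ptr = tlv.get();
        assert!(!ptr.is_null());
        // Read the fat reference back out of the stack slot it was stored in.
        f(unsafe { *(ptr as *const &dyn CompilerInterface) })
    })
}
```

On current targets size_of::<&dyn CompilerInterface>() is larger than size_of::<*const ()>(), which is why the reference cannot be stored in the cell directly.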

Summary: Two Different TLVs, Two Different Access Patterns

| Function | TLV Used | What It Accesses | When Called |
| --- | --- | --- | --- |
| `with` | OUTER (compiler_interface) | `&dyn CompilerInterface` (Container) | High-level API calls like `local_crate()`, `all_local_items()` |
| `with_container` | INNER (rustc_internal) | `Tables` + `CompilerCtxt` | Type conversions between stable ↔ internal |

The Flow:

start_demo()
  │
  ├─► rustc_public::local_crate()
  │    └─► with(|cx| cx.local_crate())
  │         └─► Accesses OUTER TLV → gets Container
  │              └─► Container::local_crate() → queries CompilerCtxt
  │
  ├─► rustc_public::all_local_items()
  │    └─► with(|cx| cx.all_local_items())
  │         └─► Accesses OUTER TLV → gets Container
  │              └─► Container::all_local_items() → queries CompilerCtxt
  │                   └─► Internally may call .stable() on items
  │                        └─► with_container(|tables, cx| ...)
  │                             └─► Accesses INNER TLV → gets Tables + CompilerCtxt
  │
  └─► rustc_public::entry_fn()
       └─► with(|cx| cx.entry_fn())
            └─► Accesses OUTER TLV → gets Container

Both TLVs point to the same Container, but they're accessed through different scoped thread-local variables to separate concerns between:

High-level queries (via with)
Type translation (via with_container)

8. The CompilerInterface Trait

pub(crate) trait CompilerInterface {
    fn entry_fn(&self) -> Option<CrateItem>;
    fn all_local_items(&self) -> CrateItems;
    fn mir_body(&self, item: DefId) -> mir::Body;
    fn has_body(&self, item: DefId) -> bool;
    // ... many more methods
}

Implemented by: Container<'tcx, BridgeTys>

What it provides:

High-level API for querying compiler information
All methods internally use tables and cx to convert between internal and stable types

Complete Call Chain Summary

User Code
  ↓
run!(args, callback)
  ↓
run_driver! macro
  ↓
RustcPublic::new(callback)
  ↓
RustcPublic::run(args)
  ↓
rustc_driver::run_compiler(args, self)  ← Enters rustc
  ↓
[Rustc runs parsing, type checking, analysis...]
  ↓
RustcPublic::after_analysis(tcx)  ← Callback from rustc
  ↓
rustc_internal::run(tcx, || { ... })
  ├─ Creates CompilerCtxt::new(tcx)
  ├─ Creates Container { tables, cx }
  └─ Calls compiler_interface::run(&container, ...)
      ├─ Sets TLV #1 (compiler_interface::TLV) → pointer to Container as CompilerInterface
      └─ Calls rustc_internal::init(&container, ...)
          ├─ Sets TLV #2 (rustc_internal::TLV) → pointer to Container
          └─ Executes user callback
              ├─ User calls rustc_public APIs
              ├─ APIs call compiler_interface::with() → retrieves Container via TLV #1
              ├─ APIs call with_container() → retrieves Container via TLV #2
              └─ Container uses tables + cx to convert types

Key Design Patterns

Double Thread-Local Storage
- TLV #1 (compiler_interface::TLV): Stores &dyn CompilerInterface
- TLV #2 (rustc_internal::TLV): Stores &Container<'tcx, B>
- Both point to the same Container, but provide different access patterns
Interior Mutability with RefCell
- Container uses RefCell for both tables and cx
- Allows multiple borrows throughout the call stack
- Runtime borrow checking prevents conflicts
Lifetime Management
- 'tcx lifetime ties everything to the compiler's type context
- Ensures stable API types don't outlive the compiler session
- Scoped thread locals ensure cleanup
Bridge Pattern
- Container acts as a bridge between rustc internals and stable API
- Tables caches conversions
- CompilerCtxt wraps TyCtxt and provides query methods

Notes on Architecture

Safety: Thread-local storage ensures the compiler context is only accessible during valid compilation
Ergonomics: Users don't need to pass context explicitly everywhere
Flexibility: Two TLVs allow different access patterns (trait object vs concrete type)
Performance: Tables caches conversions to avoid redundant work
Separation: Clear boundary between rustc internals and stable API

This architecture allows rustc_public to provide a stable API while internally working with rustc's unstable internals, all while maintaining safety and ergonomics.

Stable-mir dialect in Pliron

The fundamental structure of stable_mir (now rustc_public) is very similar to unstable MIR, but with key differences focused on stability and API design[1][2]. Things we need to know about stable_mir/rustc_public for creating a dialect in pliron:

Core Structure (Types, Operations, Interfaces)

Stable_mir maintains the same conceptual model as unstable MIR with these key components[2][3]:

Types

Body: The IR representation of a single function
BasicBlock: Control-flow graph nodes containing statements and terminators
Local: Local variables with type information (indexed via Local type alias)
Place: Memory locations (variables, fields, derefs) with projections
Type system: Full Rust type information (though simplified from HIR)

Operations

Statements (StatementKind): Non-control-flow operations like assignments, storage management (StorageLive/StorageDead), and no-ops
Terminators (TerminatorKind): Control-flow operations (return, call, switch, goto, drop, etc.)
Rvalues: Right-hand side expressions including binary operations (BinOp), unary operations (UnOp), aggregates, casts, and references
Operands: Values used in operations (constants, moves, copies)

Additional Elements (Similar to Attributes)

ProjectionElem: Field accesses, derefs, array indexing
AggregateKind: Tuple, array, ADT construction
CastKind: Type conversions
BorrowKind, Mutability: Ownership and mutability annotations
SourceInfo: Debug and span information
VarDebugInfo: Variable debugging metadata

Key Differences from Unstable MIR

Stability Guarantees

The main difference is that stable_mir/rustc_public aims to provide semantic versioning and a stable API surface[1][4][5]. The internal rustc MIR can change arbitrarily between compiler versions, while stable_mir will maintain backward compatibility.

API Design

Context management: The TyCtxt compiler context is hidden from users in stable_mir, managed through thread-local storage and accessed via with() function[1]
Cleaner interfaces: Simplified APIs that reduce the need to understand deep compiler internals
Conversion layer: The rustc_smir crate (now rustc_public_bridge) handles translation between internal MIR and stable_mir, isolating users from internal changes[4]
rustc_internal module: Provides internal() and stable() methods for bidirectional conversion when needed (though unstable)[1]

Coverage

Stable_mir currently has less coverage than full unstable MIR, focusing on what static analysis tools need[1][4]. Some advanced or compiler-internal features may not yet be exposed.

What do we need for the Pliron MIR Dialect

When modeling this in pliron:

Operations: Create pliron ops for each StatementKind (Assign, StorageLive/Dead, etc.) and TerminatorKind (Return, Call, Assert, Goto, etc.)
Types: Model MIR's type system as pliron types (primitives, tuples, references, arrays, ADTs)
Attributes: Attach debug info (VarDebugInfo), source spans (SourceInfo), mutability/borrow kinds, and allocation metadata as pliron attributes
Blocks/Regions: Map basic blocks to pliron blocks with appropriate control flow
Operands: Model places (locals with projections) and constants as SSA values or special operand types

The key point is that statements and terminators are operations, locals and expressions have types, and debug/source/flow metadata are attributes.
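That classification can be written down as a small lookup. The sketch below does not use pliron at all: MirItem, DialectEntity, classify and the "mir." op-name prefix are all hypothetical, and the string payloads simply name MIR variants or printed fragments.

```rust
/// Toy descriptions of MIR pieces. The payloads are names, not real values.
enum MirItem {
    Statement(&'static str),
    Terminator(&'static str),
    LocalType(&'static str),
    DebugInfo(&'static str),
}

/// What each piece would become in a dialect (hypothetical shape).
#[derive(Debug, PartialEq)]
enum DialectEntity {
    Op { name: String, is_terminator: bool },
    Type(String),
    Attribute(String),
}

fn classify(item: &MirItem) -> DialectEntity {
    match item {
        // Statements and terminators both become operations.
        MirItem::Statement(kind) => DialectEntity::Op {
            name: format!("mir.{}", kind.to_lowercase()),
            is_terminator: false,
        },
        MirItem::Terminator(kind) => DialectEntity::Op {
            name: format!("mir.{}", kind.to_lowercase()),
            is_terminator: true,
        },
        // Types of locals and expressions become dialect types.
        MirItem::LocalType(ty) => DialectEntity::Type(ty.to_string()),
        // Debug, source, and flow metadata become attributes.
        MirItem::DebugInfo(info) => DialectEntity::Attribute(info.to_string()),
    }
}
```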

The structure is conceptually the same: stable_mir just provides a stable, versioned API surface over the same underlying concepts that unstable MIR exposes[8][2][3].

Sources

[1] Migrating to StableMIR - The Kani Rust Verifier https://model-checking.github.io/kani/stable-mir.html
[2] rustc_public - Rust https://doc.rust-lang.org/nightly/nightly-rustc/rustc_public/index.html
[3] rustc_public::mir https://doc.rust-lang.org/nightly/nightly-rustc/rustc_public/mir/index.html
[4] StableMIR - Release and Stability Proposal https://hackmd.io/@celinaval/H1lJBGse0
[5] Publish first version of StableMIR on crates.io - Rust Project ... https://rust-lang.github.io/rust-project-goals/2025h1/stable-mir.html
[6] vaivaswatha/pliron: An Extensible Compiler IR Framework https://github.com/vaivaswatha/pliron
[7] Pliron as the MLIR Alternative (No C/C++) – 1 https://www.youtube.com/watch?v=rRgYGBAhKQ0
[8] The MIR (Mid-level IR) - Rust Compiler Development Guide https://rustc-dev-guide.rust-lang.org/mir/index.html

nihalpasham/stable-mir.md
