@nihalpasham
Created April 3, 2026 04:58
Your `for` Loop Is a Lie: How Rust Desugars Iterators in MIR


Table of Contents

  1. The Code
  2. Dumping the MIR
  3. The Desugaring Chain
  4. Anatomy of the MIR
  5. What Iterator::next() Does Internally
  6. Checked Arithmetic: Why sum += i Returns a Tuple
  7. Slice Iterators: Where PhantomData Appears
  8. Multi-Level Field Projections
  9. Summary: What a for Loop Actually Requires
  10. Key Takeaways



The Code

Three lines. One loop. Adds up the numbers from 0 to n-1.

fn sum_range(n: u32) -> u32 {
    let mut sum: u32 = 0;
    for i in 0..n {
        sum += i;
    }
    sum
}

Everyone has written this loop. The question is: what does it become?


Dumping the MIR

Anyone can follow along:

cargo +nightly rustc -- -Zunpretty=mir

The MIR output for sum_range:

fn sum_range(_1: u32) -> u32 {
    debug n => _1;
    let mut _0: u32;
    let mut _2: u32;
    let mut _3: std::ops::Range<u32>;
    let mut _4: std::ops::Range<u32>;
    let mut _6: std::option::Option<u32>;
    let mut _7: &mut std::ops::Range<u32>;
    let mut _8: isize;
    let mut _10: (u32, bool);
    scope 1 {
        debug sum => _2;
        let mut _5: std::ops::Range<u32>;
        scope 2 {
            debug iter => _5;
            let _9: u32;
            scope 3 {
                debug i => _9;
            }
        }
    }

    bb0: {
        _2 = const 0_u32;
        _4 = std::ops::Range::<u32> { start: const 0_u32, end: copy _1 };
        _3 = <std::ops::Range<u32> as IntoIterator>::into_iter(move _4) -> [return: bb1, unwind continue];
    }

    bb1: {
        _5 = move _3;
        goto -> bb2;
    }

    bb2: {
        _7 = &mut _5;
        _6 = <std::ops::Range<u32> as Iterator>::next(copy _7) -> [return: bb3, unwind continue];
    }

    bb3: {
        _8 = discriminant(_6);
        switchInt(move _8) -> [0: bb6, 1: bb5, otherwise: bb4];
    }

    bb4: {
        unreachable;
    }

    bb5: {
        _9 = copy ((_6 as Some).0: u32);
        _10 = AddWithOverflow(copy _2, copy _9);
        assert(!move (_10.1: bool), "attempt to compute `{} + {}`, which would overflow",
               copy _2, copy _9) -> [success: bb7, unwind continue];
    }

    bb6: {
        _0 = copy _2;
        return;
    }

    bb7: {
        _2 = move (_10.0: u32);
        goto -> bb2;
    }
}

That's 8 basic blocks (bb0 through bb7) for a 3-line loop. Let's walk through it.


The Desugaring Chain

The for loop is syntactic sugar. Rust desugars it into a plain loop that drives an iterator, and it does so during lowering to HIR, well before MIR generation. There is no for in MIR.

What the Compiler Actually Generates

// What you write:
for i in 0..n {
    sum += i;
}

// What Rust actually compiles:
let mut iter = IntoIterator::into_iter(0..n);
loop {
    match Iterator::next(&mut iter) {
        Some(i) => { sum += i; }
        None => break,
    }
}

Each step in this chain is visible in the MIR:

| Step | Rust | MIR |
| --- | --- | --- |
| 1. Range construction | 0..n | Range::<u32> { start: 0_u32, end: _1 } |
| 2. into_iter() | (0..n).into_iter() | <Range<u32> as IntoIterator>::into_iter(move _4) |
| 3. next() | iter.next() | <Range<u32> as Iterator>::next(copy _7) |
| 4. Pattern match | match ... { Some/None } | discriminant(_6) + switchInt |
| 5. Payload extraction | Some(i) | ((_6 as Some).0: u32) |
| 6. Checked add | sum += i | AddWithOverflow(copy _2, copy _9) |
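
For reference, the standard library documentation describes the expansion with an extra outer match that binds the iterator once. The sketch below writes that shape out by hand as a runnable function. The name sum_range_desugared is made up for this example, and the compiler's internal expansion may differ in detail.

```rust
// Hypothetical name; a hand-written equivalent of
// `for i in 0..n { sum += i; }` wrapped in a function.
fn sum_range_desugared(n: u32) -> u32 {
    let mut sum: u32 = 0;
    // The outer match binds the iterator once, for the whole loop.
    match IntoIterator::into_iter(0..n) {
        mut iter => loop {
            // Each pass asks the iterator for the next element.
            match Iterator::next(&mut iter) {
                Some(i) => sum += i,
                None => break,
            }
        },
    }
    sum
}
```

Calling sum_range_desugared(5) adds 0 + 1 + 2 + 3 + 4, the same result the for version produces.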

Anatomy of the MIR

bb0: Range Construction + into_iter()

bb0: {
    _2 = const 0_u32;                                                       // sum = 0
    _4 = std::ops::Range::<u32> { start: const 0_u32, end: copy _1 };       // 0..n
    _3 = <std::ops::Range<u32> as IntoIterator>::into_iter(move _4)         // .into_iter()
         -> [return: bb1, unwind continue];
}

0..n creates a Range<u32> struct with two fields: start and end. Then into_iter() is called on it. For Range, into_iter() is the identity function — it returns self unchanged. But MIR still emits the call because the desugaring is mechanical, not optimized.
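
The identity claim can be checked from ordinary Rust. This is a small sketch with a hypothetical helper name (range_into_iter): the type annotation only compiles because into_iter() on a Range<u32> hands back a Range<u32>.

```rust
use std::ops::Range;

// Hypothetical helper: the annotation on `iter` only type-checks because
// into_iter() on a Range<u32> returns a Range<u32>.
fn range_into_iter(n: u32) -> Range<u32> {
    let iter: Range<u32> = IntoIterator::into_iter(0..n);
    iter
}
```

The returned value compares equal to the original range, so range_into_iter(5) == (0..5).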

The -> [return: bb1, unwind continue] is MIR's way of saying: "if the call returns normally, go to bb1; if it panics, unwind the stack." Every function call in MIR is a terminator that ends the basic block.

bb2: The Loop Header — next()

bb2: {
    _7 = &mut _5;                                                            // &mut iter
    _6 = <std::ops::Range<u32> as Iterator>::next(copy _7)                   // iter.next()
         -> [return: bb3, unwind continue];
}

Every iteration takes a &mut reference to the iterator and calls next(). The return type is Option<u32> — stored in _6.

This is the loop entry point. bb7 (the loop body's end) jumps back here with goto -> bb2.

bb3: Option Pattern Matching via switchInt

bb3: {
    _8 = discriminant(_6);                                                   // get tag
    switchInt(move _8) -> [0: bb6, 1: bb5, otherwise: bb4];                  // match
}

discriminant(_6) extracts the enum discriminant from Option<u32>:

  • 0 = None → bb6 (exit the loop)
  • 1 = Some → bb5 (loop body)
  • otherwise → bb4 (unreachable)

The otherwise arm exists because switchInt operates on an integer, and MIR doesn't know at this stage that only 0 and 1 are valid discriminants. bb4 is just unreachable;.

bb5: The Loop Body

bb5: {
    _9 = copy ((_6 as Some).0: u32);                                         // extract i
    _10 = AddWithOverflow(copy _2, copy _9);                                 // sum + i
    assert(!move (_10.1: bool), "attempt to compute `{} + {}`, which would overflow",
           copy _2, copy _9) -> [success: bb7, unwind continue];
}

Three things happen here:

  1. Payload extraction: ((_6 as Some).0: u32) — this is a Downcast projection followed by a Field projection. "Treat _6 as the Some variant, then extract field 0."
  2. Checked arithmetic: AddWithOverflow returns a (u32, bool) tuple — the result and an overflow flag.
  3. Overflow assertion: assert(!(_10.1: bool), ...) — if the overflow flag is true, panic. The assert is itself a terminator.

bb6 + bb7: Exit and Continue

bb6: {
    _0 = copy _2;         // return sum
    return;
}

bb7: {
    _2 = move (_10.0: u32);   // sum = result (extract field 0 from tuple)
    goto -> bb2;               // loop back
}

bb7 extracts the addition result from the overflow tuple and loops back to bb2. bb6 copies sum into the return slot and exits.

Control Flow Diagram

bb0: Range + into_iter()
 │
 ▼
bb1: move iter
 │
 ▼
bb2: next() ◄─────────────┐
 │                        │
 ▼                        │
bb3: switchInt(discr)     │
 ├── 0 (None) → bb6       │
 ├── 1 (Some) → bb5       │
 └── otherwise → bb4      │
                          │
bb5: extract i,           │
     sum + i (checked)    │
 │                        │
 ▼                        │
bb7: sum = result ────────┘

bb6: return sum

bb4: unreachable
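
One way to get a feel for this graph is to run it by hand. The sketch below is not compiler output: it is a hypothetical Rust function (sum_range_cfg, with a made-up Block enum) that executes the same eight blocks as a small state machine, using overflowing_add as a stand-in for AddWithOverflow.

```rust
use std::ops::Range;

// Hypothetical enum: one variant per basic block in the dump.
#[derive(Clone, Copy)]
enum Block { Bb0, Bb1, Bb2, Bb3, Bb4, Bb5, Bb6, Bb7 }

fn sum_range_cfg(n: u32) -> u32 {
    // Locals named after the MIR temporaries they stand in for.
    let mut sum: u32 = 0;                    // _2
    let mut iter: Range<u32> = 0..0;         // _5
    let mut next: Option<u32> = None;        // _6
    let mut added: (u32, bool) = (0, false); // _10
    let mut block = Block::Bb0;

    loop {
        // Each arm does the block's work, then names the successor block.
        block = match block {
            Block::Bb0 => {
                sum = 0;
                iter = IntoIterator::into_iter(0..n);
                Block::Bb1
            }
            Block::Bb1 => Block::Bb2,
            Block::Bb2 => {
                next = Iterator::next(&mut iter);
                Block::Bb3
            }
            Block::Bb3 => {
                // discriminant + switchInt: 0 = None, 1 = Some.
                let tag: isize = match next { None => 0, Some(_) => 1 };
                match tag { 0 => Block::Bb6, 1 => Block::Bb5, _ => Block::Bb4 }
            }
            Block::Bb4 => unreachable!(),
            Block::Bb5 => {
                let i = next.expect("bb5 is only reached for Some");
                added = sum.overflowing_add(i);
                assert!(!added.1, "attempt to add with overflow");
                Block::Bb7
            }
            Block::Bb6 => return sum,
            Block::Bb7 => {
                sum = added.0;
                Block::Bb2 // the back edge of the loop
            }
        };
    }
}
```

The only back edge is bb7 → bb2, and the only exit is bb6, which matches the diagram.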

What Iterator::next() Does Internally

The MIR above shows next() as an opaque function call. But when a MIR backend collects the function bodies from the standard library, it gets the implementation of Range<u32>::next(). Here's what that looks like conceptually:

impl Iterator for Range<u32> {
    fn next(&mut self) -> Option<u32> {
        if self.start < self.end {
            let value = self.start;
            self.start = self.start + 1;  // lowers to AddWithOverflow + assert when overflow checks are on
            Some(value)
        } else {
            None
        }
    }
}

In MIR, this function introduces:

  • Field access through a mutable reference: (*_1).0 — deref the &mut self pointer, then access field start. This is the Deref → Field projection chain.
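  • Checked arithmetic for the counter: in this conceptual version, self.start + 1 lowers to AddWithOverflow(self.start, 1), which returns (u32, bool). (The library method checked_add is a different thing: it returns Option<u32>.)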
  • Enum construction: Some(value) becomes an Aggregate(Adt) rvalue that constructs Option::Some with a discriminant of 1 and the value as payload.
  • Field assignment through a pointer: (*_1).0 = new_start — store the incremented counter back through the mutable reference. This is a 2-level projection assignment (Deref + Field).
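
To see these four ingredients in ordinary code, here is a minimal sketch using a hypothetical stand-in type, MyRange, rather than the real Range<u32> from core (whose actual implementation is generic and differs in detail):

```rust
// Hypothetical stand-in for Range<u32>, so the field accesses are visible.
struct MyRange {
    start: u32, // field 0
    end: u32,   // field 1
}

impl Iterator for MyRange {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Reads go through the reference: (*self).start and (*self).end.
        if self.start < self.end {
            let value = self.start;
            // Write back through the reference: (*self).start = ...
            // With overflow checks on, this `+` is a checked add.
            self.start = value + 1;
            Some(value) // constructs the Some variant
        } else {
            None // constructs the None variant
        }
    }
}
```

Because MyRange implements Iterator, it can be used directly in a for loop, which then goes through exactly the into_iter() and next() calls described earlier.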

Checked Arithmetic: Why sum += i Returns a Tuple

In MIR, sum += i doesn't become a simple add instruction. It becomes:

_10 = AddWithOverflow(copy _2, copy _9);      // returns (u32, bool)
assert(!move (_10.1: bool), "attempt to compute `{} + {}`, which would overflow",
       copy _2, copy _9) -> [success: bb7, unwind continue];
// ...
_2 = move (_10.0: u32);                       // extract the actual result

The pipeline:

sum += i
   │
   ▼
AddWithOverflow(sum, i)    →    (result: u32, overflow: bool)
   │                                    │              │
   │                                    │              ▼
   │                                    │        assert(!overflow)
   │                                    │              │
   │                                    ▼              │
   │                               _2 = result         │
   │                                    │              │
   ▼                                    ▼              ▼

This means implementing for-loops in a backend requires:

  1. Tuple types: (u32, bool) must be representable
  2. Tuple field extraction: _10.0 and _10.1 must work
  3. Checked operations: AddWithOverflow must lower to something meaningful

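The check is governed by overflow checks (-C overflow-checks, on by default in dev builds and off in release builds), not by the panic strategy. With overflow checks on, MIR always carries the AddWithOverflow and assert pair, even where a later optimization pass can prove it dead. With them off, the addition is typically emitted as a plain Add.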
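
The same three steps can be written in surface Rust with u32::overflowing_add, which also returns a (u32, bool) pair. This is a sketch of the pattern rather than what the compiler emits, and both function names are hypothetical:

```rust
// Hypothetical helper standing in for MIR's AddWithOverflow.
fn add_with_overflow(a: u32, b: u32) -> (u32, bool) {
    a.overflowing_add(b)
}

// `sum += i` written out as the three MIR steps.
fn add_assign_checked(sum: &mut u32, i: u32) {
    let pair = add_with_overflow(*sum, i);            // _10 = AddWithOverflow(..)
    assert!(!pair.1, "attempt to add with overflow"); // assert(!_10.1)
    *sum = pair.0;                                    // _2 = _10.0
}
```

On overflow the pair holds the wrapped result and a true flag, so add_with_overflow(u32::MAX, 1) yields (0, true) and add_assign_checked would panic at the assert.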


Slice Iterators: Where PhantomData Appears

Range iterators are the simple case. Slice iterators are where things get interesting.

fn sum_slice(data: &[u32]) -> u32 {
    let mut sum: u32 = 0;
    for val in data.iter() {
        sum += *val;
    }
    sum
}

The MIR looks similar in structure, but the types change:

fn sum_slice(_1: &[u32]) -> u32 {
    let mut _3: std::slice::Iter<'_, u32>;
    let mut _6: std::option::Option<&u32>;     // Option<&u32> — not Option<u32>!
    // ...

    bb0: {
        _4 = core::slice::<impl [u32]>::iter(copy _1)
             -> [return: bb1, unwind continue];
    }

    bb1: {
        _3 = <std::slice::Iter<'_, u32> as IntoIterator>::into_iter(move _4)
             -> [return: bb2, unwind continue];
    }
    // ... same switchInt pattern on Option discriminant ...

    bb6: {
        _9 = copy ((_6 as Some).0: &u32);       // val: &u32
        _10 = copy (*_9);                       // *val: u32 (Deref projection!)
        _11 = AddWithOverflow(copy _2, copy _10);
        // ...
    }
}

The key difference: Iter<'_, u32> is not Range<u32>. It's a struct with internal pointer fields and PhantomData:

pub struct Iter<'a, T: 'a> {
    ptr: NonNull<T>,
    end_or_len: *const T,
    _marker: PhantomData<&'a T>,   // <-- ZST, but real in MIR!
}

When the code that builds the iterator is collected (the constructor reached from data.iter(), rather than next() itself), MIR constructs this struct including PhantomData as a real constant operand:

_7 = Iter(_24, _25, core::marker::PhantomData::<&u32>);

PhantomData<&u32> has no runtime representation — it's zero-sized. But MIR treats it as a real operand with a real type. A backend that can't handle ZST constants will fail here.
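
The zero-size property is easy to confirm from user code. The sketch below uses a hypothetical struct, PtrIter, shaped loosely like a slice iterator (it is not the real std::slice::Iter), and shows that the marker still has to be written out as a field value when the struct is built:

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Hypothetical struct: one pointer plus a lifetime marker.
struct PtrIter<'a, T> {
    ptr: *const T,
    _marker: PhantomData<&'a T>,
}

fn make_iter<'a, T>(slice: &'a [T]) -> PtrIter<'a, T> {
    // The marker is an ordinary field value here, even though it
    // occupies no bytes at runtime.
    PtrIter { ptr: slice.as_ptr(), _marker: PhantomData }
}

// Size of the marker alone: zero bytes.
fn marker_size() -> usize {
    size_of::<PhantomData<&u32>>()
}

// Size of the whole struct: just the pointer.
fn iter_size() -> usize {
    size_of::<PtrIter<'static, u32>>()
}
```

A backend sees the same thing one level down: a value that must be typed and passed around, but that needs no storage.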


What’s a "Projection"?

The term comes from relational algebra / type theory.

Think of it as projecting from a larger structure onto a smaller piece:

Tuple (a, b, c)
       │
       │ "project onto field 1"
       ▼
       b

It's called "projection" because you're projecting the whole value down to a component — like projecting a 3D object onto a 2D plane. You're "narrowing" your view to just one part.

Alternative mental model: Think of it like a path or lens into a data structure:

  • Deref = "follow the pointer"
  • Field(n) = "access field n"
  • Index(local) = "index at position stored in local"

So (*_1)[_4] is really a path: "start at _1, dereference it, then index by _4".


Multi-Level Field Projections

Inside Iter::next(), the field access patterns get deep. The iterator needs to compare and advance its internal pointers:

(*_1).0.0       // Deref → Field(0) → Field(0)

This is a three-level projection chain meaning: dereference the &mut self pointer, access field 0 (ptr, which is a NonNull<T>), then access field 0 of that (the raw pointer inside NonNull, which is its only field).

MIR represents this as a Place with a projection list:

Place {
    local: _1,
    projection: [Deref, Field(0, NonNull<u32>), Field(0, *const u32)]
}

The naive way to implement this is pattern matching by depth:

if projection.len() == 1 { /* handle single projections */ }
if projection.len() == 2 { /* handle pairs */ }
if projection.len() == 3 { /* added as band-aid when we hit 3-deep */ }

This does not scale: the number of cases grows exponentially with depth, because each level can be any of Deref, Field, Downcast, or Index. The right approach is iterative:

let mut current = get_base_local();
for projection in projections {
    current = apply(current, projection);  // one step at a time
}

A single loop handles any depth in O(N) steps, where N is the length of the projection list.
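
Here is the iterative idea as a small runnable sketch. The Value and Projection types are toy models invented for this example (they are not rustc's types), and only Deref and Field are covered:

```rust
// Toy value model, invented for this sketch.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(u32),
    Tuple(Vec<Value>),
    Ref(Box<Value>),
}

// Toy projection elements; a real backend would also have Downcast, Index, etc.
#[derive(Debug, Clone, Copy)]
enum Projection {
    Deref,
    Field(usize),
}

// Apply exactly one projection step; None means the step does not fit the value.
fn apply<'a>(current: &'a Value, proj: Projection) -> Option<&'a Value> {
    match (current, proj) {
        (Value::Ref(inner), Projection::Deref) => Some(inner.as_ref()),
        (Value::Tuple(fields), Projection::Field(i)) => fields.get(i),
        _ => None,
    }
}

// Walk the whole list, one step at a time, regardless of its length.
fn resolve<'a>(base: &'a Value, projections: &[Projection]) -> Option<&'a Value> {
    let mut current = base;
    for &proj in projections {
        current = apply(current, proj)?;
    }
    Some(current)
}
```

Adding a new projection kind means adding one arm to apply; resolve never changes, whatever the depth.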


Summary: What a for Loop Actually Requires

Implementing for-loop support in a MIR backend requires getting everything right at once: enums, tuples, ZSTs, checked arithmetic, field projections, core library imports, trait method resolution, and type conversions.

Here's everything for i in 0..n { sum += i; } needs, plus the one extra requirement that slice iterators add:

| Feature | Why |
| --- | --- |
| Struct types | Range<u32> is a struct with start and end fields |
| Enum types | Option<u32> is an enum with None and Some variants |
| Trait method calls | IntoIterator::into_iter(), Iterator::next() |
| Core library import | Range::next() lives in core, not user code |
| Discriminant extraction | discriminant(_6) to get the Option tag |
| Multi-way switch | switchInt with 3 arms (None, Some, otherwise) |
| Downcast + Field projection | ((_6 as Some).0: u32) to extract the payload |
| Checked arithmetic | AddWithOverflow returns (u32, bool) |
| Tuple types | The checked result is a 2-element tuple |
| Tuple field extraction | (_10.0: u32) and (_10.1: bool) |
| Mutable references | &mut _5 to pass the iterator to next() |
| Deref + Field projections | (*_1).0 inside Range::next() |
| ZST handling | PhantomData in Iter (for slice iterators) |

That's 13 distinct features for a 3-line loop.


Key Takeaways

  1. There is no for in MIR — it's desugared to into_iter() + loop { match next() { ... } } before MIR generation
  2. Every function call is a block terminator: into_iter(), next(), and even assert each end their basic block
  3. Option pattern matching is switchInt on an integer discriminant — the enum is just a tagged union
  4. Checked arithmetic returns tuples: sum += i becomes AddWithOverflow(sum, i) → (result, overflow_flag) → assert → extract
  5. Slice iterators bring in PhantomData — a zero-sized type that exists as a real operand in MIR
  6. Projection chains can go arbitrarily deep — process them iteratively, never by depth
  7. The "simple" for-loop is actually a stress test — it exercises nearly every feature a MIR backend needs

Reproducing

# Dump MIR for any crate
cargo +nightly rustc -- -Zunpretty=mir

# Dump MIR for a specific function
cargo +nightly rustc -- -Zunpretty=mir 2>&1 | grep -A 100 "fn sum_range"

Minimal example (src/main.rs):

#[inline(never)]
fn sum_range(n: u32) -> u32 {
    let mut sum: u32 = 0;
    for i in 0..n {
        sum += i;
    }
    sum
}

fn main() {
    println!("{}", sum_range(8));
}

The scope blocks at the top of the MIR dump mirror the nesting of the desugared source:

// Rust source (desugared):
fn sum_range(n: u32) -> u32 {        // scope 0: n
    let mut sum = 0;                  // scope 1: sum
    let mut iter = (0..n).into_iter();  // scope 2: iter
    loop {
        match iter.next() {
            Some(i) => sum += i,      // scope 3: i
            None => break,
        }
    }
    sum
}

// MIR scopes:
scope 1 {
    debug sum => _2;
    scope 2 {
        debug iter => _5;
        scope 3 {
            debug i => _9;
        }
    }
}

// Breakpoint on `sum += i` → debugger walks up:
//   scope 3 → i, scope 2 → iter, scope 1 → sum, scope 0 → n

Scopes are purely debuginfo — they tell the debugger which user variable names are visible at a given source location. They have no effect on MIR semantics.

