Introduction

This lesson teaches the inner workings of Rust's * (deref) operator and the Deref trait, as well as how to take advantage of those features.

Requirements

This lesson assumes that:

You have a basic understanding of Rust's type system. Most importantly, you understand that references are types, meaning &T, &mut T and T are all different types.

Terminology

A lot of terms used in this lesson include the word "reference". For disambiguation purposes, I'll explicitly list here how I'll refer to each term for the remainder of the lesson.

Original Term	Description	How it may be referred to
trait Deref	The `Deref` trait (std::ops::Deref)	reference trait
Deref::Target	The associated type `Target` of the reference trait	Target, `<T as Deref>::Target`, `<T as DerefMut>::Target`
*	The deref operator	deref operator
&T	A shared reference to a variable of type T	&T, shared reference or borrowed value
&mut T	A unique reference to a variable of type T	&mut T, mutable reference or mutably borrowed value
&	The borrow operator, used to create shared references	&, borrow or & operator
&mut	The mutable borrow operator, used to create mutable references	&mut, mutably borrow or &mut operator

The reference trait

pub trait Deref {
    /// The resulting type after dereferencing.
    type Target: ?Sized;

    /// Dereferences the value.
    fn deref(&self) -> &Self::Target;
}

With the trait being called Deref, it might be confusing to see that the function deref returns a reference, not a dereferenced value.

A different way of looking at it is:

Given a type T, implementing Deref for T is a way of telling the compiler that &T may also be referenced as &Target.

By itself, this trait doesn't do much, but the Rust compiler uses its implementations in multiple ways, which I'll explain later on.
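As a minimal sketch of that idea, assume a hypothetical wrapper type `Meters` (it is not part of this lesson or of the standard library). Implementing the reference trait for it lets a `&Meters` be viewed as a `&f64`:

```rust
use std::ops::Deref;

// Hypothetical wrapper type, used only for illustration.
struct Meters(f64);

impl Deref for Meters {
    type Target = f64;

    // Returns a reference to the wrapped value, not the value itself.
    fn deref(&self) -> &f64 {
        &self.0
    }
}

// Calls `deref` explicitly to view a `&Meters` as a `&f64`.
fn as_f64(meters: &Meters) -> &f64 {
    meters.deref()
}

fn main() {
    let distance = Meters(12.5);
    let inner: &f64 = as_f64(&distance);
    println!("inner: {inner}");
}
```

The explicit `deref()` call is written out here only to make the mechanism visible; the usages described later let the compiler insert it.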

Some builtin implementations of the reference trait

Type	Deref::Target
&T	T
&mut T	T
Rc<T>	T
Box<T>	T
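To make the table concrete, here is a small sketch that calls `Deref::deref` explicitly on each of the four types. The function name `sum_targets` is made up for this illustration; in every case the call yields a `&i32`:

```rust
use std::ops::Deref;
use std::rc::Rc;

// Hypothetical helper: dereferences each argument through its `Deref` impl.
fn sum_targets(shared: &i32, unique: &mut i32, counted: Rc<i32>, boxed: Box<i32>) -> i32 {
    let a: &i32 = Deref::deref(&shared);  // impl Deref for &T
    let b: &i32 = Deref::deref(&unique);  // impl Deref for &mut T
    let c: &i32 = Deref::deref(&counted); // impl Deref for Rc<T>
    let d: &i32 = Deref::deref(&boxed);   // impl Deref for Box<T>
    *a + *b + *c + *d
}

fn main() {
    let mut slot = 2;
    let total = sum_targets(&1, &mut slot, Rc::new(3), Box::new(4));
    println!("total: {total}");
}
```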

Compiler usage #1

Implicitly invoking deref() to access fields or invoke methods on Target.

Consider the following example:

struct Vector2 { x: f32, y: f32 }

fn main() {
    // The fields are `f32`, so the literals must be floats
    let boxed: Box<Vector2> = Box::new(Vector2 { x: 5.0, y: 10.0 });
    println!("X: {}", boxed.x);
}

The type Box<Vector2> does not have a field named x, yet this example compiles. Why is that?

A: Given a type T:

When accessing a field or invoking a method on a variable of type `T`, the Rust compiler will 
first check if `T` has a field/method with the provided name.

If `T` does not have any fields/methods with the provided name, the compiler will then check
if `<T as Deref>::Target` has a field/method with the provided name, using that instead if it exists.

With that in mind, the example above generates code identical to:

use std::ops::Deref; // needed to call `deref()` explicitly

struct Vector2 { x: f32, y: f32 }

fn main() {
    let boxed: Box<Vector2> = Box::new(Vector2 { x: 5.0, y: 10.0 });
    println!("X: {}", boxed.deref().x);
}

The compiler allows us to omit the deref() invocation here.

The reference trait is quite useful for reducing boilerplate code. If it didn't exist, you would have to somehow get a reference to T (Vector2 in the example above) whenever accessing fields or invoking methods on the value that's inside a Box<T>. The same applies to most wrapper types (Rc, Arc, MutexGuard, etc.).
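The lookup described above repeats for as many layers as needed. The sketch below assumes a hypothetical method `length_squared` on `Vector2` and a helper `nested_length_squared`, both invented for this illustration, and reaches them through two wrappers at once:

```rust
use std::rc::Rc;

struct Vector2 { x: f32, y: f32 }

impl Vector2 {
    // Hypothetical method, added only for this example.
    fn length_squared(&self) -> f32 {
        self.x * self.x + self.y * self.y
    }
}

fn nested_length_squared(v: &Rc<Box<Vector2>>) -> f32 {
    // Neither `Rc` nor `Box` has a `length_squared` method,
    // so the compiler keeps dereferencing until it reaches `Vector2`.
    v.length_squared()
}

fn main() {
    let nested = Rc::new(Box::new(Vector2 { x: 3.0, y: 4.0 }));
    // Field access goes through the same lookup.
    println!("x: {}, length squared: {}", nested.x, nested_length_squared(&nested));
}
```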

Compiler usage #2

Implicitly invoking Deref::deref() when using the operators &(borrow) or &mut(mutably borrow).

Consider the following example:

fn main() {
    let boxed: Box<i32> = Box::new(5);
    takes_int(&boxed);
}

fn takes_int(input: &i32) {
    println!("Input: {input}");
}

The function takes_int requires an input of type &i32, but we are calling it with an argument of type &Box<i32>. Yet this example compiles. Why is that?

A: Given a type T:

When using the operators `&(borrow)` or `&mut(mutably borrow)`, the compiler will check if the type
`&T`(or `&mut T` if using the `&mut operator`) satisfies the type "conditions" in that specific context.

If the type `&T` does not satisfy those conditions, the compiler will then check if `&Target` (or
`&mut Target` if using the `&mut operator`) satisfies the conditions, using that instead if it does.

With that in mind, the example above generates code identical to:

use std::ops::Deref; // needed to call `deref()` explicitly

fn main() {
    let boxed: Box<i32> = Box::new(5);
    takes_int(boxed.deref());
}

fn takes_int(input: &i32) {
    println!("Input: {input}");
}

The compiler replaces the & operator with an invocation of <Box<i32> as Deref>::deref().
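This conversion can also chain through several implementations. In the sketch below (the function `shout` is a made-up example), a `&Box<String>` is accepted where a `&str` is expected, because `Box<String>` dereferences to `String` and `String` dereferences to `str`:

```rust
// Hypothetical function that only accepts `&str`.
fn shout(input: &str) -> String {
    input.to_uppercase()
}

fn main() {
    let boxed: Box<String> = Box::new(String::from("hello"));
    // `&Box<String>` is converted to `&String`, then to `&str`.
    let loud = shout(&boxed);
    println!("{loud}");
}
```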

The *(deref) operator

Unlike with the & and &mut operators, *(deref) interacts directly with the reference trait. As such, it can only be used on types that implement that trait.

In an oversimplified way, the *(deref) operator does the following, given a type T:

Invokes <T as Deref>::deref(), which returns a variable of type &Target.
Accesses the value that the returned reference points to, which will be of type Target.
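The two steps above can be sketched by writing them out by hand. The function names `via_operator` and `via_trait` are invented for this comparison; both are expected to produce the same value:

```rust
use std::ops::Deref;
use std::rc::Rc;

// Uses the operator: the first `*` removes the outer `&`, the second goes through `Rc`.
fn via_operator(rc: &Rc<i32>) -> i32 {
    **rc
}

// Spells out the steps: call `deref()` to get `&i32`, then read the value behind it.
fn via_trait(rc: &Rc<i32>) -> i32 {
    let target_ref: &i32 = Deref::deref(rc);
    *target_ref
}

fn main() {
    let rc = Rc::new(41);
    println!("{} {}", via_operator(&rc), via_trait(&rc));
}
```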

Compiler usage #3

Given a type T that implements the reference trait, the *(deref) operator can be used to move Target out of a variable of type T

Consider the example:

fn main() {
    let boxed: Box<&str> = Box::new("boxed str");
    let deref_result: &str = *boxed;
}

Given any type T, Box<T> implements Deref<Target = T>. This means that using the *(deref) operator on Box<T> will return a variable of type T, essentially opening the box.

Among the standard types, this move is special to Box<T>. For other types that implement the reference trait (such as Rc<T>), as well as for &T and &mut T, moving out of the dereferenced value is forbidden (in other words: you can only move out of variables you own).
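Because `&str` is `Copy`, the example above would also compile as a plain copy. A sketch with a non-`Copy` payload makes the move visible; the helper name `unbox` is made up for this illustration:

```rust
// Hypothetical helper: takes ownership of the box and moves the `String` out of it.
fn unbox(boxed: Box<String>) -> String {
    // `String` is not `Copy`, so this is a real move out of the box.
    *boxed
}

fn main() {
    let boxed = Box::new(String::from("boxed string"));
    let owned: String = unbox(boxed);
    println!("{owned}");
    // println!("{boxed}"); // would not compile: `boxed` was moved
}
```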

Compiler usage #4

Given a type T that implements DerefMut, the *(deref) operator can be used to replace the value Target inside a variable of type T

The same applies for any of the assignment operators (=, +=, -=, etc.)

Consider the example:

fn increment(input: &mut i32) {
    *input += 1; 
}

We know that for any given T, &mut T implements Deref<Target = T> and DerefMut.

In the example above, the *(deref) operator is used to replace the value of type i32 that input points to.

This isn't limited to the implementation of &mut T, consider the example:

fn main() {
    let mut boxed: Box<Vec<i32>> = Box::new(vec![2, 5]);
    *boxed = vec![3, 4]; // valid

    // *boxed = Box::new(vec![3, 5]); // invalid! We can only replace `Target`
}

Given any type T, Box<T> implements Deref<Target = T> and DerefMut. This means we can use the deref operator here to replace the value Target inside a variable of type Box<T>.

Since we are mutating variables, usage #4 requires the outermost binding to be mutable. Mutable references are inherently mutable, so there's no need to declare them with the keyword mut (for example, the first mut in let mut mut_ref = &mut 5; is unnecessary).
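The same mechanism is available to user-defined types. The following is a minimal sketch, assuming a hypothetical `Counter` type and `bump` function that are not part of this lesson:

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical wrapper type, used only for illustration.
struct Counter { value: i32 }

impl Deref for Counter {
    type Target = i32;
    fn deref(&self) -> &i32 { &self.value }
}

// `DerefMut` reuses `Target` from `Deref`; it has no associated type of its own.
impl DerefMut for Counter {
    fn deref_mut(&mut self) -> &mut i32 { &mut self.value }
}

fn bump(counter: &mut Counter) {
    // First `*` removes the `&mut`, the second goes through `deref_mut`.
    **counter += 1;
}

fn main() {
    let mut counter = Counter { value: 1 };
    *counter = 10; // replaces `Target` through `deref_mut`
    bump(&mut counter);
    println!("{}", *counter);
}
```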

Compiler usage #5

Given a type T that implements Deref, where Deref::Target implements Copy, the *(deref) operator can be used on T to get a copy of Target.

Consider the example:

fn main() {
    let x: &i32 = &10;
    let cloned_x: i32 = *x;
    println!("X: {x}, Clone: {cloned_x}");
}

Since &i32 implements Deref<Target = i32>, we can use the *(deref) operator to get a clone of Target(i32), in this case, the value 10.

With this in mind, the following example generates code identical to the above:

use std::ops::Deref; // needed to call `deref()` explicitly

fn main() {
    let x: &i32 = &10;
    let cloned_x: i32 = x.deref().clone();
    println!("X: {x}, Clone: {cloned_x}");
}

Usage #5 is not exclusive to references. The following example is also valid:

fn main() {
    let boxed_int: Box<i32> = Box::new(7);
    let cloned_int: i32 = *boxed_int;
    println!("Boxed: {boxed_int}, Clone: {cloned_int}");
}

Just like &i32, Box<i32> implements Deref<Target = i32>, which allows us to use the *(deref) operator to clone Target.
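When Target is not Copy, the operator alone is not enough and an explicit clone is needed. A small sketch of the contrast, using made-up helper names `copy_out` and `clone_out`:

```rust
use std::rc::Rc;

// `u64` is `Copy`, so dereferencing yields a copy.
fn copy_out(shared: &Rc<u64>) -> u64 {
    **shared
}

// `String` is not `Copy`: `**shared` alone would try to move out of the `Rc`
// and fail to compile, so the value is cloned explicitly.
fn clone_out(shared: &Rc<String>) -> String {
    (**shared).clone()
}

fn main() {
    let number = Rc::new(7_u64);
    let text = Rc::new(String::from("seven"));
    println!("{} {}", copy_out(&number), clone_out(&text));
    // Both `Rc`s are still usable here.
    println!("{number} {text}");
}
```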

Simple Example - Implementing a bounded integer type

Suppose we want to ensure an integer is always between 0 and 100. We can easily enforce that by wrapping the integer in a type with a private field:

pub struct Percent {
    inner: u8,
}

impl Percent {
    pub fn new(value: u8) -> Percent {
        // enforce that inner is always between 0 and 100
        let inner = u8::clamp(value, 0, 100);
        
        Percent { inner }
    }
    
    // We can provide a public `set` method to allow users to mutate inner, while still enforcing the bounds
    pub fn set(&mut self, value: u8) {
        *self = Self::new(value); // Quietly using #4 on `&mut Percent`
    }
}

But how would the user access inner?

A basic implementation would be adding a get method to provide readonly access to it:

impl Percent {
    pub fn get(&self) -> u8 {
        self.inner
    }
}

Although that is fine, if this type is frequently used, invoking get() can quickly get verbose/tiring.

We can solve this by implementing Deref, taking advantage of the compiler usages #1, #2, #3, and #5:

use std::ops::Deref;

impl Deref for Percent {
    type Target = u8;
    
    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}

This gives users a read-only view of Percent::inner:

fn main() {
    let percent = Percent::new(200);
    
    // Usage #1: the compiler invokes `deref` to access `Target`, which we can invoke `u8::count_zeros()` on
    let zeros = percent.count_zeros();
    
    // Usage #2: the compiler invokes deref to borrow `Target` to the function `print_int()`
    print_int(&percent);
    
    // Usage #5: `*percent` copies `Target` (a `u8`), which we can compare to `100`
    if *percent == 100 {
        println!("Maximum percent!");
    }
    
    // This looks like usage #3, but because `u8` is `Copy`, `Target` is copied out rather than moved (usage #5 again)
    let inner: u8 = *percent;
}

fn print_int(int: &u8) {
    println!("Int: {int}");
}

In this example, it's important to note that we do not want to implement DerefMut for Percent. Doing so would allow anyone to replace Percent::inner, bypassing the constraints enforced by Percent::new().
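Putting the pieces together, here is a self-contained sketch of `Percent` that checks the bound survives both construction and `set`. The helper `remaining` is a hypothetical addition for this illustration:

```rust
use std::ops::Deref;

pub struct Percent {
    inner: u8,
}

impl Percent {
    pub fn new(value: u8) -> Percent {
        Percent { inner: value.clamp(0, 100) }
    }

    pub fn set(&mut self, value: u8) {
        *self = Self::new(value);
    }
}

impl Deref for Percent {
    type Target = u8;

    fn deref(&self) -> &u8 {
        &self.inner
    }
}

// Hypothetical helper: reads the value through `Deref`; it cannot modify it.
fn remaining(percent: &Percent) -> u8 {
    100 - **percent
}

fn main() {
    let mut percent = Percent::new(200);
    println!("clamped: {}", *percent);
    percent.set(30);
    println!("remaining: {}", remaining(&percent));
    // *percent = 250; // would not compile: `Percent` does not implement `DerefMut`
}
```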

Advanced Example - Using Deref to "ignore" generics

If you're familiar with the type-state pattern, you might have come across a situation where you need to store the possible states somewhere, which can be done by using enum variants:

pub struct Npc<T> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

pub struct Idle;
pub struct Charging { time_remaining: f32 }

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}

Imagine we have a variable of type NpcEnum, then in a specific context, we want to access its field name, but we don't really care about state.

A "brute-force" implementation could be done by defining a getter-method name(&self) -> &str that matches on NpcEnum to return such field:

impl NpcEnum {
    pub fn name(&self) -> &str {
        match self {
            NpcEnum::Idle(this) => &this.name,
            NpcEnum::Charging(this) => &this.name,
        }
    }
}

Although that is fine, you'll have a lot of maintenance to do whenever you add new states or new fields.

This can be solved by implementing the reference trait for NpcEnum:

impl Deref for NpcEnum {
    type Target = Npc<dyn std::any::Any>;
    
    fn deref(&self) -> &Self::Target {
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

// Note: since `dyn Any` does not implement `Sized`, 
// the implementation above requires relaxing the bounds on `T`:
pub struct Npc<T: ?Sized> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

With that implementation, we can access any fields/methods of Npc as long as those don't require state:

fn main() {
    let npc = Npc {
        name: "Houtamelo".to_string(),
        max_health: 69, //nice
        health: 7,
        state: Idle,
    };
    
    let npc_enum = NpcEnum::Idle(npc);
    let name = &npc_enum.name;
    let max_health = npc_enum.max_health;
    let health = npc_enum.health;
    
    println!(
        "Npc stats:\n\
         \tName: {name}\n\
         \tMax Health: {max_health}\n\
         \tHealth: {health}\n"
    );
}

This also means that adding more fields/states does not require any additional maintenance.
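The same applies to methods. As a sketch, assume a hypothetical method `health_ratio` (not part of the lesson) defined once for every state; it becomes callable directly on `NpcEnum` through the reference trait:

```rust
use std::any::Any;
use std::ops::Deref;

pub struct Npc<T: ?Sized> {
    name: String,
    max_health: i32,
    health: i32,
    state: T, // must stay the last field for the unsized coercion to work
}

pub struct Idle;
pub struct Charging { time_remaining: f32 }

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}

impl Deref for NpcEnum {
    type Target = Npc<dyn Any>;

    fn deref(&self) -> &Self::Target {
        match self {
            // Each `&Npc<State>` is coerced to `&Npc<dyn Any>`.
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

// Hypothetical method, written once for all states (including `dyn Any`).
impl<T: ?Sized> Npc<T> {
    pub fn health_ratio(&self) -> f32 {
        self.health as f32 / self.max_health as f32
    }
}

fn main() {
    let npc_enum = NpcEnum::Charging(Npc {
        name: "Houtamelo".to_string(),
        max_health: 40,
        health: 10,
        state: Charging { time_remaining: 1.5 },
    });
    println!("{} health ratio: {}", npc_enum.name, npc_enum.health_ratio());
}
```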

A word of caution

Deref is a tool, and like any other tool, it doesn't fit all cases.

Given a type T, implementing Deref can cause problems if both T and Target implement the same trait or have a method with the same signature.

Consider the example:

use std::rc::Rc;

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);
    let clone = rc.clone();
}

Both Rc<Vec<i32>> and Vec<i32> implement Clone. Since Rc<Vec<i32>> implements Deref<Target = Vec<i32>>, which implementation is being called here? If you check usage #1, you can deduce that Rc::clone is the one prioritized by the compiler, but that's still implicit, and it may be confusing for other people reading your code (or your future self).

However, having a few overlapping implementations between T and Target is almost impossible to avoid. In those cases, I recommend explicitly stating which implementation is being called:

use std::rc::Rc;

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);
    let cloned_rc = Rc::clone(&rc);
}
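To see that the two implementations really differ, the sketch below watches the reference count. The function `counts_after_clones` is made up for this illustration: cloning the Rc adds a handle, while cloning the Vec makes an independent copy and leaves the count alone.

```rust
use std::rc::Rc;

// Hypothetical helper: returns the strong count after each kind of clone.
fn counts_after_clones() -> (usize, usize) {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);

    // Clones the handle: both point at the same `Vec`.
    let _shared: Rc<Vec<i32>> = Rc::clone(&rc);
    let after_rc_clone = Rc::strong_count(&rc);

    // Clones the `Vec` itself: a separate allocation, count unchanged.
    let _copy: Vec<i32> = (*rc).clone();
    let after_vec_clone = Rc::strong_count(&rc);

    (after_rc_clone, after_vec_clone)
}

fn main() {
    let (after_rc_clone, after_vec_clone) = counts_after_clones();
    println!("after Rc::clone: {after_rc_clone}, after Vec clone: {after_vec_clone}");
}
```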

Check this section of the official book for more "words" of caution.

The end

I hope you learned something, any feedback is appreciated.

@yar999

Thanks for your reply, @Houtamelo.

I tested the source code from the section Advanced Example - Using Deref to "ignore" generics and found that it does not compile. I couldn't fix it, so I changed it to the above code.

use std::ops::Deref;

pub struct Idle;
pub struct Charging {
    time_remaining: f32,
}

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}

impl Deref for NpcEnum {
    type Target = dyn std::any::Any;

    fn deref(&self) -> &Self::Target {
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

// Note: since `dyn Any` does not implement `Sized`,
// the implementation above requires relaxing the bounds on `T`:
pub struct Npc<T: ?Sized> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

fn main() {
    let npc = Npc {
        name: "Houtamelo".to_string(),
        max_health: 69, //nice
        health: 7,
        state: Idle,
    };

    let npc_enum = NpcEnum::Idle(npc);
    let name = &npc_enum.name;
    let max_health = npc_enum.max_health;
    let health = npc_enum.health;

    println!(
        "Npc stats:\n\
         \tName: {name}\n\
         \tMax Health: {max_health}\n\
         \tHealth: {health}\n"
    );
}

The above code fails to compile with the following errors:

➜ cargo run
   Compiling yst v0.1.0 (/Users/ys/ws/rust/yst)
error[E0609]: no field `name` on type `NpcEnum`
  --> src/main.rs:42:26
   |
42 |     let name = &npc_enum.name;
   |                          ^^^^ unknown field

error[E0609]: no field `max_health` on type `NpcEnum`
  --> src/main.rs:43:31
   |
43 |     let max_health = npc_enum.max_health;
   |                               ^^^^^^^^^^ unknown field

error[E0609]: no field `health` on type `NpcEnum`
  --> src/main.rs:44:27
   |
44 |     let health = npc_enum.health;
   |                           ^^^^^^ unknown field

For more information about this error, try `rustc --explain E0609`.
error: could not compile `yst` (bin "yst") due to 3 previous errors

Oh, I put in the wrong type there, Target should be Npc<dyn Any>, not dyn Any:

impl Deref for NpcEnum {
    type Target = Npc<dyn std::any::Any>; // <<< the type was wrong
    
    fn deref(&self) -> &Self::Target {
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

I've updated the gist with this fix, thanks for reporting it!

Houtamelo/lesson_deref.md

linyihai commented Aug 6, 2024

Houtamelo commented Aug 7, 2024

yar999 commented Aug 7, 2024

Houtamelo commented Aug 8, 2024 (edited)

yar999 commented Aug 8, 2024

Houtamelo commented Aug 8, 2024 (edited)
