This lesson aims to teach the reader about the inner workings of Rust's `*` (deref operator) and the `Deref` trait, as well as how to take advantage of those features.
This lesson assumes that:
- You have a basic understanding of Rust's type system. Most importantly, you understand that references are types, meaning `&T`, `&mut T` and `T` are all different types.
A lot of terms used in this lesson include the word "reference". For disambiguation purposes, I'll explicitly list here how I'll refer to each term for the remainder of the lesson.
Original Term | Description | How it may be referred to |
---|---|---|
trait Deref | The Deref trait (std::ops::Deref) | reference trait |
Deref::Target | The associated type Target of the reference trait | Target, <T as Deref>::Target, <T as DerefMut>::Target |
* | The deref operator | deref operator |
&T | A shared reference to a variable of type T | &T, shared reference or borrowed value |
&mut T | A unique reference to a variable of type T | &mut, mutable reference or mutably borrowed value |
& | The borrow operator, used to create shared references | &, borrow or & operator |
&mut | The mutable borrow operator, used to create mutable references | &mut, mutably borrow or &mut operator |
pub trait Deref {
    /// The resulting type after dereferencing.
    type Target: ?Sized;

    /// Dereferences the value.
    fn deref(&self) -> &Self::Target;
}
With the trait being called `Deref`, it might be confusing to see that the function `deref` returns a reference, not a dereferenced value.
A different way of looking at it is:
Given a type `T`, implementing `Deref` for `T` is a way of telling the compiler that `&T` may also be referenced as `&Target`.
By itself, this trait doesn't do much, but the Rust compiler uses its implementations in multiple ways, which I'll explain later on.
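To make the "`&T` may also be referenced as `&Target`" idea concrete, here is a minimal sketch. The `Meters` wrapper and the `as_f64` helper are hypothetical names invented for this illustration; only `std::ops::Deref` comes from the standard library.

```rust
use std::ops::Deref;

/// Hypothetical wrapper type, used only for illustration.
pub struct Meters(pub f64);

impl Deref for Meters {
    type Target = f64;

    // Hands out a reference to the inner value; nothing is copied or moved.
    fn deref(&self) -> &f64 {
        &self.0
    }
}

/// Calls `deref()` explicitly: a `&Meters` is viewed as a `&f64`.
pub fn as_f64(distance: &Meters) -> &f64 {
    distance.deref()
}
```

Note that `as_f64` returns a reference pointing inside the `Meters` value itself, which is why `deref` returns `&Target` rather than `Target`.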
Type | Deref::Target |
---|---|
&T | T |
&mut T | T |
Rc<T> | T |
Box<T> | T |
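One way to see the table in action is a generic function bounded on `Deref`. This is only a sketch: `read_through` and `sum_of_rows` are hypothetical helpers, and the bound `Deref<Target = i32>` is an arbitrary choice for the example.

```rust
use std::ops::Deref;
use std::rc::Rc;

/// Accepts any pointer-like type from the table, as long as its `Target` is `i32`.
pub fn read_through<D: Deref<Target = i32>>(pointer: D) -> i32 {
    *pointer.deref()
}

/// Feeds one value per table row to `read_through` and sums the results.
pub fn sum_of_rows() -> i32 {
    let mut value: i32 = 4;
    let shared: &i32 = &1;              // &T     -> Target = T
    let boxed: Box<i32> = Box::new(2);  // Box<T> -> Target = T
    let counted: Rc<i32> = Rc::new(3);  // Rc<T>  -> Target = T
    let unique: &mut i32 = &mut value;  // &mut T -> Target = T

    read_through(shared) + read_through(boxed) + read_through(counted) + read_through(unique)
}
```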
Usage #1: implicitly invoking `deref()` to access fields or invoke methods on `Target`.
Consider the following example:
struct Vector2 { x: f32, y: f32 }

fn main() {
    // `x` and `y` are `f32`, so the literals must be floats.
    let boxed: Box<Vector2> = Box::new(Vector2 { x: 5.0, y: 10.0 });
    println!("X: {}", boxed.x);
}
The type `Box<Vector2>` does not have a field named `x`, yet this example compiles. Why is that?
A: Given a type `T`:
When accessing a field or invoking a method on a variable of type `T`, the Rust compiler will
first check if `T` has a field/method with the provided name.
If `T` does not have any fields/methods with the provided name, the compiler will then check
if `<T as Deref>::Target` has a field/method with the provided name, using that instead if it exists.
With that in mind, the example above generates code identical to:
use std::ops::Deref; // needed to call `deref()` explicitly

struct Vector2 { x: f32, y: f32 }

fn main() {
    let boxed: Box<Vector2> = Box::new(Vector2 { x: 5.0, y: 10.0 });
    println!("X: {}", boxed.deref().x);
}
The compiler allows us to omit the `deref()` invocation here.
The reference trait is quite useful for reducing boilerplate code. If it didn't exist, you would have to somehow get a reference to `T` (`Vector2` in the example above) every time you accessed fields or invoked methods on the value inside a `Box<T>`. The same applies to most wrapper types (`Rc`, `Arc`, `MutexGuard`, etc.).
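The lookup described above repeats for as long as the current type implements `Deref`, so it also works through several layers of wrappers. A small sketch, where the `length_squared` method and the `through_two_wrappers` function are hypothetical additions for this example:

```rust
use std::rc::Rc;

pub struct Vector2 { pub x: f32, pub y: f32 }

impl Vector2 {
    pub fn length_squared(&self) -> f32 {
        self.x * self.x + self.y * self.y
    }
}

/// Neither `Rc` nor `Box` has an `x` field or a `length_squared` method, so the
/// compiler derefs twice (`Rc` -> `Box` -> `Vector2`) before finding them.
pub fn through_two_wrappers() -> (f32, f32) {
    let nested: Rc<Box<Vector2>> = Rc::new(Box::new(Vector2 { x: 3.0, y: 4.0 }));
    (nested.x, nested.length_squared())
}
```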
Usage #2: implicitly invoking `Deref::deref()` (or `DerefMut::deref_mut()`) when using the operators `&` (borrow) or `&mut` (mutably borrow).
Consider the following example:
fn main() {
    let boxed: Box<i32> = Box::new(5);
    takes_int(&boxed);
}

fn takes_int(input: &i32) {
    println!("Input: {input}");
}
The function `takes_int` requires an input of type `&i32`, but we are calling it with an argument of type `&Box<i32>`, yet this example compiles. Why is that?
A: Given a type `T`:
When using the operators `&(borrow)` or `&mut(mutably borrow)`, the compiler will check if the type
`&T`(or `&mut T` if using the `&mut operator`) satisfies the type "conditions" in that specific context.
If the type `&T` does not satisfy those conditions, the compiler will then check if `&Target` (or
`&mut Target` if using the `&mut operator`) satisfies the conditions, using that instead if it does.
With that in mind, the example above generates code identical to:
use std::ops::Deref; // needed to call `deref()` explicitly

fn main() {
    let boxed: Box<i32> = Box::new(5);
    takes_int(boxed.deref());
}

fn takes_int(input: &i32) {
    println!("Input: {input}");
}
The compiler replaces the `&` operator with an invocation of `<Box<i32> as Deref>::deref()`.
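The same rule covers other common cases, including chains of more than one deref and the `&mut` operator. The sketch below uses hypothetical functions (`shout`, `double_in_place`, `coercion_demo`) purely for illustration:

```rust
/// Accepts a string slice; callers may pass `&String` or `&Box<String>`.
pub fn shout(text: &str) -> String {
    text.to_uppercase()
}

/// Accepts a unique reference to an `i32`.
pub fn double_in_place(value: &mut i32) {
    *value *= 2;
}

pub fn coercion_demo() -> (String, String, i32) {
    let owned = String::from("hello");
    let boxed_string: Box<String> = Box::new(String::from("nested"));
    let mut boxed_int: Box<i32> = Box::new(21);

    let a = shout(&owned);           // &String      -> &str
    let b = shout(&boxed_string);    // &Box<String> -> &String -> &str
    double_in_place(&mut boxed_int); // &mut Box<i32> -> &mut i32, via `DerefMut`

    (a, b, *boxed_int)
}
```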
Unlike with the `&` and `&mut` operators, `*` (deref) interacts directly with the reference trait. As such, it can only be used on types that implement that trait.
In an oversimplified way, the `*` (deref) operator does the following, given a type `T`:
- Invokes `<T as Deref>::deref()`, which returns a variable of type `&Target`.
- Accesses the value that the returned reference points to, which will be of type `Target`.
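The two steps above can be spelled out by hand. This is a rough sketch of the equivalence rather than the compiler's actual lowering, and `operator_vs_explicit_call` is a hypothetical function name:

```rust
use std::ops::Deref;
use std::rc::Rc;

pub fn operator_vs_explicit_call() -> (usize, usize) {
    let shared: Rc<Vec<i32>> = Rc::new(vec![1, 2, 3]);

    // What the operator does for us...
    let via_operator: usize = (*shared).len();
    // ...spelled out: call `deref()`, then follow the returned reference.
    let via_method: usize = (*Deref::deref(&shared)).len();

    (via_operator, via_method)
}
```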
Usage #3: given a type `T` that implements the reference trait, the `*` (deref) operator can, in some cases, be used to move `Target` out of a variable of type `T`.
Consider the example:
fn main() {
    let boxed: Box<&str> = Box::new("boxed str");
    let deref_result: &str = *boxed;
}
Given any type `T`, `Box<T>` implements `Deref<Target = T>`. This means that using the `*` (deref) operator on `Box<T>` will return a variable of type `T`, essentially opening the box.
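Moving a value out this way is special to `Box<T>`. For every other type that implements the reference trait (`&T`, `&mut T`, `Rc<T>`, `Arc<T>`, etc.), the compiler rejects moving a non-`Copy` `Target` out through `*`. In particular, moving out of references is forbidden (in other words: you can only move out of variables you own).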
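A short sketch of both situations, using a non-`Copy` `String` so that the move is a real move. The function names are hypothetical; `std::mem::take` is a standard library function and is shown here as one possible workaround when you only hold a `&mut` reference, not as the only option.

```rust
pub fn open_the_box() -> String {
    let boxed: Box<String> = Box::new(String::from("boxed string"));
    // `String` is not `Copy`, so this is a real move: `boxed` cannot be used afterwards.
    let moved: String = *boxed;
    moved
}

pub fn take_from_reference(slot: &mut String) -> String {
    // `let moved = *slot;` would be rejected here: we don't own the `String`.
    // `std::mem::take` leaves `String::default()` (an empty string) behind
    // and returns the previous value instead.
    std::mem::take(slot)
}
```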
Usage #4: given a type `T` that implements `DerefMut`, the `*` (deref) operator can be used to replace the value `Target` inside a variable of type `T`. The same applies to any of the assignment operators (`=`, `+=`, `-=`, etc.).
Consider the example:
fn increment(input: &mut i32) {
    *input += 1;
}
We know that for any given `T`, `&mut T` implements `Deref<Target = T>` and `DerefMut`.
In the example above, the `*` (deref) operator is used to replace the value of type `i32` that `input` points to.
This isn't limited to the implementation of `&mut T`. Consider the example:
fn main() {
    let mut boxed: Box<Vec<i32>> = Box::new(vec![2, 5]);
    *boxed = vec![3, 4]; // valid
    // *boxed = Box::new(vec![3, 5]); // invalid! We can only replace `Target`
}
Given any type `T`, `Box<T>` implements `DerefMut<Target = T>`. This means we can use the deref operator here to replace the value `Target` inside a variable of type `Box<T>`.
Since we are mutating variables, usage #4 requires the outermost type to be mutable. Mutable references are inherently mutable, so their bindings don't need to be preceded by the keyword `mut` (as in `let mut mut_ref = &mut 5;`).
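Usage #4 also applies to your own types once they implement `DerefMut`. Below is a minimal sketch with a hypothetical `Wrapper<T>` type, written only for this illustration:

```rust
use std::ops::{Deref, DerefMut};

/// Hypothetical wrapper, used only for illustration.
pub struct Wrapper<T> { inner: T }

impl<T> Deref for Wrapper<T> {
    type Target = T;
    fn deref(&self) -> &T { &self.inner }
}

impl<T> DerefMut for Wrapper<T> {
    // `Target` is inherited from `Deref`; only the method is new.
    fn deref_mut(&mut self) -> &mut T { &mut self.inner }
}

pub fn replace_and_add() -> i32 {
    // The binding must be `mut`, because assigning through `*` borrows it mutably.
    let mut wrapped = Wrapper { inner: 1 };
    *wrapped = 10; // calls `deref_mut()`, then overwrites `Target`
    *wrapped += 5; // compound assignment works the same way
    *wrapped       // `i32` is `Copy`, so this reads the value back out
}
```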
Usage #5: given a type `T` that implements `Deref`, where `Deref::Target` implements `Copy`, the `*` (deref) operator can be used on `T` to get a copy (clone) of `Target`.
Consider the example:
fn main() {
    let x: &i32 = &10;
    let cloned_x: i32 = *x;
    println!("X: {x}, Clone: {cloned_x}");
}
Since `&i32` implements `Deref<Target = i32>`, we can use the `*` (deref) operator to get a clone of `Target` (`i32`), in this case, the value `10`.
With this in mind, the following example generates code identical to the above:
use std::ops::Deref; // needed to call `deref()` explicitly

fn main() {
    let x: &i32 = &10;
    let cloned_x: i32 = x.deref().clone();
    println!("X: {x}, Clone: {cloned_x}");
}
Usage #5 is not exclusive to references; the following example is also valid:
fn main() {
    let boxed_int: Box<i32> = Box::new(7);
    let cloned_int: i32 = *boxed_int;
    println!("Boxed: {boxed_int}, Clone: {cloned_int}");
}
Just like `&i32`, `Box<i32>` implements `Deref<Target = i32>`, which allows us to use the `*` (deref) operator to clone `Target`.
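The `Copy` requirement matters: when `Target` is not `Copy`, the clone has to be requested explicitly. A sketch contrasting the two cases with `Rc` (the function names are hypothetical):

```rust
use std::rc::Rc;

pub fn copy_twice() -> (u32, u32, usize) {
    let shared: Rc<u32> = Rc::new(7);
    let first: u32 = *shared;
    let second: u32 = *shared; // still usable: nothing was moved
    (first, second, Rc::strong_count(&shared))
}

pub fn clone_explicitly() -> String {
    let shared: Rc<String> = Rc::new(String::from("text"));
    // let moved: String = *shared; // rejected: `String` is not `Copy`
    // `&shared` is a `&Rc<String>`, which coerces to the `&String` that `String::clone` expects.
    String::clone(&shared)
}
```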
Suppose we want to ensure an integer is always between 0 and 100. We can easily enforce that by wrapping the integer in a type with a private field:
pub struct Percent {
    inner: u8,
}

impl Percent {
    pub fn new(value: u8) -> Percent {
        // enforce that inner is always between 0 and 100
        let inner = u8::clamp(value, 0, 100);
        Percent { inner }
    }

    // We can provide a public `set` method to allow users to mutate inner, while still enforcing the bounds
    pub fn set(&mut self, value: u8) {
        *self = Self::new(value); // Quietly using #4 on `&mut Percent`
    }
}
But how would the user access `inner`?
A basic implementation would be adding a `get` method to provide read-only access to it:
impl Percent {
    pub fn get(&self) -> u8 {
        self.inner
    }
}
Although that is fine, if this type is frequently used, invoking `get()` can quickly get verbose and tiring.
We can solve this by implementing `Deref`, taking advantage of compiler usages #1, #2, and #5:
use std::ops::Deref;

impl Deref for Percent {
    type Target = u8;

    fn deref(&self) -> &Self::Target {
        &self.inner
    }
}
This then gives users a read-only view of `Percent::inner`:
fn main() {
    let percent = Percent::new(200);

    // Usage #1: the compiler invokes `deref` to access `Target`, which we can invoke `u8::count_zeros()` on
    let zeros = percent.count_zeros();

    // Usage #2: the compiler invokes `deref` to borrow `Target` to the function `print_int()`
    print_int(&percent);

    // Usage #5: we copy `Target`, which gives us a `u8` that we can compare to `100`
    if *percent == 100 {
        println!("Maximum percent!");
    }

    // Usage #5 again: `u8` is `Copy`, so `*percent` copies `Target` and `percent` stays usable
    let inner: u8 = *percent;
}

fn print_int(int: &u8) {
    println!("Int: {int}");
}
In this example, it's important to note that we do not want to implement `DerefMut` for `Percent`, since this would allow anyone to replace `Percent::inner`, bypassing the constraints enforced in `Percent::new()`.
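Putting the pieces together, the sketch below checks that the bound survives a round trip through `set` and a read through `Deref`. It restates `Percent` so that it compiles on its own; `clamped_roundtrip` is a hypothetical helper added for this example.

```rust
use std::ops::Deref;

pub struct Percent { inner: u8 }

impl Percent {
    pub fn new(value: u8) -> Percent {
        Percent { inner: value.clamp(0, 100) }
    }

    pub fn set(&mut self, value: u8) {
        *self = Self::new(value);
    }
}

impl Deref for Percent {
    type Target = u8;
    fn deref(&self) -> &u8 { &self.inner }
}

// With an `impl DerefMut for Percent`, callers could write `*percent = 200;`
// and skip the clamp in `new`. Without it, that line does not compile.

/// Reads go through `Deref`; writes must go through `set`.
pub fn clamped_roundtrip(requested: u8) -> u8 {
    let mut percent = Percent::new(0);
    percent.set(requested);
    *percent
}
```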
If you're familiar with the type-state pattern, you might have come across a situation where you need to store the possible states somewhere, which can be done by using enum variants:
pub struct Npc<T> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}

pub struct Idle;
pub struct Charging { time_remaining: f32 }

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}
Imagine we have a variable of type `NpcEnum`, and in a specific context we want to access its field `name`, but we don't really care about `state`.
A "brute-force" implementation could be done by defining a getter method `name(&self) -> &str` that matches on `NpcEnum` to return that field:
impl NpcEnum {
    pub fn name(&self) -> &str {
        match self {
            NpcEnum::Idle(this) => &this.name,
            NpcEnum::Charging(this) => &this.name,
        }
    }
}
Although that is fine, you'll have a lot of maintenance to do whenever you add new states or new fields.
This can be solved by implementing the reference trait for `NpcEnum`:
use std::ops::Deref;

impl Deref for NpcEnum {
    type Target = Npc<dyn std::any::Any>;

    fn deref(&self) -> &Self::Target {
        // Each `&Npc<State>` is coerced to `&Npc<dyn Any>`.
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}
// Note: since `dyn Any` does not implement `Sized`,
// the implementation above requires relaxing the bounds on `T`:
pub struct Npc<T: ?Sized> {
    name: String,
    max_health: i32,
    health: i32,
    state: T,
}
With that implementation, we can access any fields/methods of `Npc` as long as those don't require `state`:
fn main() {
    let npc = Npc {
        name: String::from("Houtamelo"),
        max_health: 69, //nice
        health: 7,
        state: Idle,
    };

    let npc_enum = NpcEnum::Idle(npc);
    let name = &npc_enum.name;
    let max_health = npc_enum.max_health;
    let health = npc_enum.health;

    println!(
        "Npc stats:\n\
        \tName: {name}\n\
        \tMax Health: {max_health}\n\
        \tHealth: {health}\n"
    );
}
This also means that adding more fields/states does not require any additional maintenance.
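The same holds for methods, provided they are defined for every `T`, including the unsized `dyn Any`. The sketch below restates the types so it compiles on its own. The methods `is_alive` and `missing_health` and the function `summary` are hypothetical additions, and the fields are made `pub` only to keep the example easy to construct.

```rust
use std::any::Any;
use std::ops::Deref;

pub struct Npc<T: ?Sized> {
    pub name: String,
    pub max_health: i32,
    pub health: i32,
    pub state: T,
}

// Methods that never touch `state` are written once, for every `T`.
impl<T: ?Sized> Npc<T> {
    pub fn is_alive(&self) -> bool { self.health > 0 }
    pub fn missing_health(&self) -> i32 { self.max_health - self.health }
}

pub struct Idle;
pub struct Charging { pub time_remaining: f32 }

pub enum NpcEnum {
    Idle(Npc<Idle>),
    Charging(Npc<Charging>),
}

impl Deref for NpcEnum {
    type Target = Npc<dyn Any>;

    fn deref(&self) -> &Self::Target {
        match self {
            NpcEnum::Idle(this) => this,
            NpcEnum::Charging(this) => this,
        }
    }
}

/// Fields and methods are both reached through `Deref`, whatever the variant.
pub fn summary(npc: &NpcEnum) -> (String, bool, i32) {
    (npc.name.clone(), npc.is_alive(), npc.missing_health())
}
```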
`Deref` is a tool, and like any other tool, it doesn't fit all cases.
Given a type `T`, implementing `Deref` can cause problems if both `T` and `Target` implement the same trait or have a method with the same name.
Consider the example:
use std::rc::Rc;

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);
    let clone = rc.clone();
}
Both `Rc<Vec<i32>>` and `Vec<i32>` implement `Clone`. Since `Rc<Vec<i32>>` implements `Deref<Target = Vec<i32>>`, which implementation is being called here? If you check usage #1, you can deduce that `Rc::clone` is the one prioritized by the compiler, but that's still implicit, and it may be confusing for other people reading your code (or your future self).
However, having a few overlapping implementations between `T` and `Target` is almost impossible to avoid. In those cases, I recommend explicitly stating which implementation is being called:
use std::rc::Rc;

fn main() {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);
    let cloned_rc = Rc::clone(&rc);
}
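The difference between the two implementations is observable. In this sketch (the function name `two_kinds_of_clone` is hypothetical), cloning the `Rc` shares the allocation and bumps the reference count, while cloning `Target` produces an independent vector:

```rust
use std::rc::Rc;

pub fn two_kinds_of_clone() -> (usize, bool, bool) {
    let rc: Rc<Vec<i32>> = Rc::new(vec![5, 3]);

    // Clones the pointer: same allocation, reference count goes from 1 to 2.
    let cloned_rc: Rc<Vec<i32>> = Rc::clone(&rc);
    // Derefs first, so method lookup starts at `Vec<i32>` and clones `Target`.
    let cloned_vec: Vec<i32> = (*rc).clone();

    let shares_allocation = Rc::ptr_eq(&rc, &cloned_rc);
    let vec_is_separate = !std::ptr::eq(&*rc, &cloned_vec);
    (Rc::strong_count(&rc), shares_allocation, vec_is_separate)
}
```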
Check this section of the official book for more "words" of caution.
I hope you learned something. Any feedback is appreciated.