In Rust today, we have distinct primitive types for all integers: u8, u16, i64, usize, etc.
This works fine, and serves well as a way of handling numbers. Just about every systems language under the sun does this, and it works.
But we could do better.
With the advent of const generics in Rust 1.51, we could add a unifying integer type that handles all of these at once: uint (and int for signed types).
In this theoretical version of the language, u8 would simply be a type alias for uint::<8>, and the same goes for everything else.
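To make the idea concrete, here is a rough sketch of how such a type could be approximated in today's Rust with a const generic wrapper. The name Uint, the u64 backing storage, and the 64-bit upper limit are assumptions made for this sketch only; the proposal itself would be a compiler-level primitive.

```rust
/// Hypothetical stand-in for the proposed `uint::<N>`: an unsigned integer
/// that is N bits wide (N <= 64 in this sketch), stored in a u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Uint<const N: u32>(u64);

impl<const N: u32> Uint<N> {
    /// Bit mask with the low N bits set.
    const MASK: u64 = if N >= 64 { u64::MAX } else { (1u64 << N) - 1 };

    /// Largest value representable in N bits.
    pub const MAX: Self = Self(Self::MASK);

    /// Builds a value, keeping only the low N bits (like a truncating cast).
    pub const fn new(value: u64) -> Self {
        Self(value & Self::MASK)
    }

    /// Returns the stored value as a u64.
    pub const fn get(self) -> u64 {
        self.0
    }

    /// Addition that wraps around at 2^N, written once for every width.
    pub const fn wrapping_add(self, rhs: Self) -> Self {
        Self(self.0.wrapping_add(rhs.0) & Self::MASK)
    }
}

// Plain aliases, in the spirit of what the text proposes for u8 and friends.
pub type U8 = Uint<8>;
pub type U4 = Uint<4>;
```

With this, U8::new(250).wrapping_add(U8::new(10)) wraps around exactly like a real u8 would, and the same method body serves U4 and every other width.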
usize would alias to uint::<mem::POINTER_SIZE>, but there's a problem with this.
Depending on your system, usize could then be the very same type as uint::<64>, so code that mixes usize and u64 would compile on a 64-bit target and break on a 32-bit one.
That adds a very prominent portability footgun to the language.
Because of this, I propose new syntax: a "distinct" type alias. People familiar with C3 may know about this.[1]
Distinct type aliases are, following the name, distinct from the type they alias to.
They inherit all of the methods and functionality, but they are not the same type.
This also allows you to impl on a distinct type alias.
I can imagine the syntax for this going something like this:
distinct type usize = uint::<mem::POINTER_SIZE>;
You would also be able to cast with as between a distinct type alias and its underlying type if need be.
This doesn't break anything with integers, as you can already cast between them with as at will.
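For comparison, the closest thing in today's Rust is the newtype pattern. The sketch below contrasts it with a plain alias; PointerSized, is_page_aligned, and takes_u64 are hypothetical names invented for the illustration. Note one difference from the proposal: a newtype does not inherit the methods of the type it wraps, so anything you want has to be forwarded by hand.

```rust
// A plain alias is the same type as its target.
pub type Plain = u64;

// A newtype is a separate type wrapping the underlying one.
// `PointerSized` is a hypothetical name used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct PointerSized(pub u64);

impl PointerSized {
    // Inherent methods can be added, because this is our own type.
    pub fn is_page_aligned(self) -> bool {
        self.0 % 4096 == 0
    }
}

// Explicit conversions stand in for the proposed `as` casts.
impl From<u64> for PointerSized {
    fn from(value: u64) -> Self {
        Self(value)
    }
}

impl From<PointerSized> for u64 {
    fn from(value: PointerSized) -> Self {
        value.0
    }
}

// Accepts a u64 (and therefore a `Plain`), but not a `PointerSized`.
pub fn takes_u64(x: u64) -> u64 {
    x + 1
}
```

Calling takes_u64(PointerSized(1)) is a type error, while takes_u64(u64::from(PointerSized(1))) compiles. That is the kind of separation a distinct alias for usize would keep.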
Along with this, since you can implement methods on specific values of generic types, things like from_le_bytes don't need to go away. It could in fact be implemented for all uint::<N> where N % 8 == 0, but it would likely be easiest to impl it for uint::<32> and such.
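Implementing methods for one specific value of a const parameter already works on stable Rust. Here is a minimal sketch, again using a hypothetical Uint wrapper backed by a u64 rather than a real primitive. A bound like N % 8 == 0 cannot be written directly on stable Rust today, which is one reason the per-width impl is the easier route.

```rust
/// Hypothetical stand-in for `uint::<N>`, stored in a u64.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Uint<const N: u32>(u64);

// Methods shared by every width.
impl<const N: u32> Uint<N> {
    pub fn get(self) -> u64 {
        self.0
    }
}

// An impl block for one specific value of the const parameter.
impl Uint<32> {
    pub fn from_le_bytes(bytes: [u8; 4]) -> Self {
        Self(u32::from_le_bytes(bytes) as u64)
    }

    pub fn to_le_bytes(self) -> [u8; 4] {
        (self.0 as u32).to_le_bytes()
    }
}

// A second width can reuse the method name, since the impls do not overlap.
impl Uint<16> {
    pub fn from_le_bytes(bytes: [u8; 2]) -> Self {
        Self(u16::from_le_bytes(bytes) as u64)
    }
}
```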
Let's take a step back. All this seems pretty neat, but what's the benefit here? It's a big jump in complexity in the language, and might not be intuitive for some.
The power of this is twofold.
On the one hand, we can now implement methods for every integer type at the same time. This reduces the amount of repeated code, which is always a good thing.
On the other, this allows non-power-of-two integer types.
These types are supported by LLVM, which means there wouldn't need to be too much work to get them in on the lower level of Rust - but there would have to be some careful handling on higher levels.
Imagine a struct like this [2]:
#[repr(packed(8))]
pub struct LightInfo {
    pub is_light_info: bool,
    pub is_lamp_color: bool,
    _padding: uint::<2>,
    pub brightness: uint::<4>
}
This is something that isn't possible natively in Rust. There's an equivalent form to this in C++:
typedef struct LightInfo {
    bool isLightInfo: 1;
    bool isLampColor: 1;
    unsigned: 2; // padding
    unsigned int brightness: 4;
} LightInfo;
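In today's Rust, the usual workaround is to store the whole thing in a u8 and do the shifting and masking by hand. The sketch below assumes a layout with the first field in the lowest bit; that bit order is an assumption of this example (C++ leaves bit-field layout up to the implementation), and the accessor names are made up for the illustration.

```rust
/// One byte holding the same fields as the `LightInfo` sketch above.
/// Assumed layout (lowest bit first): bit 0 = is_light_info,
/// bit 1 = is_lamp_color, bits 2-3 = padding, bits 4-7 = brightness.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct LightInfo(u8);

impl LightInfo {
    /// Packs the fields into one byte; brightness is truncated to 4 bits.
    pub fn new(is_light_info: bool, is_lamp_color: bool, brightness: u8) -> Self {
        let bits = (is_light_info as u8)
            | ((is_lamp_color as u8) << 1)
            | ((brightness & 0x0F) << 4);
        Self(bits)
    }

    pub fn is_light_info(self) -> bool {
        self.0 & 0b0000_0001 != 0
    }

    pub fn is_lamp_color(self) -> bool {
        self.0 & 0b0000_0010 != 0
    }

    /// The upper four bits, as a value in 0..=15.
    pub fn brightness(self) -> u8 {
        self.0 >> 4
    }

    pub fn to_byte(self) -> u8 {
        self.0
    }
}
```

All of this boilerplate is what a native uint::<4> field would make unnecessary.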
This would allow for new avenues, e.g. bitstruct enums.
Below is an example of enums representing an instruction from the Overture ISA, from the game Turing Complete [3]:
#[repr(uint(3))]
pub enum Register { R0, R1, R2, R3, R4, R5, IO }

#[repr(uint(3))]
pub enum AluInstruction { OR, NAND, NOR, AND, ADD, SUB }

#[repr(uint(3))]
pub enum Condition { Never, EqZero, LtZero, LeqZero, Always, NeqZero, GeqZero, GtZero }

// Each discriminant is a bit prefix; the fields fill the remaining bits of the byte.
// The prefixes must not overlap, so each variant gets its own top two bits.
#[repr(packed(8))]
pub enum Instruction {
    Immediate { value: uint::<6> } = 0b00u2,
    Calculate { alu_code: AluInstruction } = 0b01000u5,
    Copy { source: Register, destination: Register } = 0b10u2,
    Branch { condition: Condition } = 0b11000u5
}
As far as I'm aware, no other language has done this yet.
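For contrast, this is roughly what decoding such a byte looks like by hand in today's Rust. The sketch assumes the top two bits select the variant (00 immediate, 01 calculate, 10 copy, 11 branch) and that calculate and branch require bits 3 to 5 to be zero. That encoding, the decode function, and the use of plain u8 fields in place of the 3-bit enums are all assumptions of this example.

```rust
/// Decoded instruction; fields are raw bit values in this sketch.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Instruction {
    Immediate { value: u8 },
    Calculate { alu_code: u8 },
    Copy { source: u8, destination: u8 },
    Branch { condition: u8 },
}

/// Decodes one byte. Assumed encoding: the top two bits pick the variant
/// (00 immediate, 01 calculate, 10 copy, 11 branch); calculate and branch
/// additionally require bits 3-5 to be zero.
pub fn decode(byte: u8) -> Option<Instruction> {
    let low3 = byte & 0b111;
    let mid3 = (byte >> 3) & 0b111;
    match byte >> 6 {
        0b00 => Some(Instruction::Immediate { value: byte & 0b11_1111 }),
        0b01 if mid3 == 0 => Some(Instruction::Calculate { alu_code: low3 }),
        0b10 => Some(Instruction::Copy { source: mid3, destination: low3 }),
        0b11 if mid3 == 0 => Some(Instruction::Branch { condition: low3 }),
        // Calculate or branch with non-zero filler bits: not a valid encoding.
        _ => None,
    }
}
```

With bit-prefix discriminants in the language, the compiler could generate this match, including the invalid-encoding case, from the enum definition alone.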
There's danger in this, though. Imagine something like this:
#[repr(packed(65))]
pub struct Adversarial {
    pub offset: bool,
    pub misaligned: u64
}
Not only would misaligned not be byte aligned, but it wouldn't even be bit aligned.
There's a real danger in having types cross byte boundaries, so I can imagine this would have to be explicitly disallowed by the compiler.
This makes generic structs with integers kind of annoying, as this:
pub struct Vector2I<const N: usize> { pub x: int::<N>, pub y: int::<N> }
cannot be #[repr(packed)], as there's a possibility of the values flowing off a byte, so things like Vector2I<4> would have to take up at minimum 2 bytes.
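The size arithmetic behind that claim can be sketched in a few lines. The helper names bytes_for_bits and vector2i_bytes are made up for this example, and the rule that every field starts on a byte boundary is the assumption taken from the paragraph above.

```rust
/// Bytes needed to hold `bits` bits, rounding up to a whole byte.
pub const fn bytes_for_bits(bits: usize) -> usize {
    (bits + 7) / 8
}

/// Size of a two-field `Vector2I<N>` when each `int::<N>` field must start
/// on a byte boundary, so no value can flow off a byte.
pub const fn vector2i_bytes(n: usize) -> usize {
    2 * bytes_for_bits(n)
}
```

Under that rule vector2i_bytes(4) is 2, even though the two 4-bit fields would fit into a single byte if packing were allowed. The same helper shows why the 65-bit Adversarial struct is awkward: bytes_for_bits(65) is 9.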
Realizing now that the way I said it, distinct type aliases are equivalent to a subset of OOP inheritance :L