Modern ID Spec

An adaptable, human-friendly, web-safe, unique ID spec for modern applications.

Guiding Principles

Short
Human friendly
URL-safe
Developer experience
Realistically unique

Why Not UUIDs

UUIDs force you into using the worst-case ID for all your tables.

You only need high-entropy IDs for sensitive data or large-volume tables. (Most of your tables are low volume.)
Hard to copy paste (not human friendly)
Impossible to read out (not human friendly)
Bad for debugging (currently still done by humans)
Make URLs ugly af
Complete overkill for most use cases
Saving 16 bytes per row is not worth all of the grossness.
Really designed for distributed systems, not web apps.

Why Not Sequential Integers

always guessable
bad for scaling databases horizontally
can't be created client-side
can't be used for high-volume tables
it's not the 2000s anymore

The Modern ID

Traditionally we've chosen between sequential integers and (G)UUIDs for identifiers.

All credit goes to Stripe for pioneering this ID style, but it hasn't gained enough momentum, so here I am.

Modern IDs:

Use the {prefix}_{suffix} format (see below)
Contain a short prefix that maps to a table/entity (yes, this is stored in the database, see Pros)
Forgo marginal disk-space savings in favor of developer experience, user friendliness, and enhanced product design patterns
Can still guarantee uniqueness for large tables (see below)

Format

{prefix}_{suffix}

Prefix: 1-2 lowercase alphabet characters, maps to the table/entity
Suffix: 4-32 alphanumeric characters

Format: [a-z]{1,2}_[a-zA-Z0-9]{4,32}

Example: m_CWZpkWfq or t_2rw2FzZB or u_9d2F

A max suffix length of 32 is chosen because at that point we can switch to using hyphen-less UUIDs for the suffix.
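
The format above translates directly into an anchored regular expression. The following is a minimal sketch of a format check, assuming a helper named `parseModernID` (a hypothetical name, not part of the spec) that splits a well-formed ID into its two parts:

```typescript
// Pattern from the Format section: 1-2 lowercase letters, "_", 4-32 alphanumerics.
// The ^ and $ anchors make sure the whole string is the ID, not just part of it.
const MODERN_ID_PATTERN = /^[a-z]{1,2}_[a-zA-Z0-9]{4,32}$/;

// Hypothetical helper: returns the prefix and suffix of a well-formed ID, or null.
function parseModernID(id: string): { prefix: string; suffix: string } | null {
  if (!MODERN_ID_PATTERN.test(id)) return null;
  const separator = id.indexOf("_");
  return { prefix: id.slice(0, separator), suffix: id.slice(separator + 1) };
}

const parsed = parseModernID("m_CWZpkWfq"); // prefix "m", suffix "CWZpkWfq"
const rejected = parseModernID("u_9d2");    // null: suffix is shorter than 4 characters
```

This only checks the shape of the ID. Whether the prefix maps to a real table is a separate lookup against your own configuration.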

Configuration & Rules

We keep a mapping of table -> prefix and suffix length.
Each table should have a suffix length chosen for its volume and sensitivity.
Low-volume tables can go as low as 4-character suffixes, while high-volume/sensitive tables should use 32-character suffixes (essentially prefixed UUIDs).
Table prefixes should never change.
A table's suffix length can grow as the table grows. Increasing the length gives you a whole new set of IDs, since IDs of different lengths never conflict (e.g. [A-Z]{4} !== [A-Z]{5}).
Since uniqueness is not guaranteed for short suffixes, retries should be handled in the database, server, or client, depending on your use case.
When uniqueness is required (e.g. one-time event firing), use 32-character suffixes.
Easy-to-confuse characters like "O" should be omitted from suffix generators (see NanoID).
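
As a rough illustration of the rules above, here is a sketch of a suffix generator that leaves out look-alike characters. The exact alphabet (no 0/O, 1/I/l), the function names, and the use of Node's `crypto.randomInt` are all assumptions made for this example; the spec itself only says confusing characters should be omitted.

```typescript
import { randomInt } from "node:crypto";

// Assumed alphabet: alphanumerics minus the look-alikes 0/O and 1/I/l.
const SAFE_ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Builds a random suffix of the requested length (4-32, as in the Format section).
function randomSuffix(length: number): string {
  if (!Number.isInteger(length) || length < 4 || length > 32) {
    throw new RangeError("suffix length must be an integer from 4 to 32");
  }
  let suffix = "";
  for (let i = 0; i < length; i++) {
    // randomInt(max) returns a uniformly distributed integer in [0, max)
    suffix += SAFE_ALPHABET.charAt(randomInt(SAFE_ALPHABET.length));
  }
  return suffix;
}

// Hypothetical helper that joins a table prefix and a fresh suffix.
function newModernID(prefix: string, length: number): string {
  return `${prefix}_${randomSuffix(length)}`;
}

const workspaceID = newModernID("w", 4); // something like w_tuy5
```

A smaller alphabet means fewer possible IDs per length, so the collision math should be done with the alphabet size you actually use.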

Pros

Beautiful URLs out of the box (e.g. https://app.com/m_CWZpkWfq)
ID sizes are chosen for each table and hence can be kept as short as possible (human friendly)
Looking at an ID tells you what you are looking at and where to find it (useful for apps and humans alike).
Web app routing paths no longer need extra path segments (e.g. https://app.com/workspace/w_tuy5 -> https://app.com/w_tuy5 )
Web apps can support mobile-style routing by pushing another ID onto the URL while maintaining the stack (e.g. https://app.com/w_tuy5 -> https://app.com/w_tuy5/m_rDyI0yXjt)
Growing IDs over time gives you a bigger set of possible IDs while keeping them short: the set is all possible 4-character suffixes plus all 5-character suffixes plus all 6-character suffixes (e.g. t_abcd, t_abcde, t_abcdef).
Can still effectively guarantee uniqueness with a 32-character suffix (falling back to UUID generators for the suffix internally).
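
To make the routing points concrete, here is a sketch of how an app might turn a stacked URL path into typed entities using only the prefixes. The `ENTITY_BY_PREFIX` map and `parseRouteStack` function are hypothetical names introduced for this example:

```typescript
// Hypothetical prefix -> entity mapping for the example URLs above.
const ENTITY_BY_PREFIX: Record<string, string> = { w: "workspace", m: "message" };

const ID_SHAPE = /^([a-z]{1,2})_[a-zA-Z0-9]{4,32}$/;

type StackEntry = { type: string; id: string };

// Turns "/w_tuy5/m_rDyI0yXjt" into an ordered stack of typed entities.
function parseRouteStack(pathname: string): StackEntry[] {
  return pathname
    .split("/")
    .filter((segment) => segment.length > 0)
    .map((id) => {
      const match = ID_SHAPE.exec(id);
      const type = match ? ENTITY_BY_PREFIX[match[1] ?? ""] : undefined;
      if (type === undefined) throw new Error(`not a known ID: "${id}"`);
      return { type, id };
    });
}

const stack = parseRouteStack(new URL("https://app.com/w_tuy5/m_rDyI0yXjt").pathname);
// stack[0] is the workspace w_tuy5, stack[1] is the message m_rDyI0yXjt
```

Because each segment identifies its own entity type, the router needs no `/workspace/` or `/message/` path segments to know what to load.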

Proposed TypeScript Utility

// A single Modern ID definition
type Mapping<T> = {
  type: T;
  prefix: string;
  size?: number; // suffix length; defaults to 8, a good middle ground
};

// A Modern ID is a plain string in the {prefix}_{suffix} format
type ID = string;

// Type-safe configuration of your tables -> prefixes with optional suffix sizes
export declare function configure<T extends string>(
  mappings: Mapping<T>[]
): {
  newID: (type: T) => ID; // generates a new ID for the given entity type
  isID: (id: ID) => boolean; // checks if the given string is a valid Modern ID format
  toType: (id: ID) => T; // extracts the entity type from the ID (useful in app logic)
  toPrefix: (type: T) => string; // looks up the prefix for the entity type
};

// Example usage
type AppEntityType = "workspace" | "message" | "event";

// Type-safe validation that all entities are mapped uniquely
const { newID, isID, toType, toPrefix } = configure<AppEntityType>([
  { prefix: "w", type: "workspace", size: 4 }, // low volume
  { prefix: "m", type: "message", size: 8 }, // mid volume
  { prefix: "e", type: "event", size: 32 }, // high volume, must always be unique
]);

const workspaceID = newID("workspace"); // w_tuy5
const messageID = newID("message"); // m_EtjrjVz6
const eventID = newID("event"); // e_QQBjNml7tuR7U8vaJTucC6LkPTsg8bzx
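
The utility above is only a proposed signature. One possible way to implement it is sketched below, under several assumptions that are not part of the spec: uniqueness of types and prefixes is checked at runtime rather than at compile time, `isID` also requires the prefix to be a configured one, suffixes are drawn from an alphabet without look-alike characters using Node's `crypto.randomInt`, and 32-character suffixes are generated the same way instead of falling back to a UUID generator.

```typescript
import { randomInt } from "node:crypto";

type Mapping<T> = { type: T; prefix: string; size?: number };
type ID = string;

// Assumption: look-alike characters (0/O, 1/I/l) are left out of the alphabet.
const ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";
const ID_PATTERN = /^([a-z]{1,2})_[a-zA-Z0-9]{4,32}$/;
const DEFAULT_SIZE = 8;

function configure<T extends string>(mappings: Mapping<T>[]) {
  const byType = new Map<T, Mapping<T>>();
  const byPrefix = new Map<string, T>();
  for (const mapping of mappings) {
    // Reject duplicate entity types or prefixes as soon as the app configures its IDs
    if (byType.has(mapping.type) || byPrefix.has(mapping.prefix)) {
      throw new Error(`duplicate mapping: ${mapping.type} / ${mapping.prefix}`);
    }
    byType.set(mapping.type, mapping);
    byPrefix.set(mapping.prefix, mapping.type);
  }

  const mappingFor = (type: T): Mapping<T> => {
    const mapping = byType.get(type);
    if (mapping === undefined) throw new Error(`no mapping for type: ${type}`);
    return mapping;
  };

  // Entity type for a well-formed ID with a configured prefix, otherwise undefined
  const typeOf = (id: ID): T | undefined => {
    const match = ID_PATTERN.exec(id);
    return match ? byPrefix.get(match[1] ?? "") : undefined;
  };

  return {
    newID(type: T): ID {
      const { prefix, size = DEFAULT_SIZE } = mappingFor(type);
      let suffix = "";
      for (let i = 0; i < size; i++) {
        suffix += ALPHABET.charAt(randomInt(ALPHABET.length));
      }
      return `${prefix}_${suffix}`;
    },
    isID: (id: ID): boolean => typeOf(id) !== undefined,
    toType(id: ID): T {
      const type = typeOf(id);
      if (type === undefined) throw new Error(`not a known Modern ID: ${id}`);
      return type;
    },
    toPrefix: (type: T): string => mappingFor(type).prefix,
  };
}

const ids = configure<"workspace" | "message">([
  { prefix: "w", type: "workspace", size: 4 },
  { prefix: "m", type: "message" }, // falls back to the default size of 8
]);
const exampleID = ids.newID("workspace"); // something like w_tuy5
```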

Proposed Postgres Function

Optionally, ID generation can also live in the database layer for convenience. (Not required.)

CREATE FUNCTION modern_id(p_prefix TEXT, p_length INT)
RETURNS TEXT AS $$
...
$$ LANGUAGE plpgsql;

-- Example usage
CREATE TABLE workspace ( id TEXT PRIMARY KEY DEFAULT modern_id('w', 4), name TEXT);
CREATE TABLE messages ( id TEXT PRIMARY KEY DEFAULT modern_id('m', 16), name TEXT);
CREATE TABLE events ( id TEXT PRIMARY KEY DEFAULT modern_id('e', 32), name TEXT);

Handling Conflicts

To choose the correct suffix size you can use the great collision calculator by alex7kom: https://alex7kom.github.io/nano-nanoid-cc

Using the above, most of your tables can get away with 8-character suffixes.

The dream case would be to start every table at 4 characters and then automate the bumping of suffix sizes.
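
For a rough sense of the numbers without the calculator, the standard birthday-bound approximation can be coded in a few lines. This is a sketch, not the calculator's own method: it estimates the probability of at least one collision among `rows` random IDs as 1 - exp(-n(n-1) / 2N), where N is the number of possible suffixes, and the function names are made up for this example.

```typescript
// Birthday-bound approximation of the chance that at least two of `rows`
// random suffixes collide, with N = alphabetSize^length possible suffixes.
function collisionProbability(rows: number, alphabetSize: number, length: number): number {
  const space = Math.pow(alphabetSize, length);
  // -expm1(x) computes 1 - e^x without losing precision for tiny probabilities
  return -Math.expm1((-rows * (rows - 1)) / (2 * space));
}

// Smallest suffix length (4-32) that keeps the collision probability at or
// below `maxProbability` for the expected number of rows.
function minSuffixLength(rows: number, alphabetSize: number, maxProbability: number): number {
  for (let length = 4; length <= 32; length++) {
    if (collisionProbability(rows, alphabetSize, length) <= maxProbability) return length;
  }
  throw new RangeError("no suffix length from 4 to 32 meets the target");
}

// 1,000 rows over the full 62-character alphabet, at most a 1% chance of any collision
const suggested = minSuffixLength(1000, 62, 0.01); // 5
```

An automated bump could run a check like this against the current row count and raise the configured suffix length once the estimate crosses a threshold you pick.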

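There are multiple ways to handle the eventual conflicts for low-volume tables, depending on how short you want to keep your IDs.

Database layer – Use INSERT ... ON CONFLICT DO NOTHING for all creates and retry with a new ID when no row was inserted (an UPSERT that updates on conflict would overwrite the existing row)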
Server layer – Catch ID conflicts and retry up to 3 times with a new ID before failing to client.
Client layer – Catch ID conflicts and retry up to 3 times with a new ID before failing.
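
The server- and client-layer options share the same shape, which can be sketched as a small retry wrapper. Everything here is an assumption for illustration: `createWithRetry` is a hypothetical helper, and the caller supplies both the insert function and an `isConflict` predicate that recognizes its own driver's duplicate-key error.

```typescript
// Tries an insert with a fresh ID, retrying on ID conflicts up to `maxRetries`
// times before giving up. Any other error is rethrown immediately.
async function createWithRetry<Row>(
  generateID: () => string,
  insert: (id: string) => Promise<Row>,
  isConflict: (error: unknown) => boolean,
  maxRetries = 3
): Promise<Row> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await insert(generateID());
    } catch (error) {
      // Out of retries, or not an ID conflict: surface the error to the caller
      if (!isConflict(error) || attempt >= maxRetries) throw error;
    }
  }
}

// Example with an in-memory "table" standing in for a real database
const takenIDs = new Set<string>(["w_aaaa"]);
const candidates = ["w_aaaa", "w_aaaa", "w_bbbb"];

const created = createWithRetry(
  () => candidates.shift() ?? "w_zzzz",
  async (id) => {
    if (takenIDs.has(id)) throw new Error("conflict");
    takenIDs.add(id);
    return id;
  },
  (error) => error instanceof Error && error.message === "conflict"
);
// `created` resolves to "w_bbbb" after two conflicting attempts
```

With a real database, the predicate would check for the driver's unique-violation error instead of a message string.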

Regardless, this is a solved problem and not that hard.

Closing Comments

Was debating calling them HumanIDs but that already seems to be a thing. Thoughts?

If this picks up steam, I will publish TypeScript packages for this.

Would love to hear your thoughts and feedback on this spec.

If interested, follow my journey as I build something big: https://sebastiankade.substack.com/

sebastiankade/modern-id-spec.md
