An adaptable, human-friendly, web-safe, unique ID spec for modern applications.
- Short
- Human friendly
- URL-safe
- Developer experience
- Realistically unique
UUIDs force you into using the "worst-case" ID for all your tables:
- You only need high-entropy IDs for sensitive data or large-volume tables. (Most of your tables are low volume)
- Hard to copy paste (not human friendly)
- Impossible to read out (not human friendly)
- Bad for debugging (currently still done by humans)
- Make URLs ugly af
- Complete overkill for most use cases
- Saving 16 bytes per row is not worth all of the grossness.
- Really designed for distributed systems, not web apps.
Sequential integers have their own problems:
- always guessable
- bad for scaling databases horizontally
- can't be created client-side
- can't be used for high-volume tables
- it's not the 2000s anymore
Traditionally we've chosen between sequential integers and (G)UUIDs for identifiers.
All credit goes to Stripe for pioneering this ID style, but it hasn't gained enough momentum, so here I am.
Modern IDs:
- Are in the {prefix}_{suffix} format (see below)
- Contain a short prefix that maps to a table/entity (yes, this is stored in the database, see Pros/Cons)
- Forgo marginal disk-space savings for developer experience, user friendliness, and enhanced product design patterns
- Can still guarantee uniqueness for large tables (see below)
{prefix}_{suffix}
- Prefix: 1-2 lowercase alphabet characters, maps to the table/entity
- Suffix: 4-32 alphanumeric characters
Format: [a-z]{1,2}_[a-zA-Z0-9]{4,32}
Examples: m_CWZpkWfq, t_2rw2FzZB, or u_9d2F
A max suffix length of 32 is chosen because at that point we can switch to using hyphen-less UUIDs for the suffix.
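The format above translates directly into a regular expression check. A minimal sketch, where the helper names isModernID and parseModernID are made up for illustration and are not part of the spec:

```typescript
// Pattern from the spec: 1-2 lowercase letters, an underscore, then 4-32 alphanumerics.
const MODERN_ID_PATTERN = /^[a-z]{1,2}_[a-zA-Z0-9]{4,32}$/;

// Hypothetical helper: true when `value` matches the {prefix}_{suffix} format.
function isModernID(value: string): boolean {
  return MODERN_ID_PATTERN.test(value);
}

// Hypothetical helper: splits a valid ID into prefix and suffix, or returns null.
function parseModernID(value: string): { prefix: string; suffix: string } | null {
  if (!isModernID(value)) return null;
  const separator = value.indexOf("_");
  return {
    prefix: value.slice(0, separator),
    suffix: value.slice(separator + 1),
  };
}
```

This only checks the shape of the string. Whether the prefix maps to a real table is a separate lookup against your table -> prefix mapping.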
- We will have a mapping of table -> prefix and suffix length
- Each table should have a suffix length chosen for its volume and sensitivity.
- Low-volume tables can go as low as 4-character suffixes.
- High-volume/sensitive tables should use 32-character suffixes (essentially prefixed UUIDs).
- Table prefixes should never change.
- Table suffix size can grow independently as the table grows. Increasing the size gives you a whole new set of IDs, since sets of different lengths never conflict (e.g. [A-Z]{4} !== [A-Z]{5}).
- Since uniqueness is not guaranteed, retries should be handled in the database, server, or client, depending on your use case.
- When uniqueness is required (one-time event firing), use 32-character suffixes.
- Easy-to-confuse characters like "O" should be omitted from suffix generators (see NanoID).
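A minimal sketch of a suffix generator that follows the rules above. The alphabet (which drops 0, O, o, 1, l and I), the helper names randomSuffix and newModernID, and the injectable rand parameter are all assumptions for illustration, not part of the spec. Math.random is not cryptographically secure, so sensitive tables would want a secure random source passed in instead.

```typescript
// Assumed alphabet: alphanumerics minus the look-alikes 0, O, o, 1, l and I (56 symbols).
const SAFE_ALPHABET = "23456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnpqrstuvwxyz";

// Builds a random suffix of `size` characters. `rand` must return a number in [0, 1).
function randomSuffix(size: number, rand: () => number = Math.random): string {
  if (!Number.isInteger(size) || size < 4 || size > 32) {
    throw new RangeError("suffix size must be an integer between 4 and 32");
  }
  let suffix = "";
  for (let i = 0; i < size; i++) {
    suffix += SAFE_ALPHABET.charAt(Math.floor(rand() * SAFE_ALPHABET.length));
  }
  return suffix;
}

// Joins a table prefix and a fresh suffix into a {prefix}_{suffix} ID.
function newModernID(prefix: string, size: number, rand: () => number = Math.random): string {
  if (!/^[a-z]{1,2}$/.test(prefix)) {
    throw new RangeError("prefix must be 1-2 lowercase letters");
  }
  return `${prefix}_${randomSuffix(size, rand)}`;
}
```

Dropping look-alikes shrinks the alphabet from 62 to 56 symbols, so each suffix length has somewhat fewer possible IDs. Factor that in when picking a size.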
- Beautiful URLs out of the box (e.g. https://app.com/m_CWZpkWfq)
- ID sizes are chosen for each table and hence can be kept as short as possible (human friendly)
- Looking at an ID tells you what you are looking at and where to find it (useful for apps and humans alike)
- Web app routing paths no longer need extra path segments (e.g. https://app.com/workspace/w_tuy5 -> https://app.com/w_tuy5)
- Web apps can support mobile-style routing by pushing another ID onto the URL while maintaining the stack (e.g. https://app.com/w_tuy5 -> https://app.com/w_tuy5/m_rDyI0yXjt)
- Growing IDs over time gives you a bigger set of possible IDs while keeping them short: all possible values of 4-, 5-, and 6-character suffixes combined (e.g. t_abcd, t_abcde, t_abcdef)
- Can still guarantee uniqueness with a 32-character suffix (fall back to UUID generators for the suffix internally)
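To put numbers on the growing-suffix point, here is a small sketch that counts the available IDs. It assumes the full 62-symbol [a-zA-Z0-9] alphabet; keyspace and cumulativeKeyspace are hypothetical helper names. Results are exact only up to 8-character suffixes, since larger counts exceed what a JavaScript number can hold precisely.

```typescript
// Number of distinct suffixes of exactly `size` characters.
function keyspace(size: number, alphabetSize = 62): number {
  let count = 1;
  for (let i = 0; i < size; i++) count *= alphabetSize;
  return count;
}

// Total IDs available to a table that has grown from `from`- to `to`-character suffixes.
function cumulativeKeyspace(from: number, to: number, alphabetSize = 62): number {
  let total = 0;
  for (let size = from; size <= to; size++) total += keyspace(size, alphabetSize);
  return total;
}

console.log(keyspace(4)); // 14776336
console.log(cumulativeKeyspace(4, 6)); // 57731144752
```

The older, shorter IDs stay valid after a bump because a 4-character suffix can never equal a 5-character one.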
// A Modern ID is a plain string in the {prefix}_{suffix} format
type ID = string;
// A single Modern ID definition
type Mapping<T> = {
  type: T;
  prefix: string;
  size?: number; // defaults to 8, a good middle ground
};
// Type-safe configuration of your tables -> prefixes with optional suffix sizes
export declare function configure<T extends string>(
  mappings: Mapping<T>[]
): {
  newID: (type: T) => string; // generates a new ID for the given entity type
  isID: (id: ID) => boolean; // checks if the given string is a valid ModernID format
  toType: (id: ID) => T; // extracts the entity type from the ID (useful in app logic)
  toPrefix: (type: T) => string; // returns the prefix for the entity type
};
// Example usage
type AppEntityType = "workspace" | "message" | "event";
// Type-safe validation that all entities are mapped uniquely
const { newID, isID, toType, toPrefix } = configure<AppEntityType>([
  { prefix: "w", type: "workspace", size: 4 }, // low volume
  { prefix: "m", type: "message", size: 8 }, // mid volume
  { prefix: "e", type: "event", size: 32 }, // high volume, must always be unique
]);
const workspaceID = newID("workspace"); // w_tuy5
const messageID = newID("message"); // m_EtjrjVz6
const eventID = newID("event"); // e_QQBjNml7tuR7U8vaJTucC6LkPTsg8bzx
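The API above is only a signature. One possible body for it could look like the sketch below; this is an illustration under assumptions, not a published package. The plain 62-symbol alphabet, the default size of 8, the duplicate checks, and the use of Math.random (not cryptographically secure) are all choices made here for brevity.

```typescript
type Mapping<T extends string> = { type: T; prefix: string; size?: number };

const ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const FORMAT = /^[a-z]{1,2}_[a-zA-Z0-9]{4,32}$/;
const DEFAULT_SIZE = 8; // assumed default, matching the "good middle ground"

function configure<T extends string>(mappings: Mapping<T>[]) {
  const byType = new Map<T, { prefix: string; size: number }>();
  const byPrefix = new Map<string, T>();

  // Validate each mapping up front and reject duplicate types or prefixes.
  for (const { type, prefix, size = DEFAULT_SIZE } of mappings) {
    if (!/^[a-z]{1,2}$/.test(prefix)) throw new Error(`invalid prefix "${prefix}"`);
    if (!Number.isInteger(size) || size < 4 || size > 32) throw new Error(`invalid size for "${type}"`);
    if (byType.has(type) || byPrefix.has(prefix)) throw new Error(`duplicate mapping for "${type}"`);
    byType.set(type, { prefix, size });
    byPrefix.set(prefix, type);
  }

  const lookup = (type: T) => {
    const entry = byType.get(type);
    if (entry === undefined) throw new Error(`unmapped type "${type}"`);
    return entry;
  };
  const prefixOf = (id: string) => id.slice(0, id.indexOf("_"));

  const isID = (id: string): boolean => FORMAT.test(id);
  const toPrefix = (type: T): string => lookup(type).prefix;
  const toType = (id: string): T => {
    const type = isID(id) ? byPrefix.get(prefixOf(id)) : undefined;
    if (type === undefined) throw new Error(`not a known ID: "${id}"`);
    return type;
  };
  const newID = (type: T): string => {
    const { prefix, size } = lookup(type);
    let suffix = "";
    for (let i = 0; i < size; i++) {
      suffix += ALPHABET.charAt(Math.floor(Math.random() * ALPHABET.length));
    }
    return `${prefix}_${suffix}`;
  };

  return { newID, isID, toType, toPrefix };
}
```

Keeping both a type -> prefix and a prefix -> type map means toType and toPrefix are constant-time lookups, and a duplicate prefix fails at startup rather than at the first ambiguous ID.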
Optionally, ID generation can be added to the database layer for convenience (not required).
CREATE FUNCTION modern_id(p_prefix TEXT, p_length INT)
RETURNS TEXT AS $$
...
$$ LANGUAGE plpgsql;
-- Example usage
CREATE TABLE workspace ( id TEXT PRIMARY KEY DEFAULT modern_id('w', 4), name TEXT);
CREATE TABLE messages ( id TEXT PRIMARY KEY DEFAULT modern_id('m', 16), name TEXT);
CREATE TABLE events ( id TEXT PRIMARY KEY DEFAULT modern_id('e', 32), name TEXT);
To choose the correct suffix size you can use the great collision calculator by alex7kom: https://alex7kom.github.io/nano-nanoid-cc
Using the above, most of your tables can get away with 8-character suffixes.
The dream case would be starting them all at 4 characters and then automating the bumping of suffix sizes.
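For a rough feel of the numbers without the calculator, the standard birthday approximation can be sketched in a few lines. This is not necessarily how the linked calculator computes its results. The function names and the 62-symbol alphabet default are assumptions for illustration.

```typescript
// Probability that at least two of `rows` randomly generated suffixes collide,
// using the birthday approximation 1 - exp(-n(n-1) / 2N), where N is the keyspace.
function collisionProbability(rows: number, suffixSize: number, alphabetSize = 62): number {
  const keyspace = Math.pow(alphabetSize, suffixSize);
  return -Math.expm1((-rows * (rows - 1)) / (2 * keyspace));
}

// Probability that the next generated ID hits one of the `rows` IDs already stored.
// This is the chance that a single insert needs a retry.
function nextInsertCollisionProbability(rows: number, suffixSize: number, alphabetSize = 62): number {
  return Math.min(1, rows / Math.pow(alphabetSize, suffixSize));
}
```

Under this approximation, 10,000 rows with 4-character suffixes make at least one collision very likely (roughly 97%), while 8-character suffixes bring it well under one in a million. The per-insert number is the more useful one if you handle retries: at 10,000 rows and 4 characters, each insert still collides less than once in a thousand tries.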
There are multiple ways to handle the eventual conflicts for low-volume tables, depending on how short you want to keep your IDs.
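- Database layer – Use a conflict-aware insert for all creates (e.g. INSERT ... ON CONFLICT DO NOTHING in PostgreSQL, so an existing row is never overwritten) and retry with a new ID on conflict.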
- Server layer – Catch ID conflicts and retry up to 3 times with a new ID before failing to client.
- Client layer – Catch ID conflicts and retry up to 3 times with a new ID before failing.
Regardless, this is a solved problem and not that hard.
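The server- and client-layer options boil down to the same loop. A minimal sketch, assuming you supply the pieces that depend on your stack: generateID, insert, and isConflict are hypothetical callbacks, and how a conflict is detected depends on your database driver (for PostgreSQL, a unique violation is reported as SQLSTATE 23505).

```typescript
// Tries to insert with a fresh ID, retrying up to `maxRetries` times on ID conflicts.
// Any error that is not an ID conflict is rethrown immediately.
async function createWithRetry<Row>(
  generateID: () => string,
  insert: (id: string) => Promise<Row>,
  isConflict: (error: unknown) => boolean,
  maxRetries = 3,
): Promise<Row> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await insert(generateID());
    } catch (error) {
      // Give up on non-conflict errors, or once the retry budget is spent.
      if (!isConflict(error) || attempt >= maxRetries) throw error;
    }
  }
}
```

With the default of 3 retries, a create makes at most 4 insert attempts before the conflict error reaches the caller.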
Was debating calling them HumanIDs but that already seems to be a thing. Thoughts?
If this picks up steam, I will publish TypeScript packages for it.
Would love to hear your thoughts and feedback on this spec.
If interested, follow my journey as I build something big: https://sebastiankade.substack.com/
Why not omit the underscore?
Also, maybe an alternative version/setting that uses only [0-9], or only [0-9a-z], or only [a-z], in the suffix?