Hacker News
Adding type safety to object IDs in TypeScript (kravchyk.com)
144 points by mckravchyk on Jan 30, 2024 | 58 comments


This is pretty close to type branding (newtype wrapping for the Haskell-inclined), though using template literal types is pretty novel. Normal brands look something like this:

    type Brand<BaseType, Brand> = BaseType & { readonly __brand__: Brand };
    type FooId = Brand<string, 'FooId'>;
    function fooBar(asdf: FooId | 'foobar'): void { }
fooBar will only accept the literal string 'foobar' or a true FooId, but not an arbitrary string. A FooId would then come from a function that validates strings as FooIds, or from some other part of the app that is an authoritative source for them. Brands extend their BaseType, so they can be used anywhere their BaseType is used, but not the inverse.
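
A minimal sketch of such a validating function, for illustration. The `parseFooId` name and the "foo-<digits>" format are assumptions, not anything from the article, and `fooBar` returns a string here only so the result can be inspected:

```typescript
type Brand<BaseType, Name> = BaseType & { readonly __brand__: Name };
type FooId = Brand<string, 'FooId'>;

// Hypothetical validator: the "foo-<digits>" rule is an assumption.
function parseFooId(raw: string): FooId {
  if (!/^foo-\d+$/.test(raw)) {
    throw new Error(`Not a FooId: '${raw}'`);
  }
  // The single place where a plain string is asserted to be a FooId.
  return raw as FooId;
}

function fooBar(asdf: FooId | 'foobar'): string {
  return `got ${asdf}`;
}

const validId = parseFooId('foo-42');
const viaId = fooBar(validId);       // OK: a validated FooId
const viaLiteral = fooBar('foobar'); // OK: the allowed literal
// fooBar('foo-42');                 // rejected by the compiler: plain string

// A brand can still be used wherever its base type is expected.
const shouted: string = validId.toUpperCase();
```

Keeping the `as FooId` assertion inside one function means every other part of the code has to go through the validation to obtain a FooId.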


If you want to make this easier to keep private between encapsulation boundaries, the additional suggestion is to make sure the Brand type extends symbol:

    type Brand<BaseType, Brand extends symbol> = BaseType & { readonly __brand__: Brand };
    const FooIdBrand = Symbol('FooId');
    type FooId = Brand<string, typeof FooIdBrand>;
    function fooBar(asdf: FooId | 'foobar'): void { }
With a private shared symbol, your authoritative validation/sources can share the brand symbol, and no one else can create one without going through your validation. Private symbol brands used this way are the closest TypeScript gets to "nominal types".


Unfortunately this doesn’t work, at least not from a type safety perspective, because even without access to the symbol, nothing stops anyone from doing `let myFooId = 'foo' as any as FooId;`. You could detect this at runtime, but type safety is compile time.


Sure, the TS type system is not sound but the idea is not to stop "bad guys", it's to help you realize you are doing something unintended.


This is very true, but "helping you realize you are doing something unintended" works just as well with a string as with a symbol.


Agreed, for instance in our codebase we just make all type assertions a lint error demanding a justification, as well as flat out banning the any type. But anyone is free to write shoddy TypeScript.


Right, hence "closest to" in my description. Typescript's role ends at compile time and it can't/won't stop bad actors at runtime. Typescript tries to make it easier for good actors to do the right thing more of the time.

That said, the other benefit to using private symbols like this is that they are also easy to enforce at runtime, because symbol visibility is enforced at runtime (you can't create the same symbol by hand somewhere else). It can be as easy as something like:

    console.assert(id.__brand__ === FooIdBrand)
(That still won't stop the determined hacker in the console dev tools: if they can see a symbol, they can create a reference to it. Defense in depth will always be a thing.)


The easiest way I know of is

    declare const isMyID: unique symbol;
    export type MyID = string & { [isMyID]: true };


nice!


It is also convenient to use a unique symbol for the brand (declare const brand: unique symbol). Then we can combine multiple brands in the same type, and if we don't export that symbol type, we simply don't have a way to access the brand property at runtime.
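
One possible reading of "combine multiple brands", as a rough sketch. The `Branded` helper, the `Trusted` brand and the function names are all made up for illustration:

```typescript
// Type-only symbol: `declare` emits no runtime code at all.
declare const brand: unique symbol;

// Each brand contributes one key under the shared symbol property,
// so intersecting two brands merges the keys instead of conflicting.
type Branded<Base, Name extends string> =
  Base & { readonly [brand]: { [K in Name]: true } };

type UserId = Branded<string, 'UserId'>;
type Trusted = Branded<string, 'Trusted'>;
type TrustedUserId = UserId & Trusted;

// Hypothetical helpers; real code would validate before asserting.
const asUserId = (raw: string): UserId => raw as UserId;
const markTrusted = <T extends string>(value: T): T & Trusted =>
  value as T & Trusted;

function deleteAccount(id: TrustedUserId): string {
  return `deleted ${id}`;
}

const plainId = asUserId('u1');
const trustedId = markTrusted(plainId);
const outcome = deleteAccount(trustedId);
// deleteAccount(plainId); // would not compile: the 'Trusted' brand is missing
```

At runtime `trustedId` is still an ordinary string; the brands exist only in the type checker.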


The meaning is different though. Brands convey intent, UserId brand would allow only other UserId brands, but with string literal types "any" type that matches 'user_${string}' will do
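
A small sketch of that difference, with hypothetical names throughout (`loadByPrefix`, `loadByBrand`, `issueUserId`):

```typescript
type PrefixedUserId = `user_${string}`;
type BrandedUserId = string & { readonly __brand__: 'UserId' };

const loadByPrefix = (id: PrefixedUserId): string => `prefix:${id}`;
const loadByBrand = (id: BrandedUserId): string => `brand:${id}`;

// Any literal with the right shape is accepted, wherever it came from.
const fromLiteral = loadByPrefix('user_made_up_on_the_spot');

// The brand needs an explicit issuing step; a bare literal is rejected.
// loadByBrand('user_made_up_on_the_spot'); // would not compile
const issueUserId = (raw: string): BrandedUserId => raw as BrandedUserId;
const fromIssuer = loadByBrand(issueUserId('user_123'));
```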


In my experience branded types are relatively more fragile than normal types though. IIRC they behaved badly with `infer` types in particular, and it was quite hard to work around. This solution seems more versatile. (Of course, I want to see built-in branded type support in TS as well.)


This solution only works with strings, whereas branded types can be used with numbers as well, or any kind of object that you want to add stricter types to without modifying the runtime value.
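
For example, a sketch of brands over numeric IDs (the `Brand` helper and the function names are assumptions for illustration):

```typescript
type Brand<Base, Name> = Base & { readonly __brand__: Name };
type UserId = Brand<number, 'UserId'>;
type OrderId = Brand<number, 'OrderId'>;

// Hypothetical constructors; the runtime value stays a plain number.
const userId = (n: number): UserId => n as UserId;
const orderId = (n: number): OrderId => n as OrderId;

function cancelOrder(id: OrderId): string {
  return `cancelled order ${id}`;
}

const cancelled = cancelOrder(orderId(7));
// cancelOrder(userId(7)); // would not compile: a UserId is not an OrderId
// cancelOrder(7);         // would not compile: a bare number has no brand

// Arithmetic still works, but the result is an unbranded number again.
const following: number = orderId(7) + 1;
```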

I haven't observed any issues with branded types and infer—is there documentation somewhere about the problem?


As others pointed out, TypeScript sometimes treats `string & object` or similar as an impossible type and can turn it into `never` at any time. I don't exactly recall whether `infer` triggered that or it was a separate issue, but that was a major problem in my experience.


FWIW I’ve been using branded types for years and never had this issue.


I prefer the Brand solution, it works well for existing ID sets that you can't easily migrate to have an actual string prefix


Also for values that aren't strings at all.


If you want a type-prefixed UUIDv7 type, I can wholeheartedly recommend TypeID-JS: https://github.com/jetpack-io/typeid-js

Also available for a whole bunch of other languages: https://github.com/jetpack-io/typeid

UUIDv7 is UUIDv4-compatible (i.e. you can put a v7 UUID anywhere a v4 UUID would go, like in Postgres's UUID datatype) and is time-series sortable, so you don't lose that nice lil' benefit of auto-incrementing IDs.

And if you use something like TypeORM to define your entities, you can use a Transformer to save as plain UUIDv7 in the DB (so you can use UUID datatypes, not strings), but deal with them as type-prefixed strings everywhere else:

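    // Imports assumed from the names used below (typeid-js, TypeORM, Node's assert).
    import assert from 'node:assert';
    import { TypeID, typeid } from 'typeid-js';
    import { BeforeInsert, Entity, PrimaryColumn, ValueTransformer } from 'typeorm';

    export const TYPEID_USER = 'user';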

    export type UserTypeID = TypeID<'user'>;
    
    export type UserTypeString = `user_${string}`;
    
    export class UserIdTransformer implements ValueTransformer {
      from(uuid: string): UserTypeID {
        return TypeID.fromUUID(TYPEID_USER, uuid);
      }
    
      to(tid: UserTypeID): string {
        assert.equal(
          tid.getType(),
          TYPEID_USER,
          `Invalid user ID: '${tid.toString()}'.`,
        );
    
        return tid.toUUID();
      }
    }
    
    @Entity()
    export class User {
      @PrimaryColumn({
        type: 'uuid',
        primaryKeyConstraintName: 'user_pkey',
        transformer: new UserIdTransformer(),
      })
      id: UserTypeID;
    
      @BeforeInsert()
      createNewPrimaryKey() {
        this.id = typeid(TYPEID_USER);
      }
    }


It's a bit sad that startsWith doesn't narrow the type, making this pattern slightly less convenient. The GH issue: https://github.com/microsoft/TypeScript/issues/46958


I wish that TS had better type narrowing for the JS standard library, though there's a lot of constraints and design limitations that make it impractical. I ran into a similar issue with the some() method on Array not narrowing types a while back [1]; that issue links to the same sort of issue with filter(), as well as issues where the TS team has discussed what they can and can't do in control flow analysis.

[1] https://github.com/microsoft/TypeScript/issues/40844


you can improve it a bit with the library ts-reset

https://github.com/total-typescript/ts-reset


You can also declare your own wrappers to at least achieve it for your own codebase.


What is the problem with the workaround suggested in the last comment there?


Define a type guard using a "type predicate"[1].

For example:

    type UserId = `user_${string}`;
    type GroupId = `group_${string}`;

    const addUserId = (id: UserId) => {
      // do something
    }

    const processId = (id: string) => {
      if (id.startsWith('user_')) {
        // type error here:
        addUserId(id);
      } else if (otherCondition) {
        // do other things
      }
    }
Instead you define a function:

    const isUserId = (some: string): some is UserId => some.startsWith('user_');
Now you can use it as follows:

    const processId = (id: string) => {
      if (isUserId(id)) {
        // no more error:
        addUserId(id);
      } else if (otherCondition) {
        // do other things
      }
    }
[1]: https://www.typescriptlang.org/docs/handbook/2/narrowing.htm...


You shouldn't need to write this kind of thing manually for every such type.


    function is<T extends string>(value: string, prefix: T): value is `${typeof prefix}_${string}` {
      return value.startsWith(`${prefix}_`)
    }

You can now do `is(id, 'user')`.

If you do that often you probably want to create separate functions, e.g.:

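    function isFactory<T extends string>(prefix: T) {
      // Explicit predicate: without it the arrow may be inferred as returning plain boolean and not narrow
      return (value: string): value is `${T}_${string}` => is(value, prefix)
    }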

    const isUser = isFactory('user')
    const isOrder = isFactory('order')
Not too bad.


Stripe does this (prefix ids with object type). It's smart. Makes it much easier to work with the ids.


Agreed, I can look at any ID and know what type it is. Even with well named fields it helps a ton in the docs “oh, they pass an account in or a payment ID”. Even in my own DB if I reference the stripe ID I know what it is without even having to look at the column name.


I’ve done this in multiple languages. I dislike libraries that return string ids.

The proliferation of string identifiers is a pet peeve of mine. It’s what I call “stringly typed” code (not my coinage but I use it all the time).


"Stringly typed", the way I've heard it, is a valid criticism when people replace type safety with magic strings which may or may not be checked at runtime but certainly not at compile time.

However, that's not the case when it comes to Typescript, because literal and union string types are actually checked at compile time. So what is the problem?


If everything is a string, you can accidentally use a UserID as a PostID.


Exactly. Especially easy if the variable name is “id”. For API methods which take multiple ids, the order is easy to mix up - though in TS that is conventionally handled with options objects.


But this is exactly what the article is about? I don't know what you mean by "option object" but it doesn't sound any more conventional than union types to me.


What I meant by option objects is a dictionary object holding all parameters.

    doStuff({userId: foo, itemId: bar});

This allows the order of key/val pairs to move around, making it more robust to mistakes than doStuff: (string,string) => void.
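
A sketch of the two styles side by side, combined with the prefixed ID types from the article. The `doStuffPositional` name and the `item_` prefix are invented for the example:

```typescript
type UserId = `user_${string}`;
type ItemId = `item_${string}`;

// Positional plain strings: swapping the arguments still compiles.
const doStuffPositional = (userId: string, itemId: string): string =>
  `${userId} owns ${itemId}`;

// Options object with prefixed types: key order is free, values are checked.
const doStuff = (opts: { userId: UserId; itemId: ItemId }): string =>
  `${opts.userId} owns ${opts.itemId}`;

const foo: UserId = 'user_1';
const bar: ItemId = 'item_2';

const swapped = doStuffPositional(bar, foo);         // compiles, silently wrong
const keyed = doStuff({ itemId: bar, userId: foo }); // order of keys is irrelevant
// doStuff({ userId: bar, itemId: foo });            // would not compile
```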


What’s your aversion to string ids?

Personally I love them and prefer them in all cases. They aren’t enumerable, never get confused for “is this an array or a map by ID” in PHP, can be used safely as keys without some languages (looking at you PHP) returning an array instead of an object (assoc. array) when converting to JSON, don’t need to be converted back to a number after passing through something like a URL/get param, are less likely to have overlap with keys from other types (even more so if you prefix the key with a type identifier), no need to know that last ID used in the DB so you can build your key in app code instead of the DB, and I’m sure I have more things I like about them.

I understand auto-inc can have some performance gains in the DB but I’ve never needed the gains more than I wanted sane (in my mind) ids.

For the longest time I used UUID (v4) and I still do sometimes but lately I’ve liked KSUID since they are sortable by create date (great for things like DynamoDB IMHO).


Identifiers belong to a domain. A userid belongs to a member of a set of Users, a groupid to Groups, and so on. You are happy to have a User object with a distinct type, and wouldn’t try to have a superset UserOrGroup class.

A string belongs to the domain of "all strings", so the type system and compiler cannot catch something like authorizeUser(id, …) where the id is in fact "" or a groupid or "null" or "undefined".

Lots of code does what you describe. But I prefer to use the type system wherever I can.


I love this approach but I augment it with Zod branded types. IDs have a known start to their identifier, and anything coming in and out of the database is verified to match a schema.


Type-prefixed IDs are the way to go. For completeness it's worth noting that the first example using the `string | 'currentNode'` type can be slightly improved in cases where you _do_ want autocomplete for known-good values but are still OK with accepting arbitrary string values:

  type Target = 'currentNode' | (string & {});
  const targets: Target[] = [
    'currentNode', // you get autocomplete hints for this!
    'somethingElse', // no autocomplete here, but it typechecks
  ];
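
For context, a short sketch of why the `& {}` is there. A plain `'currentNode' | string` union is simplified to `string`, so the literal is lost for editor hints; the intersection keeps the literal member while still accepting any string. The `resolveTarget` function is hypothetical:

```typescript
type Collapsed = 'currentNode' | string;     // simplified to plain string
type Target = 'currentNode' | (string & {}); // literal member is preserved

// Hypothetical consumer of a Target value.
function resolveTarget(target: Target): string {
  return target === 'currentNode' ? 'the current node' : `node ${target}`;
}

const anything: Collapsed = 'whatever';
const fromElsewhere: string = 'n42';

const special = resolveTarget('currentNode');
const generic = resolveTarget(fromElsewhere); // a plain string needs no cast
```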


Is (string & {}) semantically intuitive or is it kind of a hack? I don’t understand what it’s supposed to mean.


It's a useful hack. In JavaScript, there is no value that's both a string and an object. At runtime, it will just be a string. You can use it like a string and it will type-check, because it's a string plus some extra compile-time baggage, sort of like you subclassed the string type. ('&' is a subtype operation.)

When converting something to this type, it will fail unless you cast it, but it's a compile-time cast. At runtime, there's no conversion.

This is essentially "lying" to the type checker in order to extend it.


I've done something similar for URLs (stops you mixing up whole URLs, substrings of URLs and regular strings), relative vs absolute time (easy to mix these up when there's several of these around and you're subtracting/adding to create new times) and color spaces (stops you mixing up tuples of RGB and HSL values). Feels very worthwhile for object IDs as well as there's always other variables around you could get them mixed up with.
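
As a rough illustration of the relative vs absolute time case, using branded numbers. All names here (`Timestamp`, `Duration`, `addDuration`, `elapsed`) are assumptions, not taken from any particular codebase:

```typescript
type Brand<Base, Name> = Base & { readonly __brand__: Name };
type Timestamp = Brand<number, 'Timestamp'>; // absolute: ms since epoch
type Duration = Brand<number, 'Duration'>;   // relative: ms

const timestamp = (ms: number): Timestamp => ms as Timestamp;
const duration = (ms: number): Duration => ms as Duration;

// Only the meaningful combinations are exposed as functions.
const addDuration = (t: Timestamp, d: Duration): Timestamp =>
  timestamp(t + d);
const elapsed = (later: Timestamp, earlier: Timestamp): Duration =>
  duration(later - earlier);

const start = timestamp(1_000);
const end = addDuration(start, duration(500));
const took = elapsed(end, start);
// addDuration(start, end); // would not compile: a Timestamp is not a Duration
```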


I like using "resource names" defined by Google's AIP (https://google.aip.dev/122). For example, the name "users/1/projects/42" is a nested project resource of a user "users/1". TypeScript type could be "users/${number}".
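
A sketch of how nested resource names could be typed this way. The helper names and the digits-only runtime rule are assumptions; as far as I know the `${number}` placeholder accepts any numeric-looking text (e.g. `1.5`), so a runtime check is still worth having:

```typescript
type UserName = `users/${number}`;
type ProjectName = `${UserName}/projects/${number}`;

// Runtime guard; the digits-only rule is an assumption and is stricter
// than what the template literal type itself enforces.
function isProjectName(value: string): value is ProjectName {
  return /^users\/\d+\/projects\/\d+$/.test(value);
}

// Everything before "/projects/" is the parent user resource.
function ownerOf(project: ProjectName): UserName {
  return project.slice(0, project.indexOf('/projects/')) as UserName;
}

const project: ProjectName = 'users/1/projects/42';
const owner = ownerOf(project);
```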


Because they use slashes, those are kinda annoying to use in web UIs.

Say, /view/users/jdoe/foo -- is that foo a resource, or a URL my web framework can use to e.g. fetch data & components SPA style?

With a flat /view/user_jdoe/foo, you don't have that source of confusion.


Not sure if it’s a valid point, but what I would like to have is a kind of regex or template for strings or numbers. Otherwise, it’s still just a string or a specific value. It’s not like you are free to update the backend to prefix IDs to your liking. Most of the time you have to work with set schemas.


API models like Smithy can do that.


I did not know about type branding, but it would also be possible to just use casting if you don't want the prefix at runtime:

  type UserId = `usr_${string}`
  const user = { id: 'bgy5D4eL' as unknown as UserId }
Casting would just need to be applied wherever the object is generated, retrieving from the database requires casting either way. It could be a footgun though, if someone working on the codebase thought that the prefix is actually there and decided to use it for a runtime check.

I wanted to add this to the article, but decided not to, since I think having the prefix at runtime is just as useful - wherever the ID occurs, i.e. in error log the type of the object is always clear. But that or type branding is something that is much easier to apply in an existing system indeed.
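
To spell out the footgun mentioned above, a tiny sketch (the `looksLikeUser` check is hypothetical):

```typescript
type UserId = `usr_${string}`;

// The cast makes the compiler believe the prefix exists;
// the runtime value has no prefix at all.
const user = { id: 'bgy5D4eL' as unknown as UserId };

// Someone trusting the static type might write a runtime check like this.
const looksLikeUser = (id: string): boolean => id.startsWith('usr_');

const detected = looksLikeUser(user.id); // false at runtime
```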

Btw. I submitted this on Monday 6 AM EST and now it is visible as submitted 19h ago? I totally did not expect it to make it far, let alone /front, when it initially did not get any upvotes. I'm curious how it works :)



Has anyone tried using custom types for ids in Java?

I considered doing it on a recent project, but it doesn't seem very common so I was reluctant to introduce it.


I've done it in Kotlin, and I suspect that modern Java should be quite amenable to it with records.

It was really nice in my Kotlin project because we were dealing with legacy data structures with very confusing names—being able to guarantee that a UserID doesn't accidentally get passed where a UserDataID was expected helped prevent a lot of the bugs that plagued the legacy apps.


That's great to hear. We did the same, but in C#, using its records. The codebase didn't exactly suffer from errors from ID misuse (all of which were the same type beforehand), but it's great for future-proofing as well.

Added benefit, as always when leaning into the type system more, is a reduction in the number of unit tests required. The need to test that `update_user(group_id)` fails (because a non-user ID was passed) simply disappears.


In Scala it worked great using AnyVal wrappers around primitive types. It’s something I miss in TypeScript, where type aliases are more for documentation purposes on id types but don’t add much type safety. I think the trick is that the type needs value semantics, which records should help with.


I’ve done it. There’s little cost, but it does feel unidiomatic in Java for some reason.

Not sure whether I’d do it again or not. It’s hard to say how much benefit that I’m getting out of them.


I wonder if a similar (but maybe more bloated) implementation using interfaces (and probably generics too?) will work in this case.


Something like this, maybe?

https://www.typescriptlang.org/play?#code/C4TwDgpgBAkgIlAvFA...

Alternatively, with generics: https://www.typescriptlang.org/play?#code/JYOwLgpgTgZghgYwgA...

I don't think these are better, to be honest. The string types suffice and are easier to exchange with servers and other APIs.


Reminds me of Slack IDs:

- Channel IDs always start with C

- User IDs always start with U


Stripe also follows similar conventions - very convenient.


Nice!


Typescript is a work of art created on a barren canvas. Kudos to the team behind this language.



