I was wondering if you've considered having an alternative, human-readable encoding (either your own syntax or a JSON-based schema)?
I find it quite useful to be able to inspect data by eye and even hand-edit payloads occasionally, and having a standard syntax for doing so would be nice.
(More generally, it's a shame JSON doesn't support sum types "natively", and I think a human-readable format with Typical's data model would be really cool.)
It's a good question! The binary format is completely inscrutable to human eyes and is not designed for manual inspection/editing. However:
1) For Rust, the generated types implement the `Debug` trait, so they can be printed in a textual format that Rust programmers are accustomed to reading.
2) For JavaScript/TypeScript, deserialized messages are simple passive data objects that can be logged or inspected in the developer console.
So it's easy to log/inspect deserialized messages in a human-readable format, but there's currently no way to read/edit encoded messages directly. In the future, we may add an option to encode messages as JSON which would match the representation currently used for decoded messages in JavaScript/TypeScript, with sums being encoded as tagged unions.
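To illustrate what "sums encoded as tagged unions" could look like on the JavaScript/TypeScript side, here is a sketch; the choice name, its cases, and the `field`/`value` property names are all invented for illustration, not a committed design:

```typescript
// Hypothetical tagged-union representation of a choice like:
//   choice SendEmailResponse { success = 0, error: String = 1 }
// The discriminant property name ("field") is an assumption for illustration.
type SendEmailResponse =
  | { field: "success" }
  | { field: "error"; value: string };

function describe(response: SendEmailResponse): string {
  // Narrowing on the discriminant gives exhaustive, type-safe handling,
  // which plain JSON cannot express without a convention like this.
  switch (response.field) {
    case "success":
      return "sent";
    case "error":
      return `failed: ${response.value}`;
  }
}

console.log(describe({ field: "success" })); // "sent"
console.log(describe({ field: "error", value: "quota exceeded" }));
```

A JSON encoding of these objects would round-trip losslessly, which is the appeal of matching the in-memory representation.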
> it's a shame JSON doesn't support sum types "natively"
You can describe sum types in JSON Schema[1], using "oneOf" and "const". I prefer the more explicit approach, though, of using "oneOf" combined with "required" to select one of a number of keys[2].
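To make the two approaches concrete, here are minimal sketches (the shape names are invented for illustration). First, a tag property constrained with "oneOf" and "const":

```json
{
  "oneOf": [
    {
      "type": "object",
      "properties": { "kind": { "const": "circle" }, "radius": { "type": "number" } },
      "required": ["kind", "radius"]
    },
    {
      "type": "object",
      "properties": { "kind": { "const": "square" }, "side": { "type": "number" } },
      "required": ["kind", "side"]
    }
  ]
}
```

And the key-selection style, where "oneOf" plus "required" picks exactly one of several keys:

```json
{
  "oneOf": [
    { "type": "object", "properties": { "circle": { "type": "object" } }, "required": ["circle"] },
    { "type": "object", "properties": { "square": { "type": "object" } }, "required": ["square"] }
  ]
}
```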
It seems like the safety rules are buggy because some assumptions are missing?
The safety rules say that adding an asymmetric field is safe, and converting asymmetric to required is safe. If you do both steps, then this implies that adding a required field is safe. But it’s not. As you say, it’s not transitive.
But lack of transitivity also means that a pull request that converts a field from asymmetric to required is not safe, in general. You need to know the history of that field. If you know that the field has always been asymmetric (unlikely) or all the older binaries are gone, then it’s safe. A reviewer can’t determine this by reading a pull request alone.
Maybe waiting until old binaries are gone is what you mean by “a single change” but it seems like that should be made explicit?
No IDL that supports required fields can offer the transitivity property you're referring to.
Typical has no notion of commits or pull requests in your codebase. The only meaningful notion of a "change" from Typical's perspective is a deployment which updates the live version of the code.
When promoting an asymmetric field to required (for example), you need to be sure the asymmetric field has been rolled out first. If you were using any other IDL framework (like Protocol Buffers) and you wanted to promote an optional field to required, you'd be faced with the same situation: you first need to make sure that the code running in production is always setting that field before you do the promotion. Typical just helps more than other IDL frameworks by making it a compile-time guarantee in the meantime.
We should be more careful about how we use the overloaded word "change", so I'm grateful you pointed this out. Another comment also helped me realize how confusing the word "update" can be.
That's reasonable but it's a different notion of what a safe change is than I remember from using protobufs. I believe they just say adding or removing a required field isn't backward compatible.
The word "safe" seems worth clarifying. There are some changes that are always safe, like reordering fields, because they are also transitively safe.
Removing an optional field is safe as long as you remember not to reuse the index. Sometimes the next unused index is informally remembered using a counter in a comment.
Other changes, like converting asymmetric to required, seem like they're in a different category where it can be done safely but requires more caution. It's more like a database migration where you control all the code that accesses the database. There is a closed-world assumption.
> That's reasonable but it's a different notion of what a safe change is than I remember from using protobufs. I believe they just say adding or removing a required field isn't backward compatible.
Safe just means the old code and the new code can coexist (e.g., during a rollout), which requires compatibility in both directions. Not just backward compatibility.
This is true for Protocol Buffers as well, except they have no safe way to introduce or remove required fields. So the common wisdom there is to not use required fields at all.
I think our misunderstanding is really about use cases.
Sometimes Protocol Buffers are used to write log files, and the log files are stored and never migrated. To read the oldest logs, you need backward compatibility all the way back to the first production code that was released. This means transitive safety is needed and the changes you can make to the schema, which is used as a file format, are pretty limited.
This isn't just a limitation of the Protocol Buffer format. Safety rules are different when you do long-term persistence. If Typical were used that way, you could only trust safety rules that are transitive. Asymmetric fields could be added, but the fallbacks never go away.
(Also, a rollback doesn't get rid of any logs that were generated, so it's not a full undo. As you say, both forward and backward compatibility are needed.)
Serialization isn't just used for network calls, and even when it is, sometimes you don't control when clients upgrade, such as when the clients get deployed by different companies, or as part of a mobile app. So it seems worth clarifying the use cases you have in mind when making safety claims.
I think you're right, and now I understand why the rules seemed buggy to you but not to me. You're considering persisted messages that need to be compatible with many versions of the schema, whereas the discussion and rules are formulated in the context of RPC messages between services, which only need to be compatible with at most three versions of the schema: the version that generated the message, the version before it, and the version after it. The README could do a better job of clarifying that.
In the persisted messages scenario, there is one change to the rules: you can never introduce a required field (since old messages might not have it). Not even asymmetric fields can be promoted to required in that scenario.
To expand on this, a way to think about it is that there are some changes that are always safe and others that depend on what data is still out there (or that’s still being generated) that you want the code to be able to read.
“What writers are out there” isn’t a property of the code alone, though maybe you could use the code to keep track of what you intended. The releases deployed to production determine which writers exist, and they keep running until stopped and perhaps upgraded.
In some cases a serialization schema might be shared in a common library among multiple applications, each with its own release schedule, making it hard to find out which writers are still out there.
It’s much easier when the serialization is only used in services where you control when releases happen and when they’re started and stopped.
Thus, asymmetric fields in choices behave like optional fields for writers and like required fields for readers—the opposite of their behavior in structs.
So if you have a schema change which adds an asymmetric field to both a struct and a choice, it seems both writers and readers need to be updated in order to successfully transmit to each other?
If you add an asymmetric field to a struct, writers need to be updated to set the field for the code to compile.
If you also add an asymmetric field to a choice, readers need to be updated to be able to handle the new case for the code to compile.
You can do both in the same change. The new code can be deployed to the writers and readers in any order. Messages generated from the old code can be read by the new code and vice versa, so it's fine for both versions of the code to coexist during the rollout.
After that first change is rolled out, you can promote the new fields to required. This change can also be deployed to writers and readers in any order. Since writers are already setting the new field in the struct, it's fine for readers to start relying on it. And since readers can already handle the new case in the choice, it's fine for writers to start using it.
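A sketch of the two steps, using invented type and field names; the syntax follows the struct/choice examples in Typical's README, so details may differ. Step one adds the asymmetric fields:

```
struct SendEmailRequest {
  to: String = 0
  body: String = 1
  asymmetric subject: String = 2
}

choice SendEmailResponse {
  success = 0
  asymmetric rate_limited = 1
}
```

Once this change is fully rolled out, step two simply deletes the `asymmetric` keyword from both fields, since fields with no modifier are required by default.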
Ah, I was thinking in terms of the actual messages, not compilation. So it should read "Fields are required for compilation by default"?
This leads me to versioning. Imagine you have some old code which won't be upgraded, say an embedded system. Either you don't promote to "required", or you do versioning.
Given the lack of any mention of versioning, I take it that's to be dealt with externally? I.e., as a separate schema, detected before handing the rest of the data to the generated code?
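One common external approach (a sketch, not something Typical provides) is to prefix each payload with a version tag and dispatch to the matching generated deserializer. The envelope layout and the handler functions here are hypothetical stand-ins:

```typescript
// Hypothetical envelope: [1-byte schema version][payload bytes].
// The handlers stand in for generated deserializers, one per schema version.
type Handler = (payload: Uint8Array) => unknown;

const handlers: Record<number, Handler> = {
  1: (payload) => ({ version: 1, length: payload.length }), // stand-in for v1 codegen
  2: (payload) => ({ version: 2, length: payload.length }), // stand-in for v2 codegen
};

function decode(message: Uint8Array): unknown {
  const version = message[0]; // read the version tag first
  const handler = handlers[version];
  if (handler === undefined) {
    throw new Error(`unsupported schema version: ${version}`);
  }
  return handler(message.subarray(1)); // hand the rest to generated code
}
```

This keeps old embedded systems readable forever: their version's deserializer stays registered even as new versions are added.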
Do you think you could ever generate types for go? The protobuf implementation of oneof in go is pretty rough to look at, and not fun to type over and over.
Yes! We have comprehensive integration tests that run in the browser to ensure the generated code only uses browser-compatible APIs. Also, the generated code never uses reflection or dynamic code evaluation, so it works in Content Security Policy-restricted environments.