The blog post says it's available in the API too.
EDIT:
> GPT‑5.3 Instant is available starting today to all users in ChatGPT, as well as to developers in the API as ‘gpt-5.3-chat-latest.’ Updates to Thinking and Pro will follow soon. GPT‑5.2 Instant will remain available for three months for paid users in the model picker under the Legacy Models section, after which it will be retired on June 3, 2026.
Congrats guys! Curious how reliable the read/write splitting is in practice, given replication lag. Do you need to run the underlying cluster with synchronous replication?
The way we solved it was to check the LSN on the primary, and then wait for the replica to catch up to that LSN before doing reads on the replica in various scenarios.
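A minimal sketch of that fence in plain Ruby. The SQL in the comments uses PostgreSQL's real functions (`pg_current_wal_lsn()` on the primary, `pg_last_wal_replay_lsn()` on the replica); the function names and the connection plumbing around them are my own illustration, not the commenter's actual code:

```ruby
# PostgreSQL LSNs are text in "high/low" hex form, e.g. "16/B374D848".
# To compare them, convert to a single 64-bit integer.
def lsn_to_i(lsn)
  hi, lo = lsn.split("/").map { |part| part.to_i(16) }
  (hi << 32) | lo
end

# On the primary, right after the write:
#   SELECT pg_current_wal_lsn();
# On the replica, before routing a read there:
#   SELECT pg_last_wal_replay_lsn();
# Route the read to the replica only once it has replayed past the
# primary's LSN at write time.
def replica_caught_up?(primary_lsn, replica_lsn)
  lsn_to_i(replica_lsn) >= lsn_to_i(primary_lsn)
end
```

In practice you would poll `replica_caught_up?` with a timeout and fall back to the primary if the replica doesn't catch up quickly enough.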
Not really, replication lag is generally an accepted trade-off. Sync replication is rarely worth it: you take a ~30% performance hit on commits and add more points of failure.
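For context, this is roughly what opting into full synchronous replication looks like in `postgresql.conf`; the values are illustrative, not a recommendation. Note that of the `synchronous_commit` levels, only `remote_apply` actually guarantees a subsequent replica read sees the commit, and it is also the slowest:

```
# postgresql.conf on the primary (illustrative)
synchronous_standby_names = 'ANY 1 (replica1, replica2)'

# remote_apply waits until the standby has *applied* the commit,
# which is what read-your-writes on a replica would require.
synchronous_commit = remote_apply
```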
We will add some replication lag-based routing soon. It will prioritize replicas with the lowest lag to maximize the chance of the query succeeding and remove replicas from the load balancer entirely if they have fallen far behind. Incidentally, removing query load helps them catch up, so this could be used as a "self-healing" mechanism.
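A rough sketch of that routing policy (the struct, names, and threshold here are hypothetical, not the actual implementation): ban replicas whose lag exceeds a cutoff so they can catch up unloaded, and prefer the freshest of the rest.

```ruby
Replica = Struct.new(:host, :lag_seconds)

# Pick a replica for a read: drop anything lagging beyond max_lag
# (removing its query load lets it catch up), then route to the
# replica with the lowest measured lag. Returns nil if none qualify,
# in which case the caller would fall back to the primary.
def route(replicas, max_lag: 30.0)
  healthy = replicas.reject { |r| r.lag_seconds > max_lag }
  healthy.min_by(&:lag_seconds)
end
```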
It sounds like this is one of the few places that might be a leaky abstraction in that queries _might_ fail and the failure might effectively be silent?
It can be silent, but usually it's loud and confusing because people do something like this (Rails example):
    user = User.create(email: "test@test.com")
    SendWelcomeEmail.perform_later(user.id)
And the job code fetches the row like so:
    user = User.find(id)
This blows up because `find` raises an error if the record isn't there, and job queues typically route their reads to replicas. It's a common gotcha: code that runs async expects the data to be there right after creation.
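The usual mitigation is to make the job tolerate the race instead of assuming the row exists; in Rails, `retry_on ActiveRecord::RecordNotFound` on the job class does exactly this. A dependency-free sketch of the same idea (the error class and helper are stand-ins, not Rails API):

```ruby
# Stand-in for ActiveRecord::RecordNotFound.
class RecordNotFound < StandardError; end

# Retry a read that may race replication lag: if the record isn't
# visible yet, back off and try again while the replica catches up.
def with_replica_retry(attempts: 3, base_delay: 0.0)
  tries = 0
  begin
    yield
  rescue RecordNotFound
    tries += 1
    raise if tries >= attempts
    sleep(base_delay * tries) # linear backoff between attempts
    retry
  end
end

# Simulated replica: the row only becomes visible on the third read.
reads = 0
user = with_replica_retry(attempts: 5) do
  reads += 1
  raise RecordNotFound if reads < 3
  { id: 42, email: "test@test.com" }
end
```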
There can be others, of course, especially in fintech where you have an atomic ledger, but people are usually pretty conscious about this and send those types of queries to the primary.
In general though, I completely agree, this is leaky and an unsolved problem. You can have performance or accuracy, but not both, and most solutions skew towards performance and make applications handle the lack of accuracy.
Why is that? Personally I appreciated the throwback look, and it probably accomplished its goal of being memorable. Turbopuffer is another notable one seemingly leaning into this flavor of marketing.
Tight integration with the TypeScript toolchain has been great for us with EdgeQL, and it's about an order of magnitude less error-prone than the ORMs I've interacted with. Gel is a winning formula, especially in the TypeScript world.
We used TipTap to great effect in an old iteration of our product at credal.ai. It helped us create nuanced text tagging behavior without too much time investment. Would happily recommend it.
The AI Chief of Staff had a few layers. The first was data integration of both productivity data (Slack, Notion, etc.) and "big data" lakes/warehouses: the former tells you what is getting done at a human level, and the latter has the potential to tell you whether and how it is working. The second layer was modeling your business strategy, including dependencies between concepts like projects and teams, which allowed us to back out things like stakeholders and early-warning recipients for any given piece of progress or problem. The third was a presentation layer that gave humans a bird's-eye view of what's happening, including generating artifacts like meeting decks.
Ultimately this 1) wasn't successfully solving an urgent enough problem for most businesses and 2) was too difficult to adopt.
LLMs do break open opportunities in this space so I expect to see some more versions of this, perhaps on top of the Credal API!
Thank you! We haven't gone down the Foundry route yet. We do have some smaller-scale apps and companies using Credal, either as their AI API or as a chat platform. Would be interested to hear a bit about your use case and see if it's a match?
There was definitely a browsing component, at least originally. I remember seeing part of the prompt leaked, and it had something like "internet browsing: disabled" in it.