The problem with all of these is that the SOTA models keep changing. I thought about getting OpenAI's Pro subscription, but then Gemini flew ahead and was free. If I buy in now, sooner or later OpenAI or Anthropic will be back on top.
I wonder if there's an opportunity here to abstract away these subscription costs and offer a consistent interface and experience?
For example - what if someone were to start a company around a fork of LiteLLM? https://litellm.ai/
LiteLLM, out of the box, lets you create a number of virtual API keys. Each key can be assigned to a user or a team, and can be granted access to one or more models (and their associated keys). Models are configured globally, but can have an arbitrary number of "real" and "virtual" keys.
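As a rough sketch of what that setup looks like, a LiteLLM proxy config maps a public model name to a provider-specific model and its real key (I'm writing this from memory, so treat the exact field names as approximate):

```yaml
model_list:
  # One public name per model; the "real" provider key lives behind it.
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```

Virtual keys are then issued per user or team against the proxy and scoped to some subset of these model names, so end users never see the underlying provider keys.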
Then you could sell access to a host of primary providers - OpenAI, Google, Anthropic, Groq, Grok, etc. - through a single API endpoint and key. Users could switch between them by changing a line in a config file or choosing a model from a dropdown, depending on their interface.
Assuming you're able to build a reasonable userbase, presumably you could then contract directly with providers for wholesale API usage. Pricing would be tricky, since part of your value prop is abstracting away marginal costs, but I strongly suspect that very few people actually consume the full API quotas on these $200+ plans, and those who do are likely already working directly with the providers to reduce both cost and latency.
The other value you could offer is consistency. Your engineering team's core mission would be providing a consistent wrapper for all of these models - translating between OpenAI-compatible, Llama-style, and Claude-style APIs on the fly.
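The core of that wrapper is mechanical request translation. As a toy illustration of one direction of it, here's a sketch (my own hypothetical helper, not anything from LiteLLM) of mapping an OpenAI-style chat request to a Claude-style one, where the main differences are that Anthropic takes the system prompt as a top-level field and requires `max_tokens`:

```python
def openai_to_claude(payload: dict) -> dict:
    """Translate an OpenAI-style chat request body into an Anthropic-style one.

    Toy sketch only: a real adapter also has to map tool calls, streaming
    chunks, stop reasons, and image content blocks.
    """
    messages = payload["messages"]
    # Anthropic wants the system prompt as a top-level field, not a message.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    return {
        "model": payload["model"],
        "system": "\n".join(system_parts),
        # max_tokens is required by Anthropic; 1024 here is an arbitrary default.
        "max_tokens": payload.get("max_tokens", 1024),
        "messages": [m for m in messages if m["role"] != "system"],
    }
```

Multiply that by every provider quirk (streaming formats, function calling, error shapes) and you get a real, ongoing engineering job rather than a one-off shim.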
Is there already a company doing this? If not, do you think this is a good or bad idea?
I think the biggest hurdle would be complying with the TOS. I imagine OpenAI et al. would not be fans of sharing quotas across individuals in this way.
By Google's own reported benchmarks, the Gemini 2.5 Pro 05/06 release was worse than the 3/25 version in 10 of 12 cases. Google then rerouted all API traffic for the 3/25 checkpoint to the 05/06 version.
I’m also unsure who needs all of these expanded quotas because the old Gemini subscription had higher quotas than I could ever anticipate using.
I have the same concerns. To push people to the ultra tier and earn their bonuses, they're going to use dark patterns.
The only reason I maintain Claude and OpenAI subscriptions is because I expect Google to pull the rug on what has been their competitive advantage since Gemini 2.5.
Have you also noticed a degradation in quality over long chat sessions? I've noticed it in NotebookLM specifically, but not Gemini 2.5. I anticipate this becoming the standard: your chat degrades subtly over time.
You can just surf between Gemini, DeepSeek, Qwen, etc. using them for free. I can't see paying for any AI subscription at this point as the free models out there are quite good and are updated every few months (at least).
I am willing to pay for up to 2 models at a time, but I am constantly swapping subscriptions around. I think I've started and cancelled GPT and Claude subscriptions at least 3-4 times each.
This 100%. Unless you are building a product around the latest models and absolutely must squeeze the latest available oomph, it's more advantageous to just wait a little bit.
I wonder why anyone would pay these days, unless it's for features outside of the chatbot. Between Claude, ChatGPT, Mistral, Gemini, Perplexity, Grok, DeepSeek and so on, how do you ever really run out of free "wannabe pro"?