The problem with all of these is that the SOTA models keep changing. I thought about getting OpenAI's Pro subscription, but then Gemini flew ahead and was free. If I buy in now, sooner or later OpenAI or Anthropic will be back on top.
I wonder if there's an opportunity here to abstract away these subscription costs and offer a consistent interface and experience?
For example - what if someone were to start a company around a fork of LiteLLM? https://litellm.ai/
LiteLLM, out of the box, lets you create a number of virtual API keys. Each key can be assigned to a user or a team, and can be granted access to one or more models (and their associated keys). Models are configured globally, but can have an arbitrary number of "real" and "virtual" keys.
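As a rough sketch of what that setup looks like, a LiteLLM proxy config maps a public model name to a provider-specific model and its real key (I'm writing this from memory, so treat the exact field names as approximate):

```yaml
model_list:
  # One public name per model; the "real" provider key lives behind it.
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```

Virtual keys are then issued per user or team against the proxy and scoped to some subset of these model names, so end users never see the underlying provider keys.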
Then you could sell access to a host of primary providers - OpenAI, Google, Anthropic, Groq, Grok, etc. - through a single API endpoint and key. Users could switch between them by changing a line in a config file or choosing a model from a dropdown, depending on their interface.
Assuming you're able to build a reasonable userbase, presumably you could then contract directly with providers for wholesale API usage. Pricing would be tricky, since part of your value prop is abstracting away marginal costs, but I strongly suspect that very few people actually consume the full API quotas on these $200+ plans, and those who do are likely already working directly with the providers to reduce both cost and latency.
The other value you could offer is consistency. Your engineering team's core mission would be providing a consistent wrapper for all of these models - translating between OpenAI-compatible, Llama-style, and Claude-style APIs on the fly.
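The core of that wrapper is mechanical request translation. As a toy illustration of one direction of it, here's a sketch (my own hypothetical helper, not anything from LiteLLM) of mapping an OpenAI-style chat request to a Claude-style one, where the main differences are that Anthropic takes the system prompt as a top-level field and requires `max_tokens`:

```python
def openai_to_claude(payload: dict) -> dict:
    """Translate an OpenAI-style chat request body into an Anthropic-style one.

    Toy sketch only: a real adapter also has to map tool calls, streaming
    chunks, stop reasons, and image content blocks.
    """
    messages = payload["messages"]
    # Anthropic wants the system prompt as a top-level field, not a message.
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    return {
        "model": payload["model"],
        "system": "\n".join(system_parts),
        # max_tokens is required by Anthropic; 1024 here is an arbitrary default.
        "max_tokens": payload.get("max_tokens", 1024),
        "messages": [m for m in messages if m["role"] != "system"],
    }
```

Multiply that by every provider quirk (streaming formats, function calling, error shapes) and you get a real, ongoing engineering job rather than a one-off shim.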
Is there already a company doing this? If not, do you think this is a good or bad idea?
I think the biggest hurdle would be complying with the TOS. I imagine OpenAI et al. would not be fans of sharing quotas across individuals in this way.
By Google's own reported benchmarks, the Gemini 2.5 Pro 05/06 release was worse than the 3/25 version in 10 of 12 cases. Google then rerouted all API traffic for the 3/25 checkpoint to the 05/06 version.
I’m also unsure who needs all of these expanded quotas because the old Gemini subscription had higher quotas than I could ever anticipate using.
I have the same concerns. To push people to the ultra tier and earn their bonuses, they're going to use dark patterns.
The only reason I maintain Claude and OpenAI subscriptions is because I expect Google to pull the rug on what has been their competitive advantage since Gemini 2.5.
Have you also noticed a degradation in quality over long chat sessions? I've noticed it in NotebookLM specifically, but not Gemini 2.5. I anticipate this becoming the standard: your chat degrades subtly over time.
You can just surf between Gemini, DeepSeek, Qwen, etc. using them for free. I can't see paying for any AI subscription at this point as the free models out there are quite good and are updated every few months (at least).
I am willing to pay for up to 2 models at a time, but I am constantly swapping subscriptions around. I think I've started and cancelled GPT and Claude subscriptions at least 3-4 times each.
This 100%. Unless you are building a product around the latest models and absolutely must squeeze the latest available oomph, it's more advantageous to just wait a little bit.
I wonder why anyone would pay these days, unless it's for features outside of the chatbot. Between Claude, ChatGPT, Mistral, Gemini, Perplexity, Grok, DeepSeek and so on, how do you ever really run out of free "wannabe pro"?