People pay for convenience, that's true, and it's part of the equation here. Agreed! The approach is to make data capture as convenient as possible: you just paste an API key and base URL into your existing code, and you gather all your runs. And reinforcement learning is hard to figure out, so one of our goals is to commoditize it, which is what you're alluding to. In its first iteration, the platform ships with a verifiable mode where Augento takes away all the headaches of GPU infrastructure, GRPO implementation, training configuration, and dataset curation: you just select your gathered runs and start the training. But we plan to go past that and expand Augento into a platform for alignment and self-learning.
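To illustrate the "paste in API key + base URL" part: most OpenAI-compatible SDKs read their endpoint and key from configuration, so swapping them is often all it takes. This is a generic sketch, not Augento's documented setup; the endpoint below is a placeholder, not a real URL.

```shell
# Hypothetical values -- substitute the base URL and key from your provider's dashboard.
export OPENAI_BASE_URL="https://<your-augento-endpoint>/v1"
export OPENAI_API_KEY="<your-augento-api-key>"
# Your existing code keeps calling the OpenAI-compatible API unchanged;
# each request/response pair can then be captured as a run for later training.
```

The OpenAI Python SDK (v1+), for example, picks up both variables automatically, so no code changes are needed beyond the configuration.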
TL;DR: Yes, indeed! We designed Augento with convenience in mind.
In a sense: you are not wrong! But when we got started, we thought it was way easier than it actually is. Procuring powerful GPUs alone is difficult, and so is collecting proper data. But of course you can still do everything yourself. If you want to give it a try, I'd recommend taking a look at torchtune (https://github.com/pytorch/torchtune).
People not in the field have no idea just how distorted the market is right now.
I was working at a startup doing end-to-end training for modified BERT architectures, and everything was designed to burn money as fast as possible:

From buying a GPU: basically impossible right now; we ended up looking at sourcing franken cards _from_ China.

To power and heat removal: you need a large factory's worth of power in the space of a small flat.

To pre-training something that hasn't been pre-trained before: say hello to throwing out more than 80% of pretraining runs because of a novel architecture.
Without hugely deep pockets, a contract with NVIDIA, and a datacenter right next to a nuclear power plant, you can't compete at the model level.
You are right. If you want to (and can) pay out of your own pocket, RunPod (https://www.runpod.io) deserves a shoutout here. We rented GPUs from them (they actually have them, and they're cheaper and more available than Lambda Labs) until we convinced AWS to give us capacity blocks.
But in general, GPU prices and scarcity are really stark, and unlike mining you can't fall back on gaming or franken cards. I can count on one hand the GPU models we can run this on (even for relatively small models).
Yes, you could do that.
However, you would have created a different platform than Augento. Maybe we should make the distinction clearer, though.
The blog article you are referring to uses another method to fine-tune models, one that many other big platforms like Together AI (and even OpenAI themselves) already support: Supervised Fine-Tuning (SFT). We do Reinforcement Learning using GRPO instead.
SFT has the big caveat that it requires good prompt-completion datasets to work, and those are rare or hard to curate for many use cases. With GRPO, you (the programmer) don't even need to know what the correct answer is, as long as you can decide whether a given answer is good. It's essentially P vs. NP at its heart: verifying a solution is much easier than producing one.
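To make the "verify, don't solve" idea concrete, here is a toy reward function in the spirit of the reward signals GRPO trains against. This is a sketch I'm making up for illustration, not Augento's API: checking an integer factorization is trivial even though producing one is hard, so we can score model completions without knowing the answer in advance.

```python
def reward(n: int, completion: str) -> float:
    """Score a completion claiming to factor n, e.g. "7*13" for n=91.

    Returns 1.0 for a valid non-trivial factorization, else 0.0.
    We never need to factor n ourselves -- we only multiply back and check.
    """
    try:
        factors = [int(tok) for tok in completion.split("*")]
    except ValueError:
        # Completion wasn't even parseable as a product of integers.
        return 0.0
    product = 1
    for f in factors:
        product *= f
    ok = product == n and len(factors) >= 2 and all(f > 1 for f in factors)
    return 1.0 if ok else 0.0
```

With SFT you would need a dataset of correct factorizations up front; here the cheap verifier alone is enough to provide a training signal.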