If somebody can show me a coding task that LLMs have successfully done that isn't an interview question or a documentation snippet, I might start to value them.
Spending a huge amount of resources to be a bit better at autocompleting code doesn't have value to me. I want it to solve significant problems, and it's looking like it can't do that, and scaling it until it can is totally impractical.
> In aggregate, training all 9 Code Llama models required 400K GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W).
That is:
* 45⅔ GPU-years
* 160 MWh, or...
* 45 average UK homes' annual electricity consumption
* 18 average US homes'
* 64 average drivers' annual mileage in an EV.
...and that's just the GPUs. Add on all the rest of the system(s).
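For anyone who wants to check those conversions, here's a back-of-the-envelope sketch in Python. The household and EV baselines are my own rough assumptions chosen to be consistent with commonly quoted averages; they are not figures from the Code Llama paper:

```python
# Sanity check of the numbers above. Only GPU_HOURS and the TDP range
# come from the paper; the comparison baselines are assumptions.
GPU_HOURS = 400_000       # total training compute, per the paper
TDP_KW = 0.4              # A100-80GB, upper end of the 350-400W TDP range

gpu_years = GPU_HOURS / (24 * 365.25)   # hours -> years
energy_kwh = GPU_HOURS * TDP_KW         # kWh drawn by the GPUs alone

UK_HOME_KWH = 3_550       # assumed avg UK household electricity use / year
US_HOME_KWH = 8_900       # assumed avg US household electricity use / year
EV_KWH = 2_500            # assumed ~10,000 miles/year at ~0.25 kWh/mile

print(f"{gpu_years:.1f} GPU-years, {energy_kwh / 1000:.0f} MWh")
print(f"~{energy_kwh / UK_HOME_KWH:.0f} UK homes, "
      f"~{energy_kwh / US_HOME_KWH:.0f} US homes, "
      f"~{energy_kwh / EV_KWH:.0f} EV drivers per year")
```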
In the grand scheme of things it's ancient history, but https://code-as-policies.github.io/ works by generating code and then executing it. That's worth looking at. The code generation in that paper was done on code-davinci-002, which is (or rather was - it's deprecated) a 15B GPT-3 model. I've not tried it yet, but I'd expect the open-source 7B code-completion models to be able to replicate it by now.
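Since the paper's core loop really is just "prompt a code model, run what comes back", here's roughly what that looks like. This is a minimal sketch, not the paper's actual interface: `generate_code` is a hypothetical stand-in for whatever completion endpoint you use, and the toy robot API is made up for illustration:

```python
# Sketch of the Code as Policies pattern: prompt a code model with the
# available API and a task, then execute the returned code directly.
def generate_code(prompt: str) -> str:
    """Hypothetical stand-in: call your code-completion model here."""
    raise NotImplementedError

def run_policy(instruction: str, api: dict) -> None:
    prompt = (
        "# Available robot API: " + ", ".join(api) + "\n"
        f"# Task: {instruction}\n"
    )
    code = generate_code(prompt)
    # The key move: the generated code *is* the policy. Execute it with
    # only the whitelisted API in scope (no builtins, as a crude guard).
    exec(code, {"__builtins__": {}}, dict(api))

# Toy API the generated code is allowed to call:
api = {
    "move_to": lambda x, y: print(f"move_to({x}, {y})"),
    "grasp": lambda: print("grasp()"),
}
# run_policy("pick up the block at (3, 4)", api)
```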