Hacker News

Intuitively I've always been a bit skeptical of quantization. Wouldn't there be a tiny loss in precision by doing this type of quantization? I could imagine the error function increasing by utilizing these types of techniques.


John Carmack pointed out (and I learned it here at HN) that what training really needs is the *sign* of each individual gradient parameter. I.e., you can quantize the gradient to -1, 0 and 1 and still have the neural network learn much of the dataset.
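A minimal sketch of that idea (sign-quantized gradient descent, sometimes called signSGD) on a toy one-parameter problem; the setup and learning rates here are invented for illustration:

```python
import numpy as np

# Toy problem: fit y = 2x with plain SGD vs. sign-quantized SGD,
# where each gradient is replaced by sign(g) in {-1, 0, 1}.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 256)
y = 2.0 * x

def train(quantize, lr, steps=500):
    w = 0.0
    for _ in range(steps):
        grad = np.mean(2 * (w * x - y) * x)  # d/dw of the MSE loss
        if quantize:
            grad = np.sign(grad)             # keep only the sign
        w -= lr * grad
    return w

w_full = train(quantize=False, lr=0.1)   # converges to ~2.0
w_sign = train(quantize=True, lr=0.01)   # also lands near 2.0, stepping by a fixed lr
```

Both runs end up near the true weight; the sign-only version just trades gradient magnitude for a fixed step size, so it oscillates within one step of the optimum instead of converging smoothly.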


Why isn't John Carmack working for OpenAI? Hell, why did he waste years at Meta to work on a VR headset and NOT AI? He even announced he wants to focus on AGI but he missed out on literally all the action.


He has his own AGI startup now: https://dallasinnovates.com/john-carmacks-keen-technologies-...

TBH I think they won't get anywhere. Doing good game engine work... why would that translate to AGI?


That game engine was over 3 decades ago! John is one of the sharpest minds I've ever seen; if he's passionate about AGI, he surely has a much deeper understanding of what he's doing than the AI trendies on social media.


Let me introduce you to the wonderful game that is The Talos Principle: https://en.wikipedia.org/wiki/The_Talos_Principle

It discusses whether it is possible to evolve AGI using... a computer game engine! And that is John's bread and butter.


Wow! Is there a link to read up more on this?


  > It is interesting that things still train even when various parts are pretty wrong — as long as the sign is right most of the time, progress is often made.
https://forums.fast.ai/t/how-to-do-reproducible-models-and-u...


They seem to be doing the weight updates at higher precision, though: the optimizer keeps a full-precision master copy of the weights.
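A sketch of why that master copy matters (this is a hand-rolled illustration, not any framework's API): updates that are far smaller than a weight's fp16 rounding granularity simply vanish if you update the fp16 weight directly, but survive in a float32 copy.

```python
import numpy as np

lr = 1e-3
grad = 0.1                      # pretend gradient; the step is lr*grad = 1e-4

w16 = np.float16(1.0)           # naive: update the low-precision weight directly
w32 = np.float32(1.0)           # master copy kept in float32 by the optimizer

for _ in range(1000):
    # Near 1.0, fp16 spacing is ~4.9e-4, so a 1e-4 step rounds away entirely.
    w16 = np.float16(w16 - np.float16(lr * grad))
    # In float32 the same step is easily representable and accumulates.
    w32 = np.float32(w32 - np.float32(lr * grad))

# w16 is still 1.0 (every update was lost to rounding); w32 has moved to ~0.9.
```

In real mixed-precision training the fp16 weights used for the forward/backward pass are re-derived from the fp32 master each step, which is exactly the "copy" the optimizer keeps.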


It does increase the "error" (meaning it is less likely to predict the next word when compared against a dataset), but the quality loss is smaller than your intuition would lead you to believe.


Quantization does reduce the quality of the outputs. But the point is that you save enough memory doing so that you can cram a larger model into the same hardware, and this more than compensates for the lost precision.


Yes, each weight can "learn" less if it has fewer bits of precision. But the idea is that you can use more weights, and the big question is whether many low-precision weights make the model more accurate as a whole.
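To make the tradeoff concrete, here is a toy sketch of simple symmetric int8 weight quantization (the scheme and numbers are illustrative, not a specific library's implementation): each weight drops from 4 bytes to 1, so the same memory budget holds ~4x more weights, at the cost of a bounded per-weight error.

```python
import numpy as np

# Fake weight tensor, roughly the scale of typical trained weights.
w = np.random.default_rng(1).normal(0, 0.02, 4096).astype(np.float32)

scale = np.abs(w).max() / 127.0                 # map the largest weight to +/-127
w_int8 = np.round(w / scale).astype(np.int8)    # stored form: 1 byte per weight
w_deq = w_int8.astype(np.float32) * scale       # approximate reconstruction

mem_ratio = w.nbytes / w_int8.nbytes            # 4.0: room for ~4x more weights
max_err = np.abs(w - w_deq).max()               # round-to-nearest: at most scale/2
```

The per-weight error is capped at half the quantization step, which is the precision each weight gives up in exchange for the 4x memory saving.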



