
I'm pretty sure the GPT model is huge and does not fit on any conventional GPU. Even if they open-sourced the weights, I don't think most people would be running it at home.

Also, regarding the text limits: AFAIK there's just an inherent limit in the architecture. Transformers are trained on finite-length sequences (I think their latest uses 4096 tokens). I've been trying to understand how ChatGPT seems to manage context/understanding beyond this window length.



I don't think ChatGPT does. I have had long discussions with it, with some rules agreed upon in the beginning, and at some point it clearly begins to forget the exact rules and has to be reminded of them.

(Specifically, AI Dungeon-style games where ChatGPT is the DM and the human the protagonist, or vice versa. The most common failure mode seems to be that it forgets whether it's playing the DM or the protagonist. To be fair, it performs admirably well despite the limitations.)


In a previous thread (which I cannot find right now) the recommendation was to either ask it to summarize what happened earlier or to do that summarizing yourself from time to time.
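Roughly the idea, as a sketch (summarize() here is just a placeholder for another model call or a manual recap, and the turn count is arbitrary):

    # Once the history gets long, replace older turns with a compact
    # recap so the agreed-upon rules stay inside the context window.
    SUMMARIZE_AFTER = 20  # turns to keep verbatim before compacting

    def compact_history(history, summarize):
        if len(history) <= SUMMARIZE_AFTER:
            return history
        old, recent = history[:-SUMMARIZE_AFTER], history[-SUMMARIZE_AFTER:]
        recap = "Recap of the story and rules so far: " + summarize(old)
        return [recap] + recent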


I read that it just re-reads the discussion so far every time you submit. So it must hit a limit of what it can remember, since they cap the number of tokens it can read per submission.
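Something like this on the client side, as a rough sketch (the drop-oldest truncation and the 4-chars-per-token estimate are my assumptions, not necessarily how OpenAI does it):

    CONTEXT_LIMIT = 4096  # tokens the model can attend to at once

    def count_tokens(text):
        return len(text) // 4  # crude estimate; a real tokenizer differs

    def build_prompt(history, new_message):
        # Resend the whole conversation each turn, dropping the oldest
        # messages once the token budget is exceeded.
        messages = history + [new_message]
        while sum(count_tokens(m) for m in messages) > CONTEXT_LIMIT:
            messages.pop(0)  # earlier context silently falls out of the window
        return messages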


Yes, I know. It’s a pure function with no mutable state.


Is ChatGPT its own model? I thought ChatGPT was just GPT-3 with an easier-to-use interface.


It's based on GPT-3 but fine-tuned specifically to predict sequences that look like coherent dialogue, via reinforcement learning from human feedback (a reward model trained partly on human preference ratings). The resulting model is also reportedly quite a bit smaller than the full GPT-3. It's much more difficult to get GPT-3 itself to engage in reasonable dialogue than ChatGPT.


Yeah, it wouldn't fit. GPT-3 is 175B params, so even if you use 8 bits for each weight, you need 175×10^9 bytes ÷ 2^30 ≈ 163 GiB of memory.
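Back-of-the-envelope version:

    # Memory needed just to hold 175B weights at various precisions.
    params = 175e9
    for bits in (8, 16, 32):
        gib = params * bits / 8 / 2**30
        print(f"{bits:>2}-bit weights: {gib:,.0f} GiB")
    # 8-bit: ~163 GiB, 16-bit: ~326 GiB, 32-bit: ~652 GiB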


https://www.reddit.com/r/ChatGPT/comments/zhzjpq/comment/izo...

>It's around 500 GB and requires around 300+ GB of VRAM from my understanding, and runs on one of the largest supercomputers in the world. Stable Diffusion has around 6 billion parameters; GPT-3/ChatGPT has 175 billion.


Wouldn’t that be possible with about 4 powerful GPUs? Or does it not work like that?


Possibly, but that would be tens of thousands of dollars' worth of GPUs.


Silly question: how does OpenAI host/serve it?


I think on professional hardware you can get 80 GB of memory per GPU, and they can likely pool memory across multiple GPUs.
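For a rough sense of scale (weights only, ignoring activations, KV cache and framework overhead; the fp16 assumption is mine):

    import math

    params = 175e9
    bytes_per_param = 2      # fp16/bf16 weights
    gpu_memory_gib = 80      # e.g. an 80 GB datacenter card

    weights_gib = params * bytes_per_param / 2**30
    print(f"weights: ~{weights_gib:,.0f} GiB")                                  # ~326 GiB
    print(f"cards to shard across: {math.ceil(weights_gib / gpu_memory_gib)}")  # 5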



