True - I think the key is "empirical probability", so just keying on the combina...

tysam_and · on Feb 25, 2023

Oh yeah, good points. And citations! Thanks so much, really appreciated.

I'm doing work currently on this and hopefully it will yield some fruit -- at least, the work relating on the internals of what's happening inside of Transformers. Just from slowly absorbing the research over the years I have a few gut hypotheses about what is happening. Hopefully any of the work I do will yield some fruit, I think a good chunk of what is happening is surprisingly standard, just hidden due to the complexity of millions of parameters slinging information hither and yon.

Thanks again for putting all of the thought and effort into your post, I really appreciate it. This is something I love about being here in this particular place! :D

xg15 · on Feb 25, 2023

That sounds extremely interesting! I'm still trying to understand the basic transformer architecture so far, but I think those kinds of insights are exactly what we'll need if we don't want the whole field to degrade to alchemy with no one understanding what is going on.

Do you have a blog?

tysam_and · on Feb 26, 2023

I do not, (unfortunately? Maybe I need one?), though if you want you can follow me on GitHub at https://github.com/tysam-code. I try to update/post to my projects regularly, as I'm able to. It alternates between different ones, though LLMs are the main focus right now, if I can get something to an (appropriately and healthfully) publishable state. :3 :)

I try to be skeptical about certain possibilities within the field, but I do feel bullish about us being able to at least tease out some of the structure of what's happening due to how some properties of transformers work. At least, I think it'll be easier than figuring out how certain brain informational structures work (which has happened to some tiny degree, and will be I think even cooler in the future)! :) XD :DDDD :)