
We wouldn't know how to construct those matrices, because we don't know which knowledge is represented where in the layers. One thing that helps a little is freezing the lower layers, so at least the model won't forget its most fundamental knowledge.
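A minimal sketch of that freezing approach in PyTorch, assuming a Hugging Face GPT-2 checkpoint (the attribute names transformer.wte and transformer.h are GPT-2 specific and differ per architecture):

    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("gpt2")

    # Freeze the token embeddings and the first N transformer blocks so
    # fine-tuning can only update the upper layers.
    N_FROZEN = 6
    for param in model.transformer.wte.parameters():
        param.requires_grad = False
    for block in model.transformer.h[:N_FROZEN]:
        for param in block.parameters():
            param.requires_grad = False

    # Only the still-trainable parameters are handed to the optimizer.
    optimizer = torch.optim.AdamW(
        (p for p in model.parameters() if p.requires_grad), lr=1e-5
    )

How many blocks to freeze is a judgment call; the fewer you freeze, the less of that fundamental knowledge is protected.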

Note that the only reason things are catastrophically forgotten is that the original examples are not shown again. If the model learns in a single shot, there may simply be no time to show both the old and the new examples. I don't think it would have a significant effect, though, or we'd have noticed it a lot sooner (i.e. the training of these LLMs would get less effective from a certain point).
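If (some of) the original data is still available, the usual mitigation is rehearsal: mix a fraction of old examples into every batch of new ones so the old knowledge keeps being reinforced. A rough sketch, with old_data and new_data as placeholder lists of training examples:

    import random

    def mixed_batches(old_data, new_data, batch_size=32, replay_frac=0.25):
        # Mix a fixed fraction of previously seen examples into every batch
        # of new examples, so the model keeps being reminded of them.
        n_old = int(batch_size * replay_frac)
        n_new = batch_size - n_old
        shuffled_new = list(new_data)
        random.shuffle(shuffled_new)
        for i in range(0, len(shuffled_new), n_new):
            batch = shuffled_new[i:i + n_new] + random.sample(list(old_data), n_old)
            random.shuffle(batch)
            yield batch

(This assumes old_data has at least n_old examples; in a single-shot setting, as noted above, you may not get the chance to do this at all.)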




You could simulate this by selectively locking and unlocking 'banks' of weights in a larger model, so their influence is kept during training and not lost. Sort of a selective write-protect.
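In PyTorch terms, that write-protect could be a per-bank gradient mask that is zeroed before the optimizer step. A sketch where each named parameter tensor of a toy model is treated as one 'bank' (the partitioning and helper names here are invented for illustration):

    import torch
    import torch.nn as nn

    # Toy model; each named parameter tensor is treated as one "bank".
    model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 512))
    write_protected = {name: False for name, _ in model.named_parameters()}

    def set_bank_protection(name, protected):
        # Lock or unlock one bank of weights.
        write_protected[name] = protected

    def protected_step(optimizer):
        # Zero the gradients of locked banks so the update can't touch them.
        for name, param in model.named_parameters():
            if write_protected[name] and param.grad is not None:
                param.grad.zero_()
        optimizer.step()

    # Example: write-protect the first linear layer while training the rest.
    set_bank_protection("0.weight", True)
    set_bank_protection("0.bias", True)

Toggling requires_grad on those banks would achieve much the same thing, but masking gradients makes it easier to lock and unlock banks mid-training without rebuilding the optimizer.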


> "which knowledge is represented where in the layers"

This seems like a ripe angle for evolving our understanding of the AI in LLMs... can we throw AIs at AIs (is AI synonymous with LLM?)? Can we throw LLMs at LLMs and have them recursively learn from themselves... or is that a Rat King? AI recognize AI in the GangPlane.


That's not how LLMs work. LLMs complete documents; they don't make statements about LLMs unless you explain to them how they should do it and give them all the information they need. If you could extract the information from an LLM well enough to supply it, along with an explanation of how to summarize the LLM's behaviour to a human, we would have already given that to a PhD student instead. A PhD student is a little slower than an LLM, but they require a lot less explanation.

In any case, looking at and understanding how a neural network encodes information is like gene editing. Perhaps you could isolate a gene in the human genome that achieves something interesting, like giving a child blue eyes. But even if you did, there's a chance that modifying that gene breaks something else and gives the child health risks. Since all neurons in a deep neural network are interconnected, there is a butterfly effect that makes them inherently somewhat of a black box.
