>"Despite the effectiveness of current unlearning methods, little attention has been given to whether existing unlearning methods for LLMs truly achieve forgetting or merely hide the knowledge..."
This is a great question as it applies to LLMs (and philosophically, as it applies to knowledge in general)... in the context of an LLM, what is "forgetting", what is "remembering", and can things "learned" by an LLM be "unlearned"? If so, how, and what does that mean mathematically and computationally?
And, can an LLM be made to re-teach itself, from its existing knowledge and through logical processes (implication, derivation, inductive reasoning, deductive reasoning, etc.), things that it previously forgot?
And, if so, what's the tiniest kernel of an LLM that would be able to do that, and why?
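Not an answer, but for concreteness on the "mathematically and computationally" part: one common formulation of machine unlearning in the literature is a gradient-difference objective, i.e. ascend the loss on a "forget" set while descending it on a "retain" set. Below is a minimal sketch assuming a HuggingFace-style causal LM whose forward pass returns a `.loss`; every name here (`unlearning_step`, `forget_batch`, `retain_batch`, `alpha`) is hypothetical, and this is not necessarily what the paper under discussion evaluates:

```python
def unlearning_step(model, optimizer, forget_batch, retain_batch, alpha=1.0):
    """One gradient-difference step: raise loss on 'forget' data, keep loss low on 'retain' data.

    Assumes a HuggingFace-style causal LM that accepts input_ids, attention_mask,
    and labels, and returns an object with a .loss field. Batch names are placeholders.
    """
    model.train()
    optimizer.zero_grad()

    # Standard next-token cross-entropy on text the model should keep behaving normally on.
    retain_loss = model(
        input_ids=retain_batch["input_ids"],
        attention_mask=retain_batch["attention_mask"],
        labels=retain_batch["input_ids"],
    ).loss

    # Cross-entropy on text we want "forgotten"; subtracting it below performs
    # gradient ascent, pushing the model away from reproducing this data.
    forget_loss = model(
        input_ids=forget_batch["input_ids"],
        attention_mask=forget_batch["attention_mask"],
        labels=forget_batch["input_ids"],
    ).loss

    # Combined objective: descend on retain, ascend on forget (weighted by alpha).
    loss = retain_loss - alpha * forget_loss
    loss.backward()
    optimizer.step()
    return retain_loss.item(), forget_loss.item()
```

Whether optimizing something like this counts as genuine "forgetting" or just as burying the knowledge under new gradients is, I think, exactly the paper's question.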
(I suspect this isn't the first paper and won't be the last paper about that subject matter...)