You're getting modded down, but I think this is a valid concern, or at least one that shouldn't be dismissed out of hand.
Yeah, Nazis = bad. We'll just take that as a given.
However, Nazis also did a lot of the pioneering work in rocketry and jet propulsion.
If you try to ban everything associated with Nazis, or that was performed by a Nazi, you may accidentally block things that you didn't want to.
Maybe a better strategy would be to not be so goddamned concerned about offending someone.
As far as Nazi propaganda goes, my high school had a copy of Mein Kampf sitting right there on the shelf. You can't get any more "Nazi propaganda" than that. Yet somehow none of us turned out to be, you know, Nazis.
> If you try to ban everything associated with Nazis, or that was performed by a Nazi, you may accidentally block things that you didn't want to.
No, what I'm saying is you can't ban everything associated with Nazis and nothing else in an LLM, because neural nets don't work like that and you're simply unable to ban something without influencing all the results. Which is worse than just banning info about Wernher von Braun.
I may be wrong, considering my knowledge of neural networks is limited, but so far I got one downvote and no explanation...
> No, what I'm saying is you can't ban everything associated with Nazis and nothing else in an LLM, because neural nets don't work like that and you're simply unable to ban something without influencing all the results. Which is worse than just banning info about Wernher von Braun.
You don't have to do it inside a single model. You can have a complex of models where one of them selects the almost-final output and, if it has Nazi references, raises an indicator. The system orchestrating the models recognizes that indicator and either reprompts for a correction (if it is the first time) or returns a canned response (if a suitable response cannot be generated in enough tries).
Probably still has some impact on other answers (because the detection layer is probably not 100% accurate, and if you want a near-zero miss rate on detection you probably have to accept some false positive rate), but you can get a lot closer than relying on a single pass through a single model.
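To make the orchestration idea concrete, here is a minimal sketch of that loop. Everything in it is an assumption for illustration: `generate` and `is_flagged` are hypothetical stand-ins for the generating model and the detection model, the canned text is made up, and the toy detector is just a substring check, not a real classifier.

```python
from typing import Callable, Iterable

# Hypothetical fallback text; a real system would pick its own wording.
CANNED_RESPONSE = "Sorry, I can't help with that."


def moderated_reply(
    prompt: str,
    generate: Callable[[str], str],
    is_flagged: Callable[[str], bool],
    max_tries: int = 3,
) -> str:
    """Return the first draft the detector accepts, else a canned response."""
    current_prompt = prompt
    for _ in range(max_tries):
        draft = generate(current_prompt)
        if not is_flagged(draft):
            return draft
        # The detector raised its indicator: reprompt for a correction.
        current_prompt = prompt + "\n\nRewrite the answer without the flagged content."
    # No acceptable draft within the allowed number of tries.
    return CANNED_RESPONSE


def toy_detector(text: str) -> bool:
    """Toy stand-in for the detection model (assumption: a substring check)."""
    return "forbidden" in text.lower()


def make_toy_generator(drafts: Iterable[str]) -> Callable[[str], str]:
    """Toy stand-in for the generating model: replays fixed drafts in order."""
    remaining = iter(drafts)

    def generate(_prompt: str) -> str:
        return next(remaining)

    return generate


if __name__ == "__main__":
    gen = make_toy_generator(["a forbidden draft", "a clean draft"])
    print(moderated_reply("some question", gen, toy_detector))
```

The main model is never modified here, so clean answers pass through untouched. The cost is whatever false positive rate the detector has, which is the tradeoff described above.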
I think using a less charged example than "Nazis" would probably have helped avoid the downvotes & lack of engagement, but I understand why you chose it as an example & personally don't take issue with it, especially because you elaborated on why you picked them as an example. Just my $0.02