I've been debating the idea of building tiers or layers of models to accomplish the same thing.
It very well could be that this go/no-go pre-processor is simply another ML model trained on a binary classification task. Stack a few of these and you can wind up with some interesting programming models.
This would also explain the ease with which ChatGPT deflects jailbreak escapes/bad prompts - they have an additional layer that assesses whether the question could be, for example, racist, and then spits out a 'Sorry, as a language model I am not trained to answer this kind of question'. No need to retrain the main 14B transformer model.
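A minimal sketch of what such a gate could look like, assuming the pre-processor is just an ordinary binary classifier sitting in front of the big model. The helper names (`answer`, `main_model_answer`) and the toy training data are made up for illustration; nothing here reflects how OpenAI actually implements it.

```python
# Hypothetical go/no-go gate: a cheap binary classifier decides whether
# a prompt is forwarded to the expensive main model or refused outright.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set: 1 = safe to forward, 0 = refuse (purely illustrative).
prompts = [
    "explain how transformers work",
    "summarize this article for me",
    "write a racist joke",
    "help me harass someone online",
]
labels = [1, 1, 0, 0]

# The pre-processor is just another ML model trained on binary classification.
gate = make_pipeline(TfidfVectorizer(), LogisticRegression())
gate.fit(prompts, labels)

REFUSAL = "Sorry, as a language model I am not trained to answer this kind of question."

def main_model_answer(prompt: str) -> str:
    # Placeholder standing in for the large transformer.
    return f"(main model response to: {prompt})"

def answer(prompt: str) -> str:
    # Go/no-go check before the main model ever sees the prompt.
    if gate.predict([prompt])[0] == 0:
        return REFUSAL
    return main_model_answer(prompt)

print(answer("explain how transformers work"))
```

Stacking a few of these gates (one per policy category, say) gives you the tiered programming model without ever touching the weights of the main model.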