It is pleasantly surprising to see how close your pipeline is to mine: essentially a good representation layer, usually a BERT-family model like MiniLM or MPNet, followed by a calibrated linear SVM. Sometimes I replace the SVM with LightGBM if I have non-language features.
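Concretely, the per-model step looks roughly like the sketch below, using sentence-transformers and scikit-learn. The model name, the toy data, and the sigmoid (Platt) calibration choice are all placeholders, not the One True Config:

```python
from sentence_transformers import SentenceTransformer
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

# Representation layer: a frozen BERT-family sentence encoder.
# "all-MiniLM-L6-v2" is just one common choice.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Placeholder data; in practice this comes from your labeled corpus.
texts = [
    "great service, would buy again", "loved it", "works perfectly",
    "terrible experience", "broke after a day", "would not recommend",
]
labels = [1, 1, 1, 0, 0, 0]

X = encoder.encode(texts)  # dense sentence embeddings

# Linear SVM wrapped in sigmoid calibration so it emits probabilities.
clf = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=3)
clf.fit(X, labels)

print(clf.predict_proba(encoder.encode(["surprisingly solid product"])))
```

The LightGBM variant would be the same shape: np.hstack the embeddings with the tabular features and swap LGBMClassifier in for the calibrated SVM.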
If I am building a set of models for a domain, I might fine-tune the representation layer. On a per-model basis I typically just train the SVM and calibrate it. Given how little time this whole pipeline takes (not counting the occasions when I fine-tune), it works amazingly well.
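When I do fine-tune, it's roughly something like this, sketched with the classic sentence-transformers training API. The pair data and the contrastive loss are assumptions; positive pairs mined from the domain corpus are just one way to adapt the encoder:

```python
from torch.utils.data import DataLoader
from sentence_transformers import InputExample, SentenceTransformer, losses

model = SentenceTransformer("all-mpnet-base-v2")

# Placeholder positive pairs; in practice, mined from the domain corpus.
train_examples = [
    InputExample(texts=["forgot my password", "cannot log in to my account"]),
    InputExample(texts=["package never arrived", "where is my delivery"]),
]
loader = DataLoader(train_examples, shuffle=True, batch_size=2)

# Pulls in-domain positives together, pushes in-batch negatives apart.
loss = losses.MultipleNegativesRankingLoss(model)
model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=10)
model.save("domain-tuned-encoder")
```

After that, every model in the domain gets its embeddings from the tuned encoder, and the per-model work is still just SVM plus calibration.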