Are there any local models that provide only very limited knowledge, e.g. autocomplete and chat for a single programming language? Or is my thinking that such a limited model would be smaller and run faster, even on CPU, incorrect?
I believe JetBrains' built-in models work like that, but unfortunately they force you to disable any competing AI chat plugins, so you lose access to Claude etc. if you use them.