> While I haven't tested it extensively, 70B model is supposed to rival Chat GPT 3.5 in most areas, and there are now some new fine-tuned versions that excel at specific tasks
That has been my experience. Having experimented with both (informally), Llama 2 is similar to GPT-3.5 for a lot of general comprehension questions.
GPT-4 is still the best amongst the closed-source, cutting edge models in terms of general conversation/reasoning, although 2 things:
1. The guardrails that OpenAI has placed on ChatGPT are too aggressive! They clamped down on it quite hard to the extent that it gets in the way of a reasonable query far too often.
2. I've gotten pretty good results with smaller models trained on specific datasets. GPT-4 is still on top in terms of general purpose conversation, but for specific tasks, you don't necessarily need it. I'd also add that for a lot of use cases, context size matters more.
To your first point, I was trying use ChatGPT to generate some examples of negative interactions with customer service to show sentiment analysis in action for a project I was working on.
I had to do all types of workarounds for it to generate something useful without running into the guardrails.
> I apologize, but I don't understand what you mean by "fika nu." Could you please provide more context or clarify your question so I can better assist you?
That has been my experience. Having experimented with both (informally), Llama 2 is similar to GPT-3.5 for a lot of general comprehension questions.
GPT-4 is still the best amongst the closed-source, cutting edge models in terms of general conversation/reasoning, although 2 things:
1. The guardrails that OpenAI has placed on ChatGPT are too aggressive! They clamped down on it quite hard to the extent that it gets in the way of a reasonable query far too often.
2. I've gotten pretty good results with smaller models trained on specific datasets. GPT-4 is still on top in terms of general purpose conversation, but for specific tasks, you don't necessarily need it. I'd also add that for a lot of use cases, context size matters more.