Mixtral > Codellama > DeepSeek Coder. DeepSeek Coder is a very weird model: it writes super long comments on a single line and is definitely not at the level of Codellama, benchmarks be damned.
tangent: I often have a hard time disambiguating the ">" in comparisons like yours:
(A) greater than (i.e., Mixtral superior to DeepSeek, with Codellama in between)
vs.
(B) arrow/sequence (i.e., start with Mixtral, progress to Codellama, and finally land on DeepSeek as the culmination).
I'd love to hear of a less ambiguous way to represent these.
Not OP, but these evaluations should be taken with a huge grain of salt. It's almost impossible to rule out leakage of open benchmarks such as HumanEval into the training data.
It took me just as long to set up llama.cpp as it did to get other tools working well (ollama or other frontends that abstract away the actual config).
Either way it's the same loop: read the HOWTO, attempt to recreate the described state. So I prefer sticking with the low-level tool, where I also learn a bit more about the internals.
C/C++ user-friendliness has come as far as that of all the other languages and their ecosystems. Really, the only reason to “fear” it is the memes telling you to. It's not a gun.
So I'd suggest just compiling llama.cpp and installing huggingface-cli to download models in GGUF format; that's all ollama is doing anyway, just with more dependencies and a much more opaque outcome.
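For reference, the whole flow is roughly this. The repo URL is upstream llama.cpp; the quantized model file is just an example, substitute whichever GGUF you want; and on recent builds the binary is named llama-cli rather than main:

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp && make
    pip install -U "huggingface_hub[cli]"
    huggingface-cli download TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF \
        mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf --local-dir models
    ./main -m models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf -p "Hello" -n 128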
That's what I've been playing with. I can load 9 layers of a Mixtral descendant into the 12 GB of VRAM for the GPU and the rest into ~28 GB of RAM for the CPU to work on. It chugs the system sometimes, but the models are interestingly capable.
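In llama.cpp that split is just the -ngl (--n-gpu-layers) flag; whatever isn't offloaded stays in system RAM for the CPU. A sketch with my numbers (the model path is an example):

    # offload 9 layers to the GPU, keep the remaining layers on the CPU
    ./main -m models/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf \
        -ngl 9 -c 4096 -p "your prompt here"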
I'm using Mixtral, but rather than shell out for a gaming laptop with an expensive GPU, I simply run it via the Together.ai API, which works out a lot cheaper. There are a few similar services out there.
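Their API is OpenAI-compatible, so it's a plain curl call. Sketch below; the endpoint and model name are what I remember from their docs, so double-check the current ones:

    curl https://api.together.xyz/v1/chat/completions \
        -H "Authorization: Bearer $TOGETHER_API_KEY" \
        -H "Content-Type: application/json" \
        -d '{
            "model": "mistralai/Mixtral-8x7B-Instruct-v0.1",
            "messages": [{"role": "user", "content": "Hello"}]
        }'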
Had zero experience, too. Turns out ollama does everything, literally. You just tell it to run a model and wait a bit for it to download. One (1) shell command total.
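Literally (mixtral here is the tag from the ollama model library; swap in whatever model you want):

    ollama run mixtral

It downloads the model on first run and then drops you into an interactive prompt.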