Why didn't the authors compare with Phi-2?

hcarlens · on Jan 17, 2024

Agreed. Not only do they not compare their model to Phi-2 directly, but the benchmarks they report don't overlap with the ones in the Phi-2 post[1], making it hard for a third party to compare the two without running the benchmarks themselves.

(In turn, the Phi-2 post compares Phi-2 to Llama-2 instead of CodeLlama, which makes the comparison even harder.)

[1]: https://www.microsoft.com/en-us/research/blog/phi-2-the-surp...