lxe on April 17, 2023 | on: MiniGPT-4
Why Vicuna over something like oasst-pythia? Would love to see a table comparing all the new models side by side.
lhl on April 17, 2023
Fabrice Bellard has run a standard set of benchmarks w/ lm-eval on a big chunk of open models here:
https://bellard.org/ts_server/
- Flan T5 XXL and GPT-NeoX 20B both outperform Pythia 12B on average (LLaMA 13B+ tops the charts).
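(For anyone wanting to reproduce numbers like these themselves, here is a minimal sketch of scoring one model with EleutherAI's lm-evaluation-harness, the lm-eval tool mentioned above. The model and task choices are illustrative; they are not necessarily the exact set Bellard ran.)

    # Minimal sketch using EleutherAI's lm-evaluation-harness (pip install lm-eval).
    # Model and task names are illustrative, not Bellard's exact benchmark set.
    from lm_eval import evaluator

    results = evaluator.simple_evaluate(
        model="hf-causal",                        # Hugging Face causal-LM backend
        model_args="pretrained=EleutherAI/pythia-12b",
        tasks=["lambada_openai", "hellaswag", "arc_easy"],
        device="cuda:0",
    )
    for task, metrics in results["results"].items():
        print(task, metrics)                      # per-task accuracy / perplexity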
GaggiX on April 17, 2023
All Pythia models were trained on 300B tokens; the LLaMA models were trained on 1T/1.4T tokens (depending on model size).