
Happy to see this type of work that is truly open source and commercially usable. Is this the entire corpus or a subset? Do you intend to release any new iterations?

I've been thinking of starting similar efforts at another BigCorp by hosting a UL2 or GPT-J instance.



15k is the entire corpus we have right now. Hopefully others can join up in releasing additional samples that can be merged in over time.

We'll definitely keep iterating on Dolly and releasing everything openly.



