Happy to see this type of work that is truly open source and commercially usable. Is this the entire corpus or a subset? Do you intend to release any new iterations?
I've been thinking of starting similar efforts at another BigCorp by hosting a UL2 or GPT-J instance.