
Very interested in the expansion of RL for transformers, but I can't quite tell what this project is.

Could you please add links to the documentation in the README, where it states "It includes detailed documentation"?

Also, maybe DPO should use the DDPG acronym instead, so your repo's Deterministic Policy Optimization isn't confused with trl's Direct Preference Optimization.


