Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Oh it's the KL term sorry - beta * KL ie they set beta to 0.

The goal of it was to "force" the model not to stray to far away from the original checkpoint, but it can hinder the model from learning new things



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: