
"That’s dynamically decided during training and not set before, right?"

^ Right. I can't recall which one off the top of my head, but a recent paper showed that if you try dictating this sort of thing, performance falls off a cliff (I presume there's some layer of base knowledge $X that each expert needs).


