meow_mix on March 14, 2023 | on: GPT-4
Reinforcement learning with human feedback. What you guys are describing is the alignment problem.
mistymountains on March 14, 2023
That’s just a supervised fine-tuning method to skew outputs favorably. I’m actually working with it on biologics modeling, using laboratory feedback. The underlying inference structure is not changed.
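To make the "inference structure is not changed" point concrete, here's a rough toy sketch. It is not how GPT-4 or any real model is tuned: it swaps in a tiny softmax over three candidate outputs and an exact policy-gradient step on expected reward. The names `infer` and `feedback_step` and the reward values are made up for illustration. The feedback only moves the parameters (the logits); the inference function is the same code before and after.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def infer(logits):
    """Inference path: parameters -> output probabilities.

    Tuning never touches this function, only the logits passed in.
    """
    return softmax(logits)


def feedback_step(logits, rewards, lr=0.5):
    """One gradient-ascent step on expected reward E[r] = sum_i p_i * r_i.

    For a softmax policy, dE[r]/dlogit_i = p_i * (r_i - E[r]).
    """
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, rewards))
    return [l + lr * p * (r - expected)
            for l, p, r in zip(logits, probs, rewards)]


# Hypothetical feedback: output 0 is preferred, output 2 is disliked.
rewards = [1.0, 0.0, -1.0]
logits = [0.0, 0.0, 0.0]

before = infer(logits)
for _ in range(50):
    logits = feedback_step(logits, rewards)
after = infer(logits)

print("before:", before)
print("after: ", after)
```

The output distribution gets skewed toward the rewarded option, but `infer` itself is untouched, which is roughly the distinction being drawn above.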