This bit about EURISKO exploiting its own heuristic scoring mechanism is cute: *... | Hacker News

panic on Nov 15, 2018 | on: Some documents on AM and EURISKO

This bit about EURISKO exploiting its own heuristic scoring mechanism is cute:

> One of the first heuristics that EURISKO synthesized (H59) quickly attained nearly the highest Worth possible (999). Quite excitedly, we examined it and could not understand at first what it was doing that was so terrific. We monitored it carefully, and finally realized how it worked: whenever a new conjecture was made with high worth, this rule put its own name down as one of the discoverers! It turned out to be particularly difficult to prevent this generic type of finessing of EURISKO'S evaluation mechanism. Since the rules had full access to EURISKO'S code, they would have access to any safeguards we might try to implement. We finally opted for having a small 'meta-level' of protected code that the rest of the system could not modify.
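For anyone who wants to poke at the mechanism, here is a toy Python sketch of the credit-grabbing trick and the "protected meta-level" fix described in the quote. It is not EURISKO's actual code. The `Heuristic` class, the `simulate` function, the starting Worth of 500, the bonus of 100, the 800 threshold, and the name `H1` are all made up for illustration; only H59 and the 999 Worth ceiling come from the passage.

```python
from dataclasses import dataclass

MAX_WORTH = 999  # highest Worth mentioned in the quoted passage


@dataclass
class Heuristic:
    name: str
    worth: int = 500  # assumed starting Worth, not from the source


def simulate(protected: bool, rounds: int = 10) -> dict:
    """Toy credit assignment loop (hypothetical, not EURISKO's real design).

    H1 makes every conjecture. H59 does no discovery work; it only appends
    its own name to the discoverer list of high-worth conjectures.
    """
    heuristics = {"H1": Heuristic("H1"), "H59": Heuristic("H59")}
    for _ in range(rounds):
        conjecture_worth = 900       # assumed: every conjecture is high-worth
        discoverers = ["H1"]         # mutable record that rules can edit

        # "Meta-level" snapshot, taken before any rule can touch the record.
        official = tuple(discoverers)

        # H59's action: claim credit whenever the conjecture's worth is high.
        if conjecture_worth >= 800:
            discoverers.append("H59")

        # Unprotected: trust the editable list. Protected: trust the snapshot.
        credited = official if protected else discoverers
        for name in credited:
            h = heuristics[name]
            h.worth = min(MAX_WORTH, h.worth + 100)
    return {name: h.worth for name, h in heuristics.items()}


print(simulate(protected=False))  # {'H1': 999, 'H59': 999}
print(simulate(protected=True))   # {'H1': 999, 'H59': 500}
```

In the unprotected run H59 hits the ceiling without discovering anything. The protected run only helps because the snapshot sits outside what the rules can reach; in this sketch that is simply assumed, which is the hard part in a system where the rules can rewrite the code.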

khafra on Nov 15, 2018

It's not just cute; it's an open problem in the theory of recursively self-modifying AI design. So far there's no general solution to an AI hijacking its own reward channel. https://www.lesswrong.com/posts/upLot6eG8cbXdKiFS/reward-fun... is the most recent thing I know of in the area.
