Yeah, they should! Not that the missile then makes a 180° turn to "return to sen...

culi · 2025-01-27T18:28:12 1738002492

The code is open source.

jprete · 2025-01-27T18:32:32 1738002752

There's no meaningful inspection of LLM code, because the real code is the model weights.

mschoening · 2025-01-27T18:32:28 1738002748

See Sleeper Agents (https://arxiv.org/abs/2401.05566).

cosmojg · 2025-01-27T20:17:29 1738009049

Who in their right mind is going to blindly take the code output by a large language model and toss it on a cruise missile? Sleeper agents are trivially circumvented by even a modicum of human oversight.

carimura · 2025-01-27T18:32:21 1738002741

But what about the training data?

culi · 2025-01-28T07:23:37 1738049017

The weights and data pipeline are open-sourced and described explicitly in the paper they published. The non-reasoning data isn't nearly as interesting as the reasoning data, though.