
> How do you prove in ~O(1) time that someone did some operation with their GPU? You don't.

The proof in this case could be that the weights after the work was done have lower loss than the input weights. Applying the new weights to the input data to check that the loss is lower is much cheaper than computing the weight update itself, which is the same asymmetry proof of work relies on (not sure the magnitude of that gap is large enough to actually replace proof of work, though).
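A rough sketch of that check (toy linear model in PyTorch; the names and shapes are mine, not anything from the project): the verifier runs one forward pass per weight snapshot on a spot-check batch, while producing the update took many forward+backward passes.

    import torch
    import torch.nn as nn

    def loss_on_batch(weights, x, y):
        model = nn.Linear(16, 1)
        model.load_state_dict(weights)
        with torch.no_grad():  # the verifier only needs a forward pass
            return nn.functional.mse_loss(model(x), y).item()

    def verify_update(old_weights, new_weights, x, y):
        # accept the claimed work only if the new weights score lower loss
        return loss_on_batch(new_weights, x, y) < loss_on_batch(old_weights, x, y)

    # usage: two weight snapshots plus a held-out batch
    x, y = torch.randn(64, 16), torch.randn(64, 1)
    old = nn.Linear(16, 1).state_dict()
    new = nn.Linear(16, 1).state_dict()  # stand-in for the weights after the "work"
    print(verify_update(old, new, x, y))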




Trying again, apologies:

- Minimizing loss could be a useful heuristic on a base model. Here we expect the distribution to be different, since we are only doing RL. Measuring loss against the base model's training data measures the wrong thing: matching that data is a non-goal, and we expect reasoning after RL training to look quite different from a web scrape.

Let's set that aside. Let's say lower loss = model improved.

- Checking the loss requires a forward pass over the entire dataset used to train the base model. That's O(N·d), where N is the number of samples and d is the model size. This takes us from "cool demo: RL can be done on the edge, with little benefit" to "we're constantly shipping terabytes of data among clients".

- Proof of work as a technical term is different from proof of work as a colloquial term: the former is a cryptographic puzzle whose solution is universally and instantly checkable (see the hash sketch after this list), while the latter just means "I can show I did something," with no strict guarantee of correctness or uniqueness. Randomly perturbing one parameter could "prove work" without the work we actually wanted being done.

- Early in base model training, shaving 0.01 off the loss is easy; later it's nearly impossible. In an RL environment we expect some updates to go badly and push the loss up -- that is part of how it learns. Under the interpretation "loss decrease means the model is better means you did work", those honest updates would look like no work was done, even though it was.
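To make the cryptographic sense concrete, here is a toy hash-based proof of work (my own illustration, nothing from the project): the verifier re-hashes exactly once no matter how many nonces the prover ground through; a loss check has no comparable one-shot certificate.

    import hashlib

    DIFFICULTY = 16  # required leading zero bits (toy value)

    def pow_hash(block: bytes, nonce: int) -> int:
        digest = hashlib.sha256(block + nonce.to_bytes(8, "big")).digest()
        return int.from_bytes(digest, "big")

    def prove(block: bytes) -> int:
        # the prover grinds nonces: ~2**DIFFICULTY hashes on average
        nonce = 0
        while pow_hash(block, nonce) >> (256 - DIFFICULTY) != 0:
            nonce += 1
        return nonce

    def verify(block: bytes, nonce: int) -> bool:
        # the verifier does a single hash, regardless of the prover's effort
        return pow_hash(block, nonce) >> (256 - DIFFICULTY) == 0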


That's far from O(1). Now you need to transfer the weights back and test them.


> That's far from O(1). Now you need to transfer the weights back and test them.

I think what matters most is that the verification is much, much cheaper than the calculation itself; it doesn't explicitly have to be O(1). The cost gap just has to exceed a certain threshold for proof of work to be viable.
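Back-of-envelope sketch of that threshold (the model size and token counts are numbers I made up for illustration): the prover pays forward+backward over the training tokens, the verifier pays forward passes only over a small held-out set.

    forward_flops_per_token = 2 * 7e9  # ~2*params, assuming a 7B model
    train_tokens = 1e9                 # tokens the prover trained on
    verify_tokens = 1e6                # held-out tokens the verifier checks

    train_cost = 3 * forward_flops_per_token * train_tokens  # fwd + bwd ~= 3x fwd
    verify_cost = forward_flops_per_token * verify_tokens    # fwd only

    print(f"prover/verifier cost ratio: {train_cost / verify_cost:.0f}x")  # ~3000x

Whether a gap of that rough size is enough is exactly the open question.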



