Hacker News

My surface-level reading of these two sections is that the 800k samples come from R1-Zero (i.e. "the above RL training") and V3:

>We curate reasoning prompts and generate reasoning trajectories by performing rejection sampling from the checkpoint from the above RL training. In the previous stage, we only included data that could be evaluated using rule-based rewards. However, in this stage, we expand the dataset by incorporating additional data, some of which use a generative reward model by feeding the ground-truth and model predictions into DeepSeek-V3 for judgment.
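The rejection-sampling step described above can be sketched roughly as: sample several trajectories per prompt from the RL checkpoint, then keep only those that pass a rule-based check against the ground truth. This is a minimal toy sketch, not DeepSeek's actual pipeline; the answer format, the `toy_generate` stand-in model, and all function names are assumptions for illustration.

```python
import re

def rule_based_reward(response: str, ground_truth: str) -> bool:
    # Hypothetical rule-based check: extract a \boxed{...} final answer
    # and compare it exactly against the ground truth.
    m = re.search(r"\\boxed\{([^}]*)\}", response)
    return m is not None and m.group(1).strip() == ground_truth

def rejection_sample(prompt: str, ground_truth: str, generate, k: int = 4):
    # Sample k trajectories from the model and keep only the ones
    # that pass the rule-based reward check.
    candidates = [generate(prompt) for _ in range(k)]
    return [c for c in candidates if rule_based_reward(c, ground_truth)]

def toy_generate(prompt: str, _state=[0]) -> str:
    # Deterministic stand-in for an RL-trained checkpoint: alternates
    # between a wrong and a correct final answer.
    _state[0] += 1
    if _state[0] % 2 == 0:
        return "Reasoning steps... \\boxed{42}"
    return "Reasoning steps... \\boxed{41}"

kept = rejection_sample("What is 6*7?", "42", toy_generate)
```

In the paper's description, prompts whose answers can't be checked by a rule would instead go to a generative reward model (DeepSeek-V3 judging the prediction against the ground truth), which this sketch omits.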

>For non-reasoning data, such as writing, factual QA, self-cognition, and translation, we adopt the DeepSeek-V3 pipeline and reuse portions of the SFT dataset of DeepSeek-V3. For certain non-reasoning tasks, we call DeepSeek-V3 to generate a potential chain-of-thought before answering the question by prompting.

The non-reasoning portion of the DeepSeek-V3 dataset is described as:

>For non-reasoning data, such as creative writing, role-play, and simple question answering, we utilize DeepSeek-V2.5 to generate responses and enlist human annotators to verify the accuracy and correctness of the data.

I think if we were to take them at their word on all this, it would imply there is no specific OpenAI data in their pipeline (other than perhaps their pretraining corpus containing some incidental ChatGPT outputs that are posted on the web). I guess it's unclear where they got the "reasoning prompts" and corresponding answers, so you could sneak in some OpenAI data there?



That's what I am gathering as well. Where is OpenAI going to have substantial proof to claim that their outputs were used?

You mean the reasoning prompts and answers used for SFT from V3? No idea. For that matter, you have no idea where OpenAI got its data from either. If they open this can of worms, theirs will be opened as well.


>Where is OpenAI going to have substantial proof to claim that their outputs were used?

I assume in their API logs.


Shibboleths in output data
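One way such a shibboleth check could work, in principle, is matching characteristic long substrings of logged API outputs against a candidate training corpus. This is a speculative sketch, not anything OpenAI has described; the phrase used, the n-gram length, and all names are illustrative assumptions.

```python
def contains_shibboleth(corpus: str, logged_outputs, n: int = 20) -> bool:
    # Slide an n-character window over each logged API output and
    # report whether any window appears verbatim in the corpus.
    for out in logged_outputs:
        for i in range(len(out) - n + 1):
            if out[i : i + n] in corpus:
                return True
    return False

# Hypothetical logged output containing a distinctive phrase.
logged = ["As an AI language model developed by OpenAI, I cannot help with that."]

dirty = "prompt: ...\nresponse: As an AI language model developed by OpenAI, I cannot comply."
clean = "a corpus made of entirely original prose with no overlapping phrasing"
```

A real detector would need to be far more robust (paraphrase, tokenization differences, common boilerplate), but verbatim long-substring overlap is the simplest signal API logs would give you.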



