This is completely irrelevant without knowing whether you are prompting each model effectively. Your workflow may simply suit one particular model and not the others, and tuning a workflow for each model is tedious. I seriously doubt there is ANY class of problem DSR1 can solve at this point that OAI's third-tier model (o4-mini) can't.