Sounds interesting. Do you have documentation on how you built the whole system?

namanyayg · 2025-01-05T16:05:17 1736093117

I'll write something up, what are you curious about exactly?

JTyQZSnP3cQGa8B · 2025-01-05T16:14:04 1736093644

> Do you have documentation on how you built the whole system

Or any actual "proof" (i.e. source code) that your method is useful? I have seen a hundred articles like this one and, surprise!, no one ever posts source code that would confirm the results.

namanyayg · 2025-01-05T16:15:09 1736093709

I have been trying to figure out how to publish evals or benchmarks for this.

But where can I get high-quality data on codebases, prompts, and expected results? How do I benchmark one codebase's output against another's?

Would love any tips from the HN community.

JTyQZSnP3cQGa8B · 2025-01-05T16:31:49 1736094709

That's the problem with people who use AI. You think too much and fail to deliver. I'm not asking for benchmarks or complicated stuff, I want source code, actual proof that I can diff myself. Also, that's why SWE is doomed because of AI, but that's another story.

techn00 · 2025-01-05T16:33:53 1736094833

The implementations of getFileContext() and shouldStartNewGroup().