Hacker News

I see. How is this not going to get run over immediately by big players? Google's diffusion model is already in the wings, and it's both wicked fast and ~flash-lite intelligent.





You could make that argument about any startup, really. To me it's the same reason they don't build the foundational model for legal, for sales, etc. Everything comes at a cost: allocating researcher time to this is attention not spent on the general frontier model, and losing 1-2% there is a difference of billions of dollars for them.

You can run more intelligent traditional LLMs at higher speeds than the Google diffusion model. Even then, it runs nowhere near 4500 tok/s, and such small models generally suck in terms of accuracy compared to a specialized, fine-tuned one.

Conventional tools are also a threat. Nothing about this problem is AI-specific.

It does require a level of contextual awareness, fuzziness and robustness against crazy inputs that in my mind would be very hard to achieve using classical approaches.

Context awareness, yep. Response to fuzzy inputs... I dunno, why does it need that?

The thing I think is really silly is that it tries to make incremental writes to a flat file really fast, which is an impossible goal. As the file gets bigger, your writes just get slower and slower, with cost growing linearly in the size of the file.


LLMs will generate unpredictable, very humanlike code edits in this form. They might use comments like "same function as above", "rest of the function with similar changes", "function ABC is no longer needed", "function ABC same as above", etc. Your code edit model must resolve all of these, or flag an error in case of too much ambiguity. I would think a classical algorithm would have a lot of trouble differentiating between, for example, "function replaced with a comment because the LLM wanted to remove it" and "function replaced with a comment because the LLM wanted to keep it the same".
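As a toy sketch of why this is hard (purely illustrative, not any product's actual logic): even a simple heuristic classifier for these placeholder comments has to admit an "ambiguous" outcome, because some markers carry no keep-vs-delete signal at all.

```python
import re

# Illustrative patterns only; a real system would need far more coverage.
KEEP_PATTERNS = [
    r"rest of (the )?(function|file|code) (is )?(unchanged|the same)",
    r"same as (above|before)",
]
DELETE_PATTERNS = [
    r"no longer needed",
    r"removed?",
]

def classify_placeholder(comment: str) -> str:
    """Return 'keep', 'delete', or 'ambiguous' for a lazy-edit comment."""
    text = comment.lower()
    keep = any(re.search(p, text) for p in KEEP_PATTERNS)
    delete = any(re.search(p, text) for p in DELETE_PATTERNS)
    if keep and not delete:
        return "keep"
    if delete and not keep:
        return "delete"
    return "ambiguous"  # e.g. a bare "# function ABC" gives no signal
```

A comment like "# function ABC" lands in the ambiguous bucket, which is exactly the case where a classical algorithm has no principled way to choose and a model (or an error flag) is needed.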

The other half of your comment is true, but we typically have a ceiling on the reasonable size of a code file. For decades the conventional wisdom has been to refactor files beyond some threshold LoC (be it 200, 500, or whatever). If edits at that size are sufficiently fast, you can parallelize such operations and guarantee a maximum edit time regardless of the change size.


I agree that sorting something that messy can't easily be done with a heuristic expert system, but I'm registering my concern that the whole approach is predicated on first making a huge mess with one LLM then cleaning up the mess with another.

The classical approach is more like "change the definition of the problem until you don't need to make a big mess in the first place."


This might be a controversial take, but this approach is just plain old engineering applied to LLMs.

Instead of making an LLM perform both an accurate code edit AND follow a strict output schema, you split that into two problems: an accurate code edit with a lax output schema, then application of that lax edit to the original file. You can then use different models for the two tasks, reducing your overall probability of failure.
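A minimal sketch of the second stage, assuming the first model emits a (search, replace) pair under a lax schema (the function name and fallback strategy here are hypothetical, not anyone's real implementation):

```python
def apply_edit(original: str, search: str, replace: str) -> str:
    """Anchor a lax (search, replace) edit in the original file.

    Hypothetical helper for illustration: try an exact match first,
    then fall back to whitespace-insensitive line matching, since
    LLMs often mangle indentation in the search block.
    """
    if search in original:
        return original.replace(search, replace, 1)

    # Whitespace-insensitive fallback: compare stripped lines.
    norm = lambda s: "\n".join(line.strip() for line in s.strip().splitlines())
    target = norm(search)
    lines = original.splitlines()
    n = len(search.strip().splitlines())
    for i in range(len(lines) - n + 1):
        if norm("\n".join(lines[i:i + n])) == target:
            return "\n".join(lines[:i] + replace.splitlines() + lines[i + n:])

    raise ValueError("edit could not be anchored; flag for retry")
```

The point of the split is visible in the failure mode: when the edit can't be anchored, you get a cheap, detectable error you can retry, instead of a silently corrupted file.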


Yeah, it's just that I'm building things to last the next 50 years. Of course you can build a skyscraper on a foundation of wet noodles and it would still be (impressive) engineering, but the structure would not stand the test of time.

So yeah, it's a clever approach, and useful even, but there's no way I can see that such a noodly hack could become the bedrock for all other infrastructure.


Precisely. We work well on files up to 2k lines. It's hard to wrap your head around at first, but code merge has hundreds of edge cases to deal with. It's the perfect application for a model.

Google is a great tech organization, but they generally don't create dominant tech products like they used to back in the Maps / Mail days (nearly two decades ago now).

Google wrote "Attention Is All You Need." OpenAI wrote ChatGPT.


factual




