> Large corps likely have some huge codebases of overengineered spaghetti code, which are hardly comprehensible by LLMs.
I think it's the other way around, though.
Those code monstrosities aren't comprehensible by humans, especially after the wanton RIFs of the past couple years that have cut loose a lot of the people who knew where the bodies are buried.
However, with Copilot you can figuratively walk up to any repo, ask "@workspace what's going on in this codebase", and it'll tell you. From experience I can say this delivers results. Downright rotten code that would've taken me a good week to untangle can be figured out in an hour. It's damn near witchcraft.
I have never seen this work, at all, with a large codebase. Can you give a specific example of a large repo where this produced useful/coherent results?
I'd be curious as well, but I really doubt AI could make much headway into a legacy codebase I recently spent some time on. There's tacit knowledge that AI can't pick up on: many things are named wrong, there are patches and workarounds that barely make any sense, and there are architectural inconsistencies and transitions in style from 20 years ago to now. Maybe if there were some documentation and explanatory notes. Since it can't fill those gaps by simply asking the prompter for more information, I'm inclined to suspect the same old eagerness to bullshit that LLMs have been programmed with.
I have seen multiple people claim that {Copilot|Claude|a local Llama model} has been great at helping them understand large codebases, but to date, not one has provided a concrete example when I ask. Maybe others have a different idea of what constitutes "large."
At my job, our main repo is over 300k lines of just Ruby code, plus a bunch of JS, ERB templates, and other stuff. Every AI tool I've thrown at it is great at making surgical edits to single files (or small groups of files) but completely chokes if you ask it a question that requires it to understand context across the repo. I'm always hoping that I'm just using the tools wrong, but so far that doesn't seem to be the case.