I personally treat the LLM as a very junior programmer. It's willing to work and will take instructions, but its knowledge of the codebase and the patterns we use is sorely lacking. So it needs a LOT of handholding: very clear instructions, descriptions of potential pitfalls, smaller, scoped tasks, and careful review to catch any straying off pattern.
Also, I make it work the same way I do: I first come up with the data model until it "works" in my head, before writing any "code" to deal with it. Again, clear instructions.
Oh, another thing: one of my "golden rules" is that it needs to keep a block comment at the top of the file describing what's going on in that file. It acts as a second "prompt" when I restart a session.
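Just to illustrate, the kind of header I mean (file and details entirely made up) is something like:

    /*
     * InventorySync.cs -- keeps the local item cache in sync with the server.
     * Data model: items are keyed by (storeId, sku); the cache is authoritative
     * until a sync succeeds. Conventions: no allocations in the hot path, all
     * network calls go through RequestQueue.
     * Keep this comment up to date -- it doubles as the prompt for the next session.
     */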
It works pretty well. It doesn't feel as "magic" as the "make it so!" approach people think they can get away with, but it works for me.
But yes, I still spend maybe 30% of the time cleaning up, renaming stuff and doing more general rework of the code before it becomes "presentable", but it still lets me work pretty quickly, a lot quicker than if I were to do it all by hand.
I think "junior programmer" (or "copilot") oversells the AI in some cases and undersells it in others. It does forget things that a regular person wouldn't, and it does very basic coding mistakes sometimes. At the same time it's better than me at some things (getting off-by-one errors when dealing with algorithms that work on arrays). It also has encyclopedic knowledge about basically anything out there on the internet. Red-black Trees? Sure thing. ECS systems for game programming? No problemo, here are the most used libraries.
I have ended up thinking about it as a "hunting dog". It can do some things better than me. It can get into tiny crevices and bushes. It doesn't mind getting wet or dirty. It will smell the prey better than me.
But I should make the kill. And I should be leading the hunt, not the other way around.
The difference between an LLM and a very junior programmer: the junior programmer will learn and change; the LLM won't! The more instructions you put in the prompt, the more will be forgotten and the more it will bounce back to the "general world-wide average". And on the next prompt you must start all over again... Not so with junior programmers...
This is the only thing that makes junior programmers worthwhile. Any task will take longer and probably be more work for me if I give it to a junior programmer vs just doing it myself. The reason I give tasks to junior programmers is so that they eventually become less junior, and can actually be useful.
Having a junior programmer assistant who never gets better sounds like hell.
The tech might get better eventually; it has gotten better rapidly to this point, and everyone working on the models is aware of these problems. Strong incentive to figure something out.
Ahaha, you likely haven't seen as many junior programmers as I have then! </jk>
But I agree completely, some juniors are a pleasure to see bloom; it's nice when one day you see their eyes shine: "wow, this is so cool, I never realized you made that like THAT for THAT reason" :-)
The other big difference is that you can spin up an LLM instantly. You can scale up your use of LLMs far more quickly and conveniently than you can hire junior devs. What used to be an occasional annoyance risks becoming a widespread rot.
My guess is that you're letting the context get polluted with all the stuff it's reading in your repo. Try using subagents to keep the top level context clean. It only starts to forget rules (mostly) when the context is too full of other stuff and the amount taken up by the rules is small.
Definitely. To be honest I don't think LLMs are any different from googling and copying code off the Internet. It's still up to the developer to take the code, go over it, make sure it's doing what it's supposed to be doing (and only that), etc.
As for the last part, I've recently been getting close to 50 and my eyes aren't what they used to be. In order to fight off eye strain I now have to tightly ration whatever I do into 20-minute blocks, before having to take appropriate breaks, etc.
As a result, time has become one of the biggest factors for me. An LLM can output code 1000x faster than a human, so if I can wrangle it somehow to do whatever basics for me then it's a huge bonus. At the moment I'm busy generating appropriate struct-of-arrays for SIMD from input AoS structs, and I'm using Unity C# with LINQ to output the text (I need it to be editable by anyone, so I didn't want to go down the Roslyn or T4 route).
The queries are relatively simple: take the list of data elements and select the correct entries, then take whatever fields and construct strings with them. Even so, copying/editing them takes a lot longer than me telling GPT to select this, exclude that and make the string look like ABC.
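To give a rough idea of what I mean, here is a minimal sketch of that kind of query (all names made up, nothing like the real generator):

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class SoaGen
    {
        // One entry per field of the input AoS struct (hypothetical descriptor).
        class Field
        {
            public string Type, Name;
            public bool SimdFriendly;
            public Field(string type, string name, bool simd) { Type = type; Name = name; SimdFriendly = simd; }
        }

        // Select the correct entries, then build the SoA struct text from them.
        static string EmitSoa(string structName, IEnumerable<Field> fields)
        {
            var arrays = fields
                .Where(f => f.SimdFriendly)
                .Select(f => $"    public {f.Type}[] {f.Name};");
            return $"public struct {structName}Soa\n{{\n{string.Join("\n", arrays)}\n}}";
        }

        static void Main()
        {
            var fields = new List<Field>
            {
                new Field("float", "PosX", true),
                new Field("float", "PosY", true),
                new Field("string", "DebugName", false), // excluded from the SIMD layout
            };
            Console.WriteLine(EmitSoa("Particle", fields));
        }
    }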
I think there was a post yesterday about AIs as HUDs, and that makes a lot of sense to me. We don't need an all-powerful model that can write the whole program; what we need is a super-powered assistant that can write and refactor on a very small and local scale.
I personally see the LLM as a (considerably better) alternative to StackOverflow. I ask it questions, and it immediately has answers for my exact questions. Most often I then write my own code based on the answer. Sometimes I have the LLM generate functions that I can use in my code, but I always make sure to fully understand how it works before copy-pasting it into my codebase.
But sometimes I wonder if pushing a 400,000+ line PR to an open-source project in a programming language that I don't understand is more beneficial to my career than being honest and quality-driven. In the same way that YoE takes precedence over actual skill in hiring at most companies.
Unlike Stack Overflow, if it doesn’t know the answer it’ll just confidently spit out some nonsense, and you might fall for it or waste a lot of time figuring out that it’s clueless.
You might get the same on Stack Overflow too, but more likely, I’ve found, you get either no response at all or someone pretty competent actually coming out of the woodwork.
I find success by basically limiting it to the literal coding but not the thinking - chop tasks down to specific, constrained changes; write detailed specs including what files should be changed, how I want it to write the code, specific examples of other places to emulate, and so on. It doesn’t have to be insanely granular, but the more breadcrumbs, the higher the chance it’ll work; you find a balance. And whatever it produces, I git add -p one by one to make sure each chunk makes sense.
More work up front and some work after, but still saves time and brain power vs doing it all myself or letting it vibe out some garbage.
To a certain extent you are probably still not using it optimally if you are still doing that much work to clean it up. We, for example, asked the LLM to analyze the codebase for the common patterns we use and to write a document for AI agents to do better work on the codebase. I edited it and had it take a couple of passes. We then provide that doc as part of the requirements we feed to it. That made a big difference. We wrote specific instructions on how to structure tests, where to find common utilities, etc. We wrote pre-commit hooks to help double check its work. Every time we see something it’s doing that it shouldn’t, it goes in the instructions. Now it mostly does 85-90% quality work. Yes it requires human review and some small changes. Not sure how the thing works that it built? Before reviewing the code, have it draw a Mermaid sequence diagram.
We found it mostly starts to abandon instructions when the context gets too polluted. Subagents really help address that by not loading the top context with the content of all your files.
Another tip: give it feedback as PR comments and have it read them with the gh CLI. A lot of the time this is faster than hand-editing the code yourself. While it cleans up its own work you can be doing something else.
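For anyone curious, the kind of thing I mean is roughly this (PR number made up):

    # conversation comments on the PR
    gh pr view 123 --comments
    # inline review comments, via the GitHub API
    gh api repos/{owner}/{repo}/pulls/123/comments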
Interesting, I actually do have a coding-guidelines.md file for that purpose, but I hadn't thought of having the LLM either generate it, or maintain it; good idea! :-)
I agree. It brings me to the question, though: how do you deal with team members who are less experienced and use LLMs? Code review then needs much more work to teach these principles. And most of the time people won't bother to do that and will just rubber-stamp the working solution.
In my experience, this is a problem even without LLMs; many times you cannot just tell coworkers (junior or not) to completely trash their patch and do it again (even using nicer words).
Very often it comes down to HR issues in the end, so you end up having to take that code anyway, and either sneakily revert it or secretly rework it...
> Also, I make it work the same way I do: I first come up with the data model until it "works" in my head, before writing any "code" to deal with it. Again, clear instructions.
Also, make sure it auto-pushes somewhere else. I use aider a lot, and I have a regular task that backs everything up at regular intervals, just to make sure the LLM doesn't decide to rm -rf .git :-)
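A minimal version of that kind of backup task (path, schedule and remote name all made up) can be as simple as a cron entry:

    # every 15 minutes, push all branches to a second remote named "backup";
    # --all only adds/updates refs there, it never deletes anything
    */15 * * * * cd /path/to/repo && git push --all --quiet backup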
Have you tried fluorocarbon? It has replaced nylon for many fishing uses (fly fishing in any case). It has a different refractive index -- not sure if it would be closer to or further from the resin, but at least it is different! :-)
Never tried, because fluorocarbon's index of refraction (1.42-ish) is further from epoxy resin (1.50-1.57) than nylon (1.53). It does make sense that fluorocarbon has replaced nylon because it's going to be less visible in water (1.33).
I guess the epoxy's index of refraction depends on all kinds of factors such as the mixing ratio and the conditions under which it cures.
Very nice article; it seems to mention all the modern bits that help make Makefiles so, SO much easier than in decades past...
The interesting bits are, for example, the -MMD flag to gcc, which outputs a .d file you can pull in with -include ${wildcard *.d}, and you get free, up-to-date dependencies for your headers, etc.
That and 'vpath' to tell it where to find the source files for % rules, and really, all the hard work is done: your 1/2-page Makefile will stay the same 'forever' and will still work in 20 years...
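For anyone who hasn't seen the trick, a minimal sketch of such a Makefile (program name and layout made up) looks roughly like this:

    # recipes must be indented with a real TAB, as always
    CC     := gcc
    CFLAGS += -MMD -O2            # -MMD makes gcc write foo.d next to foo.o
    vpath %.c src                 # where to find the sources for the % rule

    SRCS := $(notdir $(wildcard src/*.c))
    OBJS := $(SRCS:.c=.o)

    myprog: $(OBJS)
    	$(CC) -o $@ $^

    %.o: %.c
    	$(CC) $(CFLAGS) -c -o $@ $<

    # pull in the generated dependency files, if any exist yet
    -include $(wildcard *.d)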
I also used tinc in the past, and while it was very robust it was not as reliable/fast on roaming as WireGuard, BUT to be fair it may have been user error.
tinc can also automatically select routes through the mesh, and it supports layer 2 (tap) VPNs too. WireGuard is a long way from this feature set, unfortunately.
The hilarious bit is that this page will soon be scraped by AI bots as learning material, and they'll all learn to draw pelicans on bicycles using this as their primary example material, since these will be the only examples.
I am absolutely amazed at the amount of garbage being "logged", enough that it is not just a huge business, but also one of the primary tasks for some devops guys. It's like a goal in itself: you have a look at the output and it is absolutely scary, HUGE messages being "logged" for purposes unknown.
I've seen single traces over 100KB of absolute pure randomness encoded as base64... Because! Oh and also, we have to pay for the service, so it looks important.
Sure, they tell you it is super helpful for debugging issues, but in a VERY large proportion of cases it is 1) WAY too much, and 2) never used anyway. And most of the time what's interesting is the last 10 minutes of debug output; you don't need a "service" for that.
I think you're at least partially right - not everything, but a lot of the data is not useful - wasting money, bandwidth, electricity, etc. There should be more dynamic controls over what gets logged/filtered on the client side.
While I don't use UTC for my local devices, I use UTC for any logs on remote machines, servers, etc. It solves all kinds of problems of trying to remember, or having to care, where they are (I have machines in multiple time zones).
But yes, I sometimes wish everyone had one standard time; trying to book a meeting with Chinese people AND west coast people while I am on the GMT line is a nightmare, as it isn't even the same day anymore!