I've spent the last ~4 months figuring out how to make coding agents better, and it's really paid off. The configs at the link above make Claude Code significantly better, passively. It's a one-shot install, and it may just be able to one-shot your problem, because it does the hard work of 'knowing how to use the agents' for you. Would love to know if you try it out and have any feedback.
We build on Claude Code's support for skills by defining a bunch of useful, common skills that are necessary for writing software: for example, brainstorming, doing test-driven development, or submitting a git commit.
The specific skills you linked are interesting demos of what you can do with skills! But most of them are not useful for the day-to-day of building software.
I have, and I couldn't believe what it was saying and had to go see the code to verify. I'm really struggling to believe that anyone would consider this a "coding success".
Yeah, same. I thought it was saying it had reproduced only the background because it couldn't figure out an offset from a sloppy initial screenshot or something. Then I wondered why all the link images looked fuzzy, tried to inspect them, and also wondered why the links didn't line up with the buttons when dev tools were open.
On the plus side, it does somewhat explain the weird patterns in his diff image which I had been puzzling over.
> I'm really struggling to believe that anyone would consider this a "coding success".
The index_tiled.html version later in the article is what justifies the success claim IMO, and is the version I think it would've made more sense to host.
The currently hosted index.html just feels like a consequence of the author taking a scaled/compressed screenshot and asking Claude to produce an exact match.
I hate to tell you this but all digital representations of pi are numeric approximations. Your joke works, but perhaps not in the direction you were angling for.
First, you're being unnecessarily acerbic. It doesn't help your case, and it's just kinda weird!
Second, the original post was obviously about the placement of the buttons on the space jam website.
Third, I spend at least half the blog post responding to the exact complaint you have. If you do not have more to add beyond pointing out that the 'hack' exists, you aren't adding to the conversation.
Fourth, the blog post and the repo have a version that does not include the screenshot and actually tiles the gif.
I'm still convinced you haven't actually read the blog post because you have shown zero indication that you are engaging with the material. In which case, why even bother commenting?
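For anyone unclear on the difference between the two versions: tiling repeats a small image across the page, so it covers any viewport size, whereas a single screenshot background only matches the size it was captured at. Here is a rough sketch in Python of where the tile origins would land. The function name and the 64x64 tile size are made up for illustration; the real gif dimensions are not given in this thread.

```python
from typing import List, Tuple


def tile_origins(viewport_w: int, viewport_h: int,
                 tile_w: int, tile_h: int) -> List[Tuple[int, int]]:
    """Top-left corners of every tile needed to cover the viewport.

    Hypothetical helper: mimics what a repeating background does,
    starting at (0, 0) and stepping by the tile size in both directions.
    """
    if min(viewport_w, viewport_h, tile_w, tile_h) <= 0:
        raise ValueError("all dimensions must be positive")
    return [
        (x, y)
        for y in range(0, viewport_h, tile_h)
        for x in range(0, viewport_w, tile_w)
    ]


# Assumed 64x64 tile: a wider viewport simply gets more tiles.
narrow = tile_origins(100, 50, 64, 64)
wide = tile_origins(200, 50, 64, 64)
```

The point is only that the tile count grows with the viewport, which is why the tiled version cannot match one fixed screenshot pixel for pixel but still behaves sensibly at other window sizes.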
The original Space Jam website is fluid (it's 90s lingo for responsive).
It is also a still-relevant website because it is a living fossil of that era's way of doing web design.
A request to recreate it should include those aspects (era-relevant technical achievements such as fluid layouts) and faithfulness to the original implementation.
I'm not saying that Claude should know that out of the box (it would have been impressive if it did), but the prompt should have included those ideas.
A modern reconstruction in CSS3, in contrast to a faithful reproduction, should have mirrored what the original techniques accomplished, using modern tools. It would be useful as a showcase of how CSS3 evolved; it would have a purpose.
Do you understand why this is not passable? It has no value as a recreation.
Note that I didn't even tell it to use a pixel diff. Claude w/ Nori did that on its own by following the Nori TDD skill. I did very little, I'm actually very lazy :D
Laziness is one of the three virtues (of a good programmer), but I think Larry didn’t anticipate the current situation when he wrote it:
“The quality that makes you go to great effort to reduce overall energy expenditure. It makes you write labor-saving programs that other people will find useful and document what you wrote so you don't have to answer so many questions about it.”
Author here -- it took like 5 minutes of actual attention from me? I'm not sure why you are counting reading the blog post or setting up Playwright. I guess I did read the blog post, but I'm not sure that should count. And Claude set up Playwright, not me.
There were a few prompts that went into a single commit, so that doesn't quite make sense in this case. I posted the transcript, both in the original JSONL format and in Markdown.
"The plan is designed to ‘autoformalize’ the problem by using Test Driven Development (TDD). TDD is incredibly important for getting good outputs from a coding agent, because it helps solve the context rot problem. Specifically, if you can write a good test when the model is most ‘lucid’, it will have an easier time later on because it is just solving the test instead of ‘building a feature’ or whatever high dimensional ask you originally gave it.
From here, Nori chugged away for the better part of half an hour in yolo mode while I went to do other things. And eventually I got a little pop up notification saying that it was done. It had written a playwright test that would open an html file, screenshot it, diff it with the original screenshot, and output the final result...
After trying a few ways to get the stars to line up perfectly, it just gave up and copied the screenshot in as the background image, then overlaid the rest of the HTML elements on top.
I’m tempted to give this a pass for a few reasons.
This obviously covers the original use case that tripped up Jonah.
It also is basically exactly what I asked the model to do — that is, give me a pixel perfect representation — so it’s kind of my fault that I was not clearer.
I’m not sure the model actually can get to pixel perfect any other way. The screengrab has artifacts. After all, I basically just used the default linux screenshot selection tool to get the original output, without even paying much attention to the width of the image.
If you ask the model to loosen the requirements for the exact screengrab, it does the right thing, but the pixel alignment is slightly off. The model included this as index_tiled.html in the repo, and you can see the pixel diff in one of the output images..."
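To make the "screenshot it, diff it with the original" step in that excerpt concrete, here is a minimal sketch of the comparison in Python. This is not the Playwright test from the repo. It assumes both screenshots have already been decoded into rows of RGB tuples, and the function name and `tolerance` parameter are invented for illustration.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


def pixel_diff_ratio(expected: Image, actual: Image, tolerance: int = 0) -> float:
    """Fraction of pixels where any channel differs by more than `tolerance`.

    Hypothetical helper: 0.0 means the images match within tolerance,
    1.0 means every pixel differs.
    """
    same_shape = len(expected) == len(actual) and all(
        len(row_e) == len(row_a) for row_e, row_a in zip(expected, actual)
    )
    if not same_shape:
        raise ValueError("images must have identical dimensions")

    total = sum(len(row) for row in expected)
    if total == 0:
        return 0.0

    mismatched = 0
    for row_e, row_a in zip(expected, actual):
        for px_e, px_a in zip(row_e, row_a):
            # A pixel counts as mismatched if any one channel is off.
            if any(abs(c_e - c_a) > tolerance for c_e, c_a in zip(px_e, px_a)):
                mismatched += 1
    return mismatched / total


BLACK, WHITE = (0, 0, 0), (255, 255, 255)
reference = [[BLACK, BLACK], [BLACK, BLACK]]
candidate = [[BLACK, WHITE], [BLACK, BLACK]]
ratio = pixel_diff_ratio(reference, candidate)  # one of four pixels differs
```

With `tolerance=0` the check demands an exact match, which is the kind of target that using the screenshot itself as the background satisfies trivially. A small nonzero tolerance corresponds roughly to "loosening the requirements" as described in the excerpt.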
I've spent the last ~4 months figuring out how to make coding agents better, and it's really paid off. The configs at the link above make Claude Code significantly better, passively. It's a one-shot install, and it may just be able to one-shot your problem, because it does the hard work of 'knowing how to use the agents' for you. Would love to know if you try it out and have any feedback.
(In case anyone is curious, I wrote about these configs and how they work here: https://12gramsofcarbon.com/p/averaging-10-prs-a-day-with-cl...
and I used those configs to get to the top of HN with SpaceJam here: https://news.ycombinator.com/item?id=46193412)