On what setting in which environment do you run it? I use the VSCode extension on Extra High and feel like it does exactly what needs to be done and stops when the thing I asked for is done. Extra comments come only when they fall into the area of code that was changed.
I tested it to fix React Native bugs in a project, comparing it with Opus. It fared better on harder bugs, taking less time to find the root cause, but after implementing a fix, it spent a lot of time and effort on validation. This was mostly unnecessary, since most of the bugs were in the JS code, so for most things, hot reloading is enough for E2E validation and to run just the right tests. No need to run a full build and test suite (which takes 10+ minutes); the CI can do this.
I switched back to Opus because of this validation quirk. Overall, Fable spent 20% of the time on coding and 80% on validation.
I think using Fable for planning and Opus for execution could be a "best of both worlds" approach (I need to test this more), but for most cases, it's not necessary, and Opus is enough.
> most of the bugs were in the JS code, so for most things, hot reloading is enough for E2E validation and to run just the right tests. No need to run a full build and test suite (which takes 10+ minutes); the CI can do this.
Have you tried adding this instruction to your AGENTS.md? Avoiding situations where the agent starts running in a loop is the main use case of the file for me.
On my (admittedly weird) setup, GPT-5.5 Pro times out.
The reading is off because the thermistor resistance also depends on applied voltage, not just temperature. LLMs couldn't get this even after feeding them multimeter voltage readings, not just ADC readings. They went into guessing much more esoteric things like ADC switched-capacitor input current, burnout-detect current sources or IDACs left enabled, board leakage, leaky cap, etc.
This is the kind of problem I expect Claude to be useless at, and while I could see Gemini Deep Think making a good showing, I'd only bother with ChatGPT Pro. FWIW, I do believe it got the correct answer as one of its first two suggestions (though I am not an electrical engineer, so maybe I am not understanding this given the vague/summarized prompt).
Yes, the core supports exact rationals. This is easier to deal with in formal verification than floating point.
I made the UI snap to a fixed precision, such that it's easy to reproduce special cases with overlapping edges, coinciding vertices, etc. that make up much of the complexity of the algorithm.
In a past life I tried to implement Delaunay triangulation in floating point for data that can come in a rotated square grid. Normal precision doesn't work in that case. I learned a lot about arbitrary precision numbers doing that. The question about floats here gave me flashbacks.
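To make the rotated-grid case concrete, here is a minimal Python sketch (not the code of anyone in this thread) of the in-circle test that Delaunay triangulation relies on, evaluated with exact rationals from the standard `fractions` module. The names `incircle_exact` and `rotate`, and the 3-4-5 rotation, are illustrative assumptions. The four corners of a square grid cell are cocircular, so the exact predicate returns 0 even after rotation, whereas the same determinant evaluated in doubles on a rotated grid can come out as a tiny value of either sign.

```python
from fractions import Fraction


def incircle_exact(a, b, c, d):
    """Return +1 if d lies inside the circle through a, b, c
    (a, b, c given counterclockwise), -1 if outside, 0 if cocircular."""
    # Fraction(x) converts ints, floats and Fractions without rounding.
    (ax, ay), (bx, by), (cx, cy), (dx, dy) = (
        (Fraction(x), Fraction(y)) for x, y in (a, b, c, d)
    )
    adx, ady = ax - dx, ay - dy
    bdx, bdy = bx - dx, by - dy
    cdx, cdy = cx - dx, cy - dy

    alift = adx * adx + ady * ady
    blift = bdx * bdx + bdy * bdy
    clift = cdx * cdx + cdy * cdy

    det = (
        alift * (bdx * cdy - cdx * bdy)
        + blift * (cdx * ady - adx * cdy)
        + clift * (adx * bdy - bdx * ady)
    )
    return (det > 0) - (det < 0)


def rotate(p):
    """Rotate p by the angle with cos = 3/5, sin = 4/5 (an exact rational rotation)."""
    c, s = Fraction(3, 5), Fraction(4, 5)
    x, y = p
    return (c * x - s * y, s * x + c * y)


# One cell of a square grid, corners listed counterclockwise.
cell = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(incircle_exact(*cell))                       # degenerate: cocircular
print(incircle_exact(*[rotate(p) for p in cell]))  # still degenerate after rotation
```

Because the degenerate case is detected exactly, a triangulation can apply a deterministic tie-breaking rule instead of depending on which way rounding happened to fall.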
I am eager for a Lean equivalent of Flocq in Rocq. When I did some Lean verification of numerical algorithms I did the same thing with rationals or the reals from Mathlib. The big gap between that and the actual code is the lack of a solid theory library to pull in that would give me IEEE floats at the same level of quality as Flocq. I’m eager for that to come along (unless it has and I just haven’t found it yet).
I think to do efficient formally verified geometry with floating point we would also need something like Shewchuk's robust predicates. (I worked with them in the past to write robust software that is not formally verified. I have not checked whether there is a formally verified library for them.) Shewchuk's robust predicates give certain consistency guarantees that are nice to have when implementing computational geometry with floating point, and I think they can be formalized.
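For readers who have not met robust predicates, the core idea can be sketched in Python as a filtered orientation test: evaluate the determinant in plain doubles, accept the sign only when its magnitude exceeds a rounding-error bound, and otherwise fall back to exact arithmetic. This is a simplified illustration, not Shewchuk's implementation: the fallback uses `fractions.Fraction` rather than his floating-point expansion arithmetic, the function names are hypothetical, and the `ERR_BOUND` constant is believed to mirror the first-stage bound in his published predicates code but should be treated as an assumption to verify.

```python
from fractions import Fraction

EPS = 2.0 ** -53                    # half an ulp of 1.0 in IEEE double precision
ERR_BOUND = (3.0 + 16.0 * EPS) * EPS  # assumed first-stage error coefficient


def _sign(x):
    return (x > 0) - (x < 0)


def orient2d_exact(a, b, c):
    """Exact orientation: +1 counterclockwise, -1 clockwise, 0 collinear."""
    (ax, ay), (bx, by), (cx, cy) = (
        (Fraction(x), Fraction(y)) for x, y in (a, b, c)
    )
    return _sign((ax - cx) * (by - cy) - (ay - cy) * (bx - cx))


def orient2d_filtered(a, b, c):
    """Same result as orient2d_exact, but exact arithmetic is only used
    when the floating-point determinant is too close to zero to trust."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    detleft = (ax - cx) * (by - cy)
    detright = (ay - cy) * (bx - cx)
    det = detleft - detright
    # Fast path: the rounding error cannot have flipped the sign.
    if abs(det) > ERR_BOUND * (abs(detleft) + abs(detright)):
        return _sign(det)
    # Slow path: near-degenerate input, decide exactly.
    return orient2d_exact(a, b, c)


print(orient2d_filtered((0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))  # clearly counterclockwise
print(orient2d_filtered((0.1, 0.1), (0.2, 0.2), (0.3, 0.3)))  # collinear, exact path
```

The consistency guarantee comes from the fact that every call returns the sign of the true determinant for the given double inputs, so two predicates evaluated on the same points can never contradict each other.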
I'd argue that this is an adjustment period that society has to go through. The way we are using electronic devices today, in some years it will probably be looked at like smoking cigarettes. And I'd argue that a lot of the "decline" is due to a shift of skills away from things that mattered more in the past toward other things that are not measured/perceived by the older generation.
Interesting analogy. I believe the two are comparable in terms of addictiveness.
> a shift of skills away from things that mattered more in the past toward other things that are not measured/perceived by the older generation.
Do you have any ideas what these things might be? As someone in his twenties, I’m sometimes saddened by observing that some of the skills I acquired over a long time (e.g., writing, coding) may become obsolete or won’t be respected anymore just now that I’m finally getting good at them.
It happens. Things change, and the change is only speeding up. I think the real skill to have going forward is the ability to acquire new skills. I tell my boys "get good at learning and you don't have to get good at anything else".
Ages ago I had similar thoughts. Everything changed when I came to terms with the concept of change being the only constant. A bit of a cliché, perhaps, but profoundly true.
> I'd argue that this is an adjustment period that society has to go through.
I used to think like this until social media proved there are some tech innovations we just can’t adjust to. 10 years ago you would’ve never caught me supporting any sort of age-based social media ban. Now? I don’t think it goes far enough. Fake news (actual fake news) and misinformation have only gotten worse with it as well. It’s so destructive.
The human is designed to interact with small groups, to understand several smaller groups, and perhaps to imagine a big group of smaller groups. In a literal sense, let's say 100 people per group. At that level the human can actually know and interact with them still. In a city of 100,000 it's still manageable to feel connected to and involved in this group-of-groups. In a city of a million, you'll revert to only your own small group and have lost the connection to the collective.
The same goes for speed and quantity of input, as to what the human is designed for (not literally designed). Be it social media with its infinite scrolling, cars racing by as opposed to looking out the window a few times per hour because you see someone/something, or constant sound input if you live anywhere remotely busy or work in a busy office.
The point I'm trying to make is that the world used to be comprehensible for the human. Some understood slightly more complex things, some only the simpler things. Now there is an overload of everything. So, most humans are in survival mode whether they know it or not. Hence the many seeking mindfulness, etc.
No matter, it's an observation, not a judgement or opinion on it. The world will just keep rushing forward. Some have a slight hand in the direction it goes for better (never) or for worse, but spiral it will.
I think there’s a major part of this conversation being omitted, though I am not saying you did it intentionally: “the attention economy.” We have gone from advertising to a system of creating addicts for profit.
Definitely agree, that was included in the "or for worse" in this sentence I wrote. As for creating the addicts, nobody had a master plan. It's all the pieces spiraling together.
>> The world will just keep rushing forward. Some have a slight hand in the direction it goes for better (never) or for worse, but spiral it will.
The systems are too large and self-propelling for anyone to really control. Consider the rainforest. Millions of variables interact, nobody is in charge, and everything influences everything in a billion different ways. You might say, well, we can cut it down, so we can kind of control it. All right, let's continue to spiral. You might build a city there after a few years. Still in charge, right? But it gets too hot because there's no vegetation, so you have to change again. And then we find that people keep getting strangely sick, and scientists find some special mushroom that survived and apparently thrives on the mix of cut trees and diesel fumes, and its spores in the air are poisonous. I made that up, but you get the idea, hopefully.
Eh, I think it's less like a cigarette and more like the car. We're not going back. Americans are famously less healthy the more car dependent they are, and now people walk/run as an explicit task to be healthy. People will start going to a "thinking" gym, or engaging in additional manual mental activities for sport, like we do with chess today.
The Internet in no way made memory obsolete. People who know things off the top of their heads are far more capable than people who have to look things up.
This is an age-old argument actually, the same one was raised when the printing press was invented and reading became a more generally available skill.
Funny that you mention that. A month ago I started the Duolingo chess course, and just yesterday I noticed that my brain is clearer, more capable of deep thought than it has been in years. It's like stepping out of a fog. I also started CPAP recently, so it's hard to attribute the change to either, but I feel certain that the chess helped.
The interesting thing about jogging is I do my best thinking while jogging. I've found it impossible to do deep thinking while driving, as driving evidently requires higher functions of the brain. Jogging doesn't require any of that, I can jog deep in thought and have no recollection of the previous mile.
Apparently Petri was driving when he figured out a new fingering for a Chopin Etude he'd use at a concert later that evening[1]. The passenger survived to tell that story, so I think this is more about your correct risk-assessment than physical limitations of the brain. :)
1: Just to unzip: "I don't like this fingering; let me imagine playing with this other fingering; yep, that will feel more comfortable and stronger; I'm now confident enough in the change that I'll do it that way in public, for one of the most difficult piano pieces, without ever having practiced it physically..."
The idea that most people have the discipline to keep themselves mentally in check is false. We already know this! Millions, even billions, of people spend hours a day consuming media on platforms such as Instagram.
While you are right in a way, I think you miss the point. In the past "computer" was a job description and mechanical power came from serfs. They surely developed skills we are lacking today but I'd argue that overall the world is a better place with digital computers and electrical motors. It frees up these people to do something else, something of higher value.
Sure, the world is a better place with fewer serfs in it, but what exactly is of "higher value" than being a research mathematician? It's already a profession that consists essentially of exercising our highest and most distinctly human capacities: creativity, abstract reasoning, and passing the results of those on through a distinctive language and culture. I don't think the comparison with serfs is useful.
I'm sure most research mathematicians would like more freedom from some of the drudgery of their work (grading, admin, etc.), just like the rest of us. But we should be aiming for a world that allows more people to become mathematicians, not fewer.
We argued that AI would free us to explore the arts. Instead it first came for written language and images. So what's left when it can write all the programs, drive all the cars, and AI sensors on farms can monitor and distribute nutrients? I remember watching TED Talks about how AI weapons need to be carefully studied, and instead we see them autonomously picking targets. I'm not seeing any higher values; instead I'm seeing how we're on a path to assured destruction.
I see that point of view but there's another that I've recently been thinking about.
Many of the fields that were traditionally considered for "smart" people (STEM etc.) are the ones that are being really hammered by AI. Whereas things which people considered lightweight, often involving social relationships and interpersonal skills, are still beyond the scope of AI (much of it even theoretically beyond the scope, although perhaps robots might have an effect there).
There used to be a sysad T-shirt from the BOFH days "Go away or I'll replace you with a very small shell script" which pushed the idea that whatever could be replaced by a computer was something trivial. Now we find that the things which we thought were only for "smart people" are the very things being replaced by computer programs which is telling. Perhaps what we considered tough and smart really wasn't.
This is actually a very old AI insight, acknowledged at least as early as the 80s, let me see if I can find the quote.
Found it:
> Rodney Brooks explains that, according to early AI research, intelligence was "best characterized as the things that highly educated male scientists found challenging", such as chess, symbolic integration, proving mathematical theorems and solving complicated word algebra problems. "The things that children of four or five years could do effortlessly, such as visually distinguishing between a coffee cup and a chair, or walking around on two legs, or finding their way from their bedroom to the living room were not thought of as activities requiring intelligence. Nor were any aesthetic judgments included in the repertoire of intelligence-based skills."
> "it is comparatively easy to make computers exhibit adult level performance on intelligence tests or playing checkers, and difficult or impossible to give them the skills of a one-year-old when it comes to perception and mobility."
The things like proving complicated theorems are things that are acquired by education within a lifetime, and that's why they're easy for AI.
The things a child can do are acquired through millions of years of evolution. While they don't require much explicit education, that doesn't mean they're easier.
Fair enough, but even things acquired within a lifetime have a hierarchy. Many societies, for example, assume that the kids who are good in Math are smart but the ones who write well or are exemplary in "co-curricular" subjects simply aren't that bright.
As an example, the kid who can solve Math problems has less of an edge over AI than the kid who automatically becomes the captain of the neighbourhood football team, but older human beings often assume that the former is smarter.
I've always found that weird, do people really use plungers for that?
The toilet brush is a much better tool for unclogging the average toilet.
The plunger is actually meant to unclog sinks as far as I can tell, since it can attach much better to the sink and through its action can create pressure to unclog the much smaller sink drain pipe.
If the clog involves toilet paper, I'd rather not put a brush in that.
Here's how I use a plunger effectively:
Submerge it and then angle it to swap out some air for liquid, so you have more mass to push into the pipe.
Tip it back upright, then slowly push down, relax and let the bell fill back up with water, and repeat, finding a resonant frequency where the pushed water doesn't just jet out the sides (due to imperfect seal) but because there's a pressure-wave action the clog gets moved in and out repeatedly until it breaks down enough for water to scoot by.
Then one more flush to clean the plunger.
> Brooks is weirdly sexist, but it's unsurprising that (higher) "intelligence" should mean things that are hard, not things that are easy.
Way to miss the mark (and also shift the discussion to woke conversation points on a comment from 4+ decades ago).
The point of his entire comment is that it seems like the "hard things" (aka abstract science) will be a lot easier for AI than "easy things" (a 5 year old or a dog understanding their environment in great detail, from depth perception to smells, sounds, etc.).
Your comment looks like it was written by exactly the kind of man Brooks was mocking.
Picking vegetables is still really tough for robots.
Pick and place robots, or humanoid robots that can fold laundry, are still a lot tougher than automating knowledge workers and a lot more expensive to the point it's questionable if they're worth it.
We may not be on a path to assured destruction, we may be on a path to becoming livestock.
When I was a child, I lived in a neighborhood. Every week a garbage truck picked up the household trash.
5 guys were on that truck. 1 driver and 4 guys that actually lifted up various shaped trash cans and dumped them into the truck.
Today I live in an apartment complex. 100 families take their trash to the compactor. 1 guy in a garbage truck comes once a week to collect the compacted refuse.
I wonder what happened to the other 4 guys. 80% of the garbage collecting labor… freed up to do something of higher value.
On the contrary, only your sarcasm was thick, not the substance behind it. Kind of what I'm yapping about if you'd be kind enough to notice.
What you propose makes fine rhetorical sense, and I can assure you it did reach me, it's just that a (very) cursory search yielded me no significant employment rate changes or drastic layoffs in the related sector over the decades. Instead, it suggested that people have been reshuffled to do waste sorting and other related activities, and that the field actually grew, directly contradicting your smug, sarcasm-laden, willfully demagogic framing. Traits that are not exactly the hallmark of epistemic rigor to begin with, nor do they further it, even if the given narrative did hold up.
It's so easy to be asinine and make up a story, especially when one feels morally justified in doing so, and considers the base facts & analogy to be "obviously correct". I don't think that setting people up for failure by feeding them correct sounding lies - or sending related discourse into a nosedive in quality just to get in some cheap zingers - would help the cause a whole lot though. Do you?
Put differently, it helps if the example provided actually holds up as an example for the topic discussed. Especially if that example is as dramatic as 80% of a given job disappearing, and the people involved just plain losing their livelihood supposedly.
> In the past "computer" was a job description and mechanical power came from serfs.
Serfs, all right, but in what world do you live where "computers", people who did manual computing (i.e. mechanical additions/multiplications/... with very large numbers) are the same as actual research mathematicians, who are basically pure logicians?
The only perspective where it makes sense to root for mathematicians to go away is if you're a misanthrope who thinks humanity should be replaced by robots (for reasons...). Or isn't logic something that's a defining human trait, and one of the main reasons we became the dominant species on the planet?
I don't think that "root[ing] for mathematicians to go away" is the problem. The problem (if there is one) is that the process by which that occurs is economically determined. No amount of complaining will stop AI from being useful in mathematics or erase the incentives to make it better. It's an automatic process, like photography sidelining painting or shoe factories sidelining cobblers. We go through this with every technological advance and the outcomes are pretty much determined. No cheerleaders are needed.
This is already done as much as possible by reordering and merging operations but transposition (explicit or implicit) is unavoidable for some operations.
I haven't really shared what I use, I'm still deciding if that's something I want to do.
To get an idea of what I'm talking about, you could install https://github.com/obra/superpowers/ into both Codex and Claude Code -- You'll find that the behavior is remarkably similar if you A/B compare them on the same problems. CC occasionally misses things that Codex gets and vice versa.
Overall the output structure and final code is remarkably similar... Which is pretty different than if you just run them with their default system prompts. I'd throw codex out the window with its default outputs.