Embedded systems often have crappy compilers. And you sometimes have to pay craz...

toast0 · 2025-02-24T06:37:29 1740379049

At least the debugger worked. The processor I used in embedded systems in college, the 68HC11, would stop doing conditional branches when the supply voltage was too low.

We had a battery powered board, with no brownout detection, and I was using rechargeable NiMH batteries to save money/waste. When the students with alkaline batteries had low batteries, the motor load would bring Vcc down far enough that the CPU would reset by itself. With NiMH, the batteries could still drive the motors and keep the CPU alive...

You could single step in the debugger, and see the flag register was set as expected, but the branch didn't happen. Just ran straight through. I can't remember if unconditional jump or call worked. After about the third time this happened, I got good at figuring it out.

apple1417 · 2025-02-24T11:04:32 1740395072

> For regular programmers, if your machine won't boot up, you are having a bad day. For embedded developers, that's just a typical Tuesday, and your only debugging option may be staring at the code and thinking hard.

Of course where it becomes even more fun is when it's a customer's unit in Peru and you can't replicate it locally :). But oh how I love it. I have definitely spent many a day staring at code piecing things together with what limited info we have.

But to get back on topic, I can definitely concur on the quality of most embedded compilers. It's a great day when I can just use normal old gcc. I've never run into anything explicitly wrong, but I see so many bits of weird codegen or missed optimisations that I keep the disassembly view open permanently, as a sanity check. The assembly never lies to you - until you find a silicon bug at least.

anitil · 2025-02-24T05:11:09 1740373869

> For embedded developers, that's just a typical Tuesday

I was trying to explain to my colleague the other day that I've spent an unhealthy amount of time rebooting devices while staring at an LED wondering why it won't turn on.

eschneider · 2025-02-24T11:06:46 1740395206

Tuesday, indeed. :)

In the embedded world, correctly working hardware isn't a given, either. Part of the board bringup/hardware verification process is just determining that everything on the board actually works. Always fun when you have to figure out if a problem is in your code or in the hardware. (HINT: It's often both.)

It's rare that you need to break out the oscilloscope or logic analyzer, but when you absolutely have to know if that line went high or not, there's no substitute. :)

taneq · 2025-02-25T04:58:49 1740459529

> (HINT: It's often both.)

Or worse, it’s neither! By which I mean both. Neither part of the design is technically wrong but the fault is in the way the two interact. Those are some of the fun ones… I had one where I had to make sure the chip select line was off before turning power off to a chip, because CS would keep it half powered.

eschneider · 2025-03-02T18:36:44 1740940604

At a sufficiently high resolution, all digital electronics is actually analog. :/

sitkack · 2025-02-24T05:21:10 1740374470

It is nuts to have a dev board that is as constrained as the final device. You should have had an additional serial port and 8x as much flash; it would have solved your problem immediately.

It is even better to do the bulk of the dev inside of an emulator if you can swing it. The GPS and GPRS could be tethered into the emulator instead of trying to get a debug link into the system board.

ShroudedNight · 2025-02-24T04:01:42 1740369702

Were these commodity boards? Having to resort to using the cellular connection, instead of attaching a hardware debugging probe (J-Link?) seems like a recipe for a painful squandering of intellect.

exmadscientist · 2025-02-24T06:12:47 1740377567

One of the lovely "features" of embedded work is that after a while of doing this sort of thing, sometimes you get good enough at the crazy hacks that it becomes faster and easier to do something like this than to track down who has the J-Link (okay, they've usually got more than one) and can they spare it/where did they put it/why does that person have a J-Link at all/is the J-Link still alive....

jamesfinlayson · 2025-02-24T23:43:26 1740440606

Oof, I remember doing lots of embedded stuff at university and this rings true.

The compiler we used was built off gcc so it was reasonably good but I remember we had some weird crash one day that I couldn't figure out. Eventually I added some inline assembly to do an absolute jump to the next place that it needed to go and it started working again. I was too inexperienced to know how to dig deeper but presumably the code generator had inserted something weird that was causing a crash.

lisper · 2025-02-24T17:15:29 1740417329

Yeah, I have a war story...

I was working on mobile robot research at JPL back in the 1990s. We had a robot with an arm attached. It worked fine except that every now and then the whole system would crash hard with a totally corrupted heap and stack, just random data everywhere. So no chance of a backtrace. The really weird thing was that this only happened when the arm was moving. We also had the exact same system running under a different operating system and we never had any problems there, so we were 100% sure it was not a compiler error.

It was a compiler error.

It took us a year to figure out what was going on. It turned out that the compiler had a bug where it would emit code that would pop the stack pointer and then pull a value out of the now unprotected stack frame. On the non-embedded system this did not cause any problems, but on the embedded system (running VxWorks) hardware interrupts used the same stack as the process that was running when the interrupt hit. So if we happened to get an interrupt just after the stack pointer was popped but before the unprotected value was grabbed, that value would get stomped on by the interrupt handler. Then when the interrupt handler returned, the process would resume, grab the now-random value, and chaos ensued.
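
For anyone who wants to see that window concretely, here is a toy model in Python. It is only a sketch under assumptions: the `Machine` class, the 16-slot stack, and the `0xDEAD`/`0xBEEF` values the handler pushes are all made up for illustration, and it is not the actual compiler output or real VxWorks behaviour. It just shows why reading a slot after releasing it is unsafe when interrupts share the stack.

```python
class Machine:
    """Toy model of a descending stack shared by a task and its interrupt handlers."""

    def __init__(self, size=16):
        self.mem = [0] * size
        self.sp = size  # stack grows downward; sp indexes the last pushed slot

    def push(self, value):
        self.sp -= 1
        self.mem[self.sp] = value

    def interrupt(self):
        # Hypothetical handler: it runs on the interrupted task's stack and
        # saves two words of state there before returning.
        saved_sp = self.sp
        self.push(0xDEAD)
        self.push(0xBEEF)
        self.sp = saved_sp  # handler returns, stack pointer restored


def buggy_epilogue(m, interrupt_in_window):
    slot = m.sp          # address of the local we still need
    m.sp += 1            # 1. release the frame first (the bug)
    if interrupt_in_window:
        m.interrupt()    # 2. interrupt lands inside the window
    return m.mem[slot]   # 3. read a slot that is no longer protected


def correct_epilogue(m, interrupt_in_window):
    value = m.mem[m.sp]  # read while the frame is still live
    m.sp += 1
    if interrupt_in_window:
        m.interrupt()
    return value


def run(epilogue, interrupt_in_window):
    m = Machine()
    m.push(42)           # a local variable in the current frame
    return epilogue(m, interrupt_in_window)


if __name__ == "__main__":
    print(run(buggy_epilogue, False))   # no interrupt: the stale slot is intact
    print(run(buggy_epilogue, True))    # interrupt in the window: value stomped
    print(run(correct_epilogue, True))  # read-before-release is unaffected
```

On a system where interrupts run on a separate stack, step 2 never touches the released slot, which matches the bug staying hidden on the non-embedded setup.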

ShroudedNight · 2025-02-24T23:20:05 1740439205

How many novel depressions were created as a result of high velocity impacts after making that discovery? I think I'd be seeing red...

lisper · 2025-02-24T23:28:29 1740439709

Actually, I remember being thrilled to have finally figured it out. We had been beating our heads against the wall (metaphorically) for a year, and I remember looking at the screen at the disassembly sequence and thinking, Oh my God, I think I've found it! It felt like making a major scientific discovery. (To be fair, I was only able to do this after others laid the groundwork for me by finding ways to reliably reproduce the problem. But I'm the one who spent hours single-stepping through assembly code before finally realizing what was happening.)

I also remember reporting the problem to one of the authors of the compiler (I think it was David Kranz) so he could fix it in the next version, and him telling me that there wasn't going to be a next version because the funding for the project had been cut. There was no GitHub in those days so the whole thing just faded into the mists of time, which is a real shame because the system really kicked ass.

The whole history of the project can be found here:

https://paulgraham.com/thist.html

motorest · 2025-02-24T06:34:34 1740378874

> For regular programmers, if your machine won't boot up, you are having a bad day. For embedded developers, that's just a typical Tuesday, and your only debugging option may be staring at the code and thinking hard.

It seems to me that if you can still update and reboot said machine, you can do a bisect on your commits to pinpoint the regression. Once you spot the regression commit you can split it to check what introduced the regression.
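
The search that `git bisect` automates is a binary search over the history. A minimal sketch, assuming the oldest commit is known good, the newest is known bad, and the breakage persists once introduced; `first_bad` and `is_bad` are hypothetical names, and on real hardware each `is_bad` call would mean a build, a flash, and a boot attempt:

```python
def first_bad(commits, is_bad):
    """Return the index of the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and every commit
    after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1   # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):   # hypothetical: flash this build and see if it boots
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    history = list("abcdefgh")                 # stand-in commit ids, oldest first
    broken_from = "f"                          # pretend the regression landed here
    idx = first_bad(history, lambda c: c >= broken_from)
    print(history[idx])
```

The number of test boots grows with the logarithm of the history length, which matters when each one means reflashing a board.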

smcl · 2025-02-24T07:33:28 1740382408

It took them multiple tries just to use gdb; I don’t think this is a scenario where you can easily reflash the image on the board.

stuaxo · 2025-02-24T08:00:33 1740384033

Did the GCC patch get applied after that?

actionfromafar · 2025-02-24T11:13:52 1740395632

"Never" implies no, I guess. :-)