24-core CPU and I can’t move my mouse (2017) (randomascii.wordpress.com)
214 points by thunderbong on Dec 22, 2022 | 253 comments


I use Thunderbird for my email. Its behavior shows it is a multithreaded program.

But often, the mouse and keyboard will freeze and it becomes unresponsive for several seconds. This is indicative of suboptimal partitioning of the tasks into threads. The highest priority thread should be responding to user input.

Heck, back in the 1970s, I designed and built a single board computer that was to be a glass tty. There was no way to get that 6800 uP to update the screen fast enough to keep up with characters arriving at 9600 baud.

The solution was, whenever the user hit a key, to abandon updating the screen and process the character. Once that process was complete, and there were no more keys in the input, the screen updating was restarted.

It worked out great. You simply never noticed this was happening, and you always got crisp response.
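A rough sketch of that kind of loop, in C rather than 6800 assembly (the keyboard and screen routines here are simulated stand-ins, not the original firmware):

    #include <stdio.h>
    #include <stdbool.h>

    static const char *pending = "hello";   /* simulated keystrokes waiting in the UART */
    static int dirty_rows = 24;             /* simulated screen rows still needing a redraw */

    static bool key_available(void) { return *pending != '\0'; }
    static char read_key(void)      { return *pending++; }
    static void process_character(char c) { printf("key: %c\n", c); }

    static void update_one_row(void) {
        printf("redraw row %d\n", 25 - dirty_rows);
        dirty_rows--;
    }

    int main(void) {
        while (key_available() || dirty_rows > 0) {
            /* Input always wins: drain every pending character first. */
            while (key_available())
                process_character(read_key());

            /* Only when no keys are waiting do we redraw, and only a small
               chunk at a time so we get back to the keyboard quickly. */
            if (dirty_rows > 0)
                update_one_row();
        }
        return 0;
    }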

I did the same thing for the MicroEmacs text editor on the IBM PC. If I hadn't, the editor would lose input and/or have noticeable lags refreshing the screen.

P.S. Not losing input was essential because we used ttys to transfer files across phone lines and serial ports. You also couldn't touch type if the tty lost key input.

P.P.S. The Chrome browser that I use also has problems with freezing on keyboard input.


For me the UI hangs whenever Thunderbird downloads new messages. It seems as if they are doing network I/O on the UI thread - which can't be true...


I learned not to touch the keyboard when TB was downloading new messages and running filters. Otherwise it would randomly corrupt the message file, and I'd lose it all. Yes, I filed a bug report with TB. No, they never found the problem.


I stopped using Thunderbird a long time ago because it had a “feature” that would erase all of your email if it detected that your profile was corrupt…

… but it has a habit of corrupting profile files on shutdown.


If all other explanations fail, then even the least likely explanation is very likely true.


After all recommended approaches have been tried unsuccessfully, certainly the solution will be something that is not recommended.


I stopped using thunderbird for this reason


Even by the mid-80s, if you used the naive serial routines built into the Apple IIe you couldn't do 1200 bps; 300 bps worked. Professional terminal programs used interrupts and buffers to capture incoming bytes while the screen scrolled (https://sites.google.com/site/drjohnbmatthews/apple2/ssc?pli...). At least that's what the author of ProTERM told me when I was a kid and I asked how it worked. It sounded pretty exotic at the time, but now I work with low-level hardware... a $5 chip can do 2 Mbps over serial.


Yup, ISRs are the solution for 6800 and 6502 chips.

Data I/O's LogicPak (powered by a 6502) used polling, and would now and then lose data because of that. I advised the engineer working on it to just write an ISR. Months went by, while he was convinced he could make the polling work.

Finally, the manager dropped by and asked if I could just fix it. In a couple of hours I had it fixed with an ISR. No more problems.
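For concreteness, the ISR-plus-ring-buffer pattern looks roughly like this in C (the register addresses and names are hypothetical; the real thing would be 6800/6502 assembly against actual UART registers):

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical memory-mapped UART registers, for illustration only. */
    #define UART_DATA   (*(volatile uint8_t *)0xD011)
    #define UART_STATUS (*(volatile uint8_t *)0xD012)
    #define RX_READY    0x01

    static volatile uint8_t rx_buf[256];
    static volatile uint8_t rx_head;   /* written only by the ISR */
    static volatile uint8_t rx_tail;   /* written only by the main loop */

    /* Receive interrupt: grab the byte immediately and stash it.
       No screen work, no waiting - that's the whole point. */
    void uart_rx_isr(void) {
        while (UART_STATUS & RX_READY) {
            rx_buf[rx_head] = UART_DATA;
            rx_head++;                 /* 8-bit index wraps at 256 by itself */
        }
    }

    /* The main loop (which may be busy scrolling the screen) drains the
       buffer whenever it gets around to it. */
    bool rx_get(uint8_t *out) {
        if (rx_tail == rx_head)
            return false;              /* nothing buffered */
        *out = rx_buf[rx_tail];
        rx_tail++;
        return true;
    }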


The Amiga's built in serial port was particularly atrocious as the hardware could only buffer 1 character at a time. This meant that it did not work particularly well at baud rates above 9600. Combined with the CPU being starved from accessing RAM when running in hires 16 colour modes (used when emulating ANSI "graphics" common at BBSes back then), this made it rather unpleasant as a terminal emulator.


The Amiga supported both chip and fast RAM, precisely to solve the problem you're talking about. What you mention here was at most an issue with your particular machine, not a limit of the platform itself.


No, that doesn't solve the problem -- I had a few megabytes of fast RAM, and it still sucked. The chipset bus is completely contended in 4 bitplane high res graphics in OCS/ECS, so any attempt to access the serial port registers is blocked until the horizontal blanking interval (as is chip or slow RAM). The faster and wider data path in the AGA chipset improved this substantially, but that data path was only used for video DMA, not for the blitter or other chipset register accesses. The only real solution was to use an aftermarket serial port card that wasn't on the chipset bus and used a real UART like the 16550 that had a 16 byte FIFO.


I can reliably crash Linux (all tested distros, more than 5 of them, over at least 3 years) and FreeBSD with Thunderbird. It's a terrible application and a sad statement about GUI open-source applications... pure shame...

Edit: and no, not as root or a "wheel" user.


please provide a step by step guide on how to do this


Have thousands of mails in TB (maildir or mbox) across at least 5 accounts, then right-click a big folder and click "repair". When it starts to log in (before header synchronization starts), pull the Ethernet cable (no other connection should be active), then click around in TB in a frenzied manner (go "offline", try to repair again, open menus, etc.) and watch TB freeze, then your X11 session. Sometimes, if you're lucky, the whole system crashes (works best with Intel GPUs... the crash, I mean).

The slower your IMAP server, the faster the freeze; Gmail works best.


Not with TB specifically, but some years ago I was able to consistently crash both Windows and Linux by filling up the disk drive nearly to the brim, and then running various applications. It seems that not many programs consistently check disk writes for failures, or if they do, they don't shut down gracefully.


Works fine for me, for the last 10+ years. Fedora and Thunderbird.


It just shows that Thunderbird has room for improvement when it comes to how it uses multi-threading. You always want the UI thread to be non-blocking and react instantly when a user does something. Most games (for example) do this really well.


This is one of the primary reasons I ditched Windows for macOS.

The multitasking on Windows is ridiculously bad.

Also, applications that freeze and can't be killed/restarted are simply part of life on Windows.

Add to all that, Windows slows down over time - again, the periodic reformat and rebuild is part of life on Windows. After I switched to Mac I didn't need to rebuild the operating system essentially ever.

Also, the registry - I think the worst idea in all computing. Prior to the registry you plopped a windows application in a location and configured it with an INI file. After the registry the entire operating system and all applications turned into one big ball of chewing gum, glue and hair dredged out of the shower drain.

How can Windows have got it so wrong?


Anecdotally, that's not my experience with Windows these days at all. I've never had the whole system freeze up on my Win10 desktop. Never noticed a slowdown or app that can't be killed, and never done a reformat or rebuild. Windows Update is mildly annoying when it decides to reboot your system for you, though usually when you're not using it. The registry seems fine to me too. Okay it's not perfect, but I don't think it's meaningfully worse than the MacOS or Linux system of configs in text files with various types of special formats all over the place, unless of course it's some kind of binary format that can only be modified with a special CLI program.


I agree. Windows is basically fine these days. It's even better than Mac and Linux in a big way from a desktop stability point of view because it has the Ctrl-alt-delete interface which works reliably and lets you kill the offending program.

Linux has various useless options like SysRq shortcuts that you can't remember and that kill random processes, and I don't think Mac has anything, though to be honest I don't recall ever bringing a Mac to its knees so much that I couldn't open a terminal and run `top`.


I wish Ctrl+Alt+Delete would be reliable enough to work with full-screen games. But, half the time, when you try to launch Task Manager that way to kill the offending process, it ends up under the game's output. I've learned to kill processes using keyboard navigation by looking at Task Manager's thumbnail in Alt-Tab process list, but it's ridiculous that we still need to do this kind of dance in 2023.


There's a way to make it always appear above other windows. Let me see if I can find that again...

Ah, it's in the options menu on the task manager itself (alt + O to access the menu without a mouse).


why would you ever want that turned off??


The problem is that it doesn't work reliably when a fullscreen game that is currently using the screen exclusively for itself suddenly crashes.


On MacOS the Force Quit Applications can be launched with Cmd-Option-Esc similar to how on Windows Task Manager can be launched with Ctrl-Shift-Esc.


In my experience, on macOS the Cmd-Opt-Esc force quit dialog works pretty well most of the time. Of course `killall "Foo"` in the terminal works nicely too.


Until the advent of the new screen manager, you could always reliably get to a VT terminal.

Full I/O access and unlimited kill power, far better than Ctrl-Alt-Delete...

Any regressions in this area I blame on systemd, which I think runs on C++, so you know I blame C++... (the kernel is written in C, Windows got a bit better when they used C# instead of C++, Mac never used C++... everybody kill C++, quick)


Systemd is certainly not implemented in C++. I don't think many of the authors are even very familiar with the language. Look at the chart at the bottom of the project page.

https://github.com/systemd/systemd


ahem ahem gconf ahem ahem


My Windows literally doesn't boot today due to a bluescreen. No idea why or how that happened. I've had it a few times in life before, but never on macOS.


I make a full+incremental system disk backup periodically, e.g. once a quarter or two, or when I feel that a set of software that I'm going to install could do something bad, or after big changes which are undesirable to lose in case of failure. It doesn't save everything (important data is in git remotes anyway), but saves me from a reinstall-download-setup-reconfigure hassle. A user folder has a separate automatic and more frequent schedule.

That said, my home PC works since 2018 (buy date) and one at work since around 2016. Only used these backups once to move OS to a new SSD.

Most failures happen when you have no backups, so have at least one to make your system failsafe!


Maybe you should be running what those git remotes are running on (hint: something Unix, probably in a container).


I have multiple irreversible-damage experiences with Linux too (not BSODs, but think of updates and/or Nvidia drivers). Even the 18.04 to 22.04 upgrade on one of my older VPSes failed recently despite an absolutely default setup. I'm fine with MSYS2 as a Unix-like env.


This is a very outdated take on windows.

I’m not sure what “multitasking is bad” means, but I do a heck of a lot of multitasking on windows every day with resource intensive developer tools with no problem, and have been doing so for over a decade.

Apps can be killed via Task Manager without doing anything special. You might run into an occasional rare bug where Explorer freezes so you can't open Task Manager, but I can't recall the last time it happened to me (and you can't tell me with a straight face that Linux or macOS are completely free from wild edge-case bugs of that sort).

I used to religiously reformat and reinstall windows. It’s been completely unnecessary since at least Windows 7. My personal desktop ran from 2014 to 2021 on the same install, including a Win7 -> Win10 upgrade.

You certainly may not like the registry, but it’s hard to make a compelling argument that it’s been a source of problems since Win7. To be honest I think all of the registry problems I can recall seeing in the last decade were caused by people being sucked in by “registry cleaner” scamware/malware apps claiming to fix non-existent problems.


Same experience here. I work my machine hard. It runs 24x7, hosts many sites, databases, Plex, and I do some pretty heavy development work on it daily. It pretty much "just works". The last time I had to routinely reinstall Windows was Windows XP. Ever since Windows 7 driver stability is significantly better. Despite all the bad press, the Windows 11 upgrade was smooth for me. Yes, the OS has some annoying features, but all in all it's been remarkably stable.


Try to sort by name in task manager and wait as it locks up for seconds.


I just did and it was instantaneous. How many tasks are you running?


I'll have to check when I get home this weekend. Can't be that many. Biggest thing would be my 50 chrome tabs :)


(I also have 64gb of ram and 16 threads...)


Seems like something wonky with your Windows setup. Too much memory and threads? I've "only" got 20GB and 8 threads.


396 processes and 6452 threads running.

Sorting in "Processes" tab locks up the UI and my mouse won't even move for seconds. Other apps work fine. Webstorm and Intellij work fine...


Did they? My MacBook is much worse at multitasking. In fact the OS I've found to be best at multitasking is Debian/Ubuntu.


Switching from FF to Chrome (this was like a decade ago, to be fair to FF) eliminated about 50% of my beachballs and other unresponsiveness system-wide, not just when using FF. Switching from Chrome to Safari got rid of almost all the rest, and also made my battery life match Apple's claims.

[EDIT] And when I do still run into trouble, it's almost always Electron chat apps. Slack, Teams, and Discord. All terrible at being respectful of system resources. Closing the program and restarting it usually temporarily fixes the problem, but that shouldn't be necessary.


It's frustrating that efficiency and being a good desktop citizen are such an afterthought for most browsers. Support for all the fancy web features imaginable isn't worth much to me if it means keeping my system pegged and destroying its battery life.


AFAIK browsers are highly efficient and a lot of work goes into optimization. On the other hand, they're expected to run arbitrary code (JavaScript, WebAssembly) which can do all manner of terrible things. A couple of years ago, the Gmail/Google Voice websites would take nearly 1GB of RAM per tab and crash if you left them running a few days.

A website could just run <download big json blob and append to array> in a tight loop and the browser has to try to make decisions on not hosing website performance if the user wants to actually use the crappy website while simultaneously not hosing battery/machine performance.


Blink and Gecko don’t fare as well as WebKit does when it comes to efficiency which would suggest they’re making tradeoffs favoring things other than battery life.


Things have improved a lot over time. I have one project involving JSON arrays that grew way out of hand. While sort of panicking about needing a fast 80% rewrite without having even 10% of the time, I tested a good number of devices. Crappy Androids and low-end iPhones deal with the many thousands of objects incredibly fast. I still think it's wrong; I just have nothing to show for it.

FF a decade ago would just randomly die on me within 2 days.


A very unreasonable reason for me not switching to Safari is that the bookmarks dropdowns (when activating the bookmark bar) have two very annoying features. Firstly, they only open when clicked, so if you click one and move your mouse to another, the other doesn't open; the original just stays there and you have to click off it to close it. Secondly, there's some kind of delay to that click such that if you accidentally click the wrong folder of bookmarks, click off it to the correct folder and click again to open the correct folder, if you click too fast it'll do nothing - no opening the new folder, nothing. And it'll continue to do nothing if you continue to click 'inside' the timeout window. So you have this artificial delay built into bookmark navigation that just put me off.

That and also ultra-wide tabs which change in size as you close them with a middle click, which is really annoying and you sometimes close the wrong tab. Close them with the cross and they stay the same size and neatly collapse, you can even then middle click to close the remainder and they stay the same size. Silly difference in behaviour IMO.

It's the little things that put me off. Shame, as Passkey support is great.


A decade with Ubuntu and I get why it’s free. I have disastrous performance issues with it, for reasons that people usually respond, “oh well that’s not Ubuntu’s fault…” but that isn’t a great excuse as users do not care (unless you aren’t intending to attract general users).

Most of the issue is that I have to fight with nvidia video drivers every month or two. Something happens and they stop operating and it grinds my entire system to a crawl when everything is software rendered. YouTube basically kills my computer.

When I go with Mac or Windows, the main feature is that they have designers to make more than a Potemkin UI, and they care about the end to end UX. No, “oh go complain to some other vendor.”

Ubuntu is very very impressive for free. I certainly acknowledge how great that is, and how important that is to the ecosystem.

Anyways, sorry about the rant. I’m calming down.


I used nvidia with linux >10 years ago. Never again. The FOSS philosophy doesn't mesh well with binary blobs and it never will.


The wild thing is that using the Additional Drivers UI is what bricked my computer last week. It probably downloaded some binaries but it also spent 30 mins running gcc compiling something. And then forced a kernel headers update but didn’t update anything else. Rebooted and I have no wifi or bluetooth.

The fix was a reversion and then CLI install of the nvidia 515 drivers.


Imo Intel graphics work best (not sure about the new dedicated GPUs) if that's powerful enough for you. I've had some issues with amdgpu although no special setup is required. Unfortunately a lot of hardware manufacturers are Windows first (and only) and some poor volunteer has hacked up a Linux driver via reverse engineering

If you want a good Linux experience, you should look for hardware with first class Linux support. Likewise, good luck with macOS on non-Apple hardware--it's possible but ymmv


The trick is to disable automatic updates once you have a working system. Yes, there is some risk. But at least your system will stay as stable as it was when you first installed it. Oh, and disable swap and have plenty of RAM.


32GB + swap off really does help for sure. I do have updates disabled but I admit I am sloppy with not picking through what I update. It doesn’t help that there’s new updates basically daily.


Keep an eye on 'snap' it tends to do nasty stuff without telling you.

You can disable that too.


"Potemkin UI"

That's a good description for many Linux GUIs. They look like they can get the job done, but to really get the job done, you have to use what the developers of said UI likely use themselves - the terminal.


A friend used the term when I said, “I used Additional Drivers to update to the latest recommended driver and it bricked my computer.” He said never to use that UI and to do it via a terminal. The terminal approach worked effortlessly and in half the time, once I Un-bricked my laptop.

It feels like a lot of the UIs in Ubuntu are there just so they can claim it’s an OS that can take over for Windows or Mac.


I've since switched to using AMD graphics on my Linux machine. I've gone from an RX 580 to a 5700 XT to a 6800 XT. It's been pretty good while using Fedora.


I've had lots of luck and fun with Fedora, AMD cards and GPU passthrough to VMs. Very stable and fast.


My own personal experience (recent x86 based MBPs and similar) is that macOS degrades quickly when CPU and IO get stressed, but Windows 10 and 11 become unresponsive with much smaller loads. Of the big three, Linux is the one that can take more processes while still being responsive.

Even a lowly old i3 can encode videos with ffmpeg while you browse the web provided you start ffmpeg with a `nice -n 19`. On a Mac, it seems to be ignored.


Linux UI actually becomes very unresponsive under load (on most distros) because the default kernel scheduler is tuned for throughput at the cost of responsiveness, instead of vice versa.

Desktop distros switching to a more user-friendly scheduler, and loading a ‘small speaker’ EQ via Pulse on detecting internal speakers (= laptop) are two massive, low-hanging fruit improvements for Linux that just don’t seem to be done.


I guess running the heavy lifting with high niceness does the trick for me.


The GUI is still essentially single-threaded. Only BeOS with its fully parallel UI had it right.


IRIX remained quite responsive no matter how much I threw at it.


IRIX was remarkable for its performance, and not just in graphics.


Can I ask what model and how much RAM?


2019 13" MBP, 16GB. Pretty much anything I do will spin up the fan now. Compiling an app uses so much battery that I get less than an hour of battery life. I've resorted to remote development only, a la Chromebook.

My previous laptop was a similar Lenovo X1 Carbon with Debian and wow do I miss it, but I don't have too much of a choice since I do mobile app development :(


Yeah, the 2019 MBPs were absolute abominations. I spent 5k AUD on a fully kitted out MBP and performance wise it’s one of the worst computers I’ve ever owned.

I swore that would be the last Apple computer I ever bought, but then they released the M1s… and they are very good.

Would recommend getting an M1 if at all possible. There’s still time to ask Santa for one.


I recommend getting a Framework. It's probably not as good as an M1, but it'll last you more than two years, and if it doesn't, you can just change the CPU for a better one without having to throw all the other, perfectly good hardware away.


    It's probably not as good as an M1, but it'll last you more than two years
I don't disagree with the other positive aspects of the Framework, but my goodness -- where are you getting this idea that a Mac lasts only two years?

6+ years is the norm for me on Macs.


my macs absolutely start to degrade after about 18 months, and I try to keep them limping along for a few more months. I think it’s usually the battery crapping out that causes everything else to sort of overheat and suck


I had my 2012 MacBook Pro for 8 years, before giving it to my dad where it continues to be used (though under far less load than when I had it).

I did get the battery replaced once, it was free. The screen got replaced twice, also for free (the second time they just did it when replacing the battery because the person at the Apple Store noticed a slight wear on the edge anti-reflective coating)


Well, mine admittedly are used primarily for duty so I suppose mine live a rather cushy life - particularly my batteries.

However, I don't think I know how decreased battery life would lead to overheating?

Certainly, dust inside the machine will lead to overheating (or at least, more fan activity) over time. Coats the heat sinks, etc. Perhaps that's it?


> I recommend getting a Framework. It's probably not as good as an M1, but it'll last you more than two years…

Macs have famously-long usable lives — my sister uses a 7-year-old iMac, for example. The latest macOS Ventura supports Macs made in 2017. I'd be very surprised to hear about people using 2021 Framework laptops as their daily driver in 2026.


… this literally doesn't mesh with reality.

I've had 3 MBPs, every single one has had at least one issue, well before 7 years, usually around 1.5 to 2. The first two had battery recalls, the middle one had cable-gate, the middle one's display was also very temperature sensitive (it would have glitched lines artifact on the screen if the ambient temperature wasn't near 70F), the later MBP suffers from keyboard-gate and from constant thermal throttling. (Likely because the vents are choked with dust, but MBP's user hostile design prevents me from opening it up and pushing air through it, which is likely all it requires. They hate the user so much they used screws worse than Torx. I think they're Pentalobe, but don't quote me.)

My current Magic Trackpad is also highly temperature sensitive. The "click" will lock up at high temp. (I.e., the trackpad will fight you, if you attempt to click, if the ambient temperature is warm.)

> I'd be very surprised to hear about people using 2021 Framework laptops as their daily driver in 2026.

I'm using a Lenovo Thinkpad at about that age. (It is a 2017 model, so, 5 years.) The biggest thing wrong with it at present is it requires AC power. (The battery connection is bad. It lived through two bike crashes, though, and I suspect that's a side effect of it. I should see if that's repairable, one of these days, but I've put up with that for the time, as with COVID, it doesn't really travel much anymore.) The TrackPoint™ is also wonky, but I think that's because sunlight has chemically hardened the nib like an old eraser. I have more nibs… somewhere. I should look for them or order more…


It would be absurd to claim that Macs don't fail or need servicing during their usable lives, but you seem to have been particularly unlucky in my experience.

> They hate the user so much they used screws worse than Torx. I think they're Pentalobe, but don't quote me.

Yes, Pentalobe: https://www.ifixit.com/Guide/How+to+clean+your+MacBooks+fan+...


But you're in a thread about how the GP's Mac didn't last two years. I'm fairly sure I will be using my Framework laptop as my daily driver in 2026, maybe with one motherboard replacement. I just switched from my 2017 XPS, and I do development work. I gave it to my dad, who loves it and will probably hold on to it for another few years.

It's a bit odd to be saying this about pre-M1 Macs, as they were "just" Intel machines, same as everything else.


>it'll last you more than two years

An M1 Mac should last double that, easily - so long as you don't underspec it. My family has multiple November 2020 M1 MacBook Airs that are still working as good as the day we got them.


Frameworks are nice in theory but they still have a long way to go when it comes to heat, fan noise, and battery life based on what I've seen owners of them say.


Yeah, they are Intel in the end, which is tough on battery and heat. I hope that will get better if they switch to AMD.


This is the biggest reason that I won’t get a Framework today. Give me a Ryzen with 8+ cores, 32GB DDR5, hopefully two m.2 slots, a full-size HDMI and USB-A port(s), 99wh battery because of TSA security theater, dedicated graphics, and some real heat sinks that can handle 200w+ so it doesn’t have to run the fans constantly or cook the touchpad. A 4:3 OLED with at least 4K resolution and 120hz refresh would be perfect on top. Ooh, and a rigid chassis that doesn’t flex like a noodle, and no RGB / gAmER tacky plastic junk all over. It’s sad that a package like this is very rare.


What are you talking about, it's not rare at all. My desktop is exactly like that.


Your desktop has a 99wh battery and you carry it with you on the plane?


Hey, if you want all that, you need to make sacrifices!


I'm definitely eying the Framework the day I don't need to do iOS development.


I have been where you are, posting the same thing, but I don't believe it is true today. Applications that hang and can't be killed is a much bigger problem to me in Linux than in Windows 10. I also don't reformat; at most I reset Windows or applications, but it has been years. I agree with the rest, but in everyday use the registry is not a problem in Windows.

The difference to me is that I as a power user might be able to fix it myself in Linux but by only using the tools normal users know of I must say I cannot recognize that this is a weak point in Windows today. Clicking the X to close an app and getting stuck with a dead app is much worse in, say, Debian than in Windows.

With that said.. I don't use Windows where I have a choice but that is mainly for philosophical reasons these days. Mac I don't touch. I feel it is the worst of both camps.


> Applications that hang and can't be killed is a much bigger problem to me in Linux than in Windows 10.

Have you seen anything in Linux that you can't kill with "kill -9"?


On Linux, processes that are stuck in D state (waiting on I/O) cannot be signaled. More specifically, the signal will be queued until the task exits that state. This includes signal 9.

The process may well never exit that state, for example if the I/O it's waiting for is actually over a networked filesystem and the NIC is misbehaving.


Fortunately we have TASK_KILLABLE to replace TASK_UNINTERRUPTIBLE. https://lwn.net/Articles/288056/
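Roughly, the difference in a driver looks like this (a sketch of kernel-style code, not taken from any particular driver):

    #include <linux/wait.h>
    #include <linux/errno.h>

    static DECLARE_WAIT_QUEUE_HEAD(io_wq);
    static int io_done;

    /* Old style: the task sits in D state until the I/O completes.
       Even SIGKILL just stays pending until then. */
    static void wait_for_io_uninterruptible(void)
    {
        wait_event(io_wq, io_done);
    }

    /* TASK_KILLABLE style: still ignores ordinary signals, but a fatal
       signal (e.g. kill -9) wakes the task so it can abort the request. */
    static int wait_for_io_killable(void)
    {
        if (wait_event_killable(io_wq, io_done))
            return -ERESTARTSYS;       /* woken by a fatal signal */
        return 0;
    }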


I have a nice shortcut: Windows+K calls "xkill" :)


I wrote a long comment on my phone telling you how this is simply untrue.

During the typing of the comment, Safari locked up and then hard-crashed, losing my comment.

If that happens on something as tightly controlled as iOS, how many bugs and crashes do you think macOS and its apps experience?

The main difference between macOS and Windows or Linux, I'd say, is that bugs on macOS are more submarine. Windows also has much stronger recovery mechanisms, to the point where a GPU crash barely fazes it. A GPU crash on macOS will hard-reboot the system and possibly show you a ':-)'.


How long ago did you switch to Mac OS? This sounds like an argument from 2004, not 2022.


When the Intel Mac arrived, so maybe 14 years ago or more?

Are you saying Windows has fixed all these problems?

The registry is still there.

I do actually have some Windows machines I use sometimes and recall thinking "still the same", but I can't say that as a hardcore user, so I'd be interested to hear if Windows is now a sleek, reliable multitasker that instantly kills dead applications. Nothing will make me OK with the registry and the general mess of Windows though - it's like a house someone hasn't properly cleaned for 40 years.


I feel like if you're bothered simply by the existence of the registry, Windows isn't really for you. I get it, a little bit -- the OOM killer on Linux just boggles my mind, but once you view in context, you can appreciate how we ended with it, at a technical level.

But if you think the registry could be replaced with .ini files like the good ol' days, that's a pretty extreme hot take. If you're open to changing your perspective on what the registry is for, how it is designed, and why it's necessary, read any of the Windows Internals books.


I certainly get a lot less blue screens and lockups than I did in the past, and the OS shuts down locked-up apps more quickly.

Group policies have replaced the registry for advanced configuration of the OS. The only time I’ve needed to change the registry is for dealing with poorly written, old drivers, which are becoming more rare as Microsoft’s standards for getting a driver signed are becoming more stringent.


The rants go both ways. Some see value in the registry (ideologically), others feel more like you do: https://news.ycombinator.com/item?id=32275078


> Are you saying Windows has fixed all these problems?

From my own personal experience, no. My machines can become unusable from a simple Windows Update.


> Add to all that, Windows slows down over time - again, the periodic reformat and rebuild is part of life on Windows.

My windows install is from 201*, it was an upgrade from win7, and switched from an i7 to a ryzen. I do clean out the registry/startup/task scheduler on occasion.


> My windows install is from 201*

The early 3rd century was the best era for Microsoft with Windows Severan.


The early 3rd century was an exciting time for computing. The big issue back then was Roman numerals were hard to convert to hex and binary.


Back then mainframes had specialized hardware to deal with them.

Note: the other day I read ICL mainframes had specialized instructions to deal with pre-decimal pounds.


One of the IBM 1401 computers (1959) at the Computer History Museum has hardware support for pre-decimal pounds/shillings/pence, to perform arithmetic on these values and print them. Thus, the computer has three fundamental datatypes in hardware: arbitrary-length strings, arbitrary-length integers, and pounds/shillings/pence. Of course there were two competing standards for encoding pounds/shillings/pence, so the front panel has a knob to select the standard.

(Before decimalization, there were 12 pence in a shilling and 20 shillings in a pound. Mathematical operations on currency were both difficult and extremely common, so IBM provided hardware support as an option. There were boards of transistors so you could add, subtract, multiply, or divide currency values with a single operation, rather than an inconvenient sequence of instructions.)
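For anyone who hasn't met £sd arithmetic: 240 pence to the pound means even a simple conversion needs a chain of divides and remainders, which is roughly the work that optional hardware collapsed into a single operation. A quick sketch in C:

    #include <stdio.h>

    /* Convert a total number of old pence into pounds/shillings/pence:
       12d = 1s, 20s = 1 pound, so 240d = 1 pound. */
    static void to_lsd(long pence, long *l, long *s, long *d) {
        *l = pence / 240;
        *s = (pence % 240) / 12;
        *d = pence % 12;
    }

    int main(void) {
        long l, s, d;
        to_lsd(1000, &l, &s, &d);
        printf("1000d = %ld pounds %lds %ldd\n", l, s, d);  /* 4 pounds 3s 4d */
        return 0;
    }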

Photos in my Twitter thread: https://twitter.com/kenshirriff/status/1364365985499602947

To get back to the original topic, I'll mention that my MacBook Air would drop keystrokes if I visited a website with, say, a video ad. I find it kind of appalling that computers in the 1960s could handle input from hundreds of keyboards at once, while a 2017 computer can't manage a single keyboard.


I didn't know the 1401 had that!

In the defense of your Mac, computers of the 1960's had terminal controllers for dealing with the communications. And 3270's were like web browsers - getting a form, sending the data back, getting another one - an excellent design for avoiding hardware interrupts ;-)

But yes. I've seen 370's with a good couple hundred 3278's connected and that thing was still quite snappy, even though the CPU in my watch can run rings around it. I guess even two VT330's would stress my current laptop if I could type on two keyboards fast enough ;-)


The widespread adoption of CISC architectures solved this problem, since many of them included special-purpose BCR (Binary-Encoded Roman) instructions.


"Antikythera Mechanisms" was the biggest player in 3rd century computing as I recall.

They seem to have sunk without a trace.


Well played, thank you for the chuckle.


2015, for a year or so on insider builds, upgraded i5 to Ryzen, migrated primary partition from SATA SSD to M.2 SSD

no registry cleaning, only cleaning temp files with built-in app in Settings


I think all desktop OSs have it wrong. Allowing full APIs to backgrounded apps and no CPU reservation for control apps will ultimately end desktop computing - we’ll create on iPads because they work.


I'm writing this on an old i3 and it feels perfectly fine. It would probably take days to compile a large project like Chrome or LibreOffice, but, as long as I run the make under a nice, I can continue using the computer and the only sign something big is happening will be the CPU fan kicking in.


> as long as I run the make under a nice

Regular users don't understand CPU quotas. It should be done by default. The current behavior on Windows/Mac/Linux of allowing a program to make the machine uncontrollable is poor.
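For what it's worth, a program can also lower its own priority instead of relying on the user to remember `nice`; a minimal POSIX sketch (the busy loop is just a stand-in for real work):

    #include <stdio.h>
    #include <sys/resource.h>

    int main(void) {
        /* Drop this process to the weakest niceness (19) before starting
           heavy background work, so interactive programs get the CPU first.
           The 0 means "the calling process". */
        if (setpriority(PRIO_PROCESS, 0, 19) != 0)
            perror("setpriority");

        /* ... heavy batch work goes here ... */
        for (volatile unsigned long i = 0; i < 100000000UL; i++)
            ;                              /* stand-in for real number crunching */

        printf("done\n");
        return 0;
    }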


This will probably change as we get more cores and performance/efficiency ones. Macs already run a lot of the low-priority tasks on their slow cores leaving the fast ones available for interactive activities.

I don't disagree on the current approach though. If I start a, say, long build job from a terminal window, I have control over the resources it can take. If, OTOH, I start a program from the GUI, it's reasonable to assume I want it to have full control, at least while it has the focus.

Now, letting the GUI dynamically renice non-focused processes would be very nice.


More cores won’t fix it. Some app will add more processes and use all available CPUs and there’s no reservations.

You had it right with nice - phones and tablets and consoles and cars have it right with their techniques. Apps shouldn’t be able to make the system unusable by default.


> More cores won’t fix it.

By themselves, no, but there are limits to how many cores a workload can keep fed. You'll see declining returns at high core counts (which is one of the reasons we don't have Xeon Phis these days).


> This is one of the primary reasons I ditched Windows for macOS.

The reason I just ditched Fedora for Ubuntu...

The reason I ditched macOS for Windows is that macOS is slowly but surely becoming a magnified version of iOS... One day you will have to jailbreak if you want administrator rights on your ~~personal computer~~ MacBook.


Don't MacOS and Unixes have the equivalent of registry hell with a million configuration files hidden away in /etc/ or /usr/ or wherever, and then having to check where environment variables are set and so on? Ideally, almost all of a program's configuration settings would be stored in the same folder as the program, readily discoverable, and you could just pick these things up and move them easily, but in practice, it seems to be a pain everywhere.


The registry is uniquely crap because it puts all the configuration for everything into some single store - I don't know what devilish format lies beneath it, then it stores everything as binary keys. Things are not separated strictly by application so you end up with a giant pile of intermingled goo.

In Linux, no doubt there's a lot to get your head around, but I've never found things to be a giant pile of spaghetti. The challenge Linux has is that there are a bajillion ways to do things, so you have to be pretty experienced to feel confident crawling around the tunnels and ventilation ducts.


The disfigured nature of the registry is a real shame, I actually like the idea of centralizing all configuration in Windows. I wonder if there was a better solution in Cairo.

To be honest (and I'm aware that I might be tarred and feathered for this) I kinda prefer the centralized Windows registry to the "random assortment of config files" approach that Linux has. I think both Windows and Linux give you that feeling of carefully crawling around in ventilation ducts once you get your hands dirty; I doubt it's possible to get rid of that without getting rid of the inherent flexibility that the systems provide. There's a reason Windows has both the registry and the Settings app.


NixOS and Guix come close to this with a central spot for declaring the entire system state, including program configuration and env vars. Now there is still the $HOME/.config mess, which home-manager[0] tries to tackle.

[0] https://github.com/nix-community/home-manager


From a typical user’s perspective, macOS puts all Preferences files into ~<username>/Library/Application Support/<app bundle name> or ~<username>/Library/Preferences/<app bundle name>, and they are standard Plist files readable by both command line and GUI tools that ship with the system, so easily editable and human-readable. From a power-user’s perspective, it is a Unix system and that part of it is the usual mess of Unix config files. I’m definitely a power user but rarely have any need to touch a Unix config file on my macOS systems.


This is also why I ditched Windows, about 15 years ago. But honestly, I get more UI freezes and beach balls today with Monterey than I do with Win 10. Windows has been getting better while macOS has been slipping.


I hold out forlorn hope that someday my modern workstation will achieve the responsiveness and fast boot times of my Commodore 128. Not joking in the least.


I worked with a brilliant engineer who had previously worked at one of the major hard drive manufacturers. He told me about a project he'd worked on where the disk manufacturer had their firmware writers collaborate with low-level OS programmers to significantly speed up modern PC boot times. The conclusion was that yes, they could technically do it, but they found so many peripherals and device drivers that relied on the boot being slow (either intentionally or unintentionally - e.g. hardware or software race conditions, device drivers not actually making sure the device is ready yet, etc.) that there was "no point."


I've seen code that

(a) issued an async read

(b) did some computation

(c) used the buffer filled in by that async read

... without actually seeing if the read completed. Hilarity ensued when the CPU got faster. This was in stuff that shipped to hundreds of thousands of customers.

If you're having a good day, you can definitely address that problem by reading device drivers for a few hours.


It seems like this would be classic Apple territory, to one up their competitors via their unique vertically integrated position.


Yes it is. I've heard from Hackintosh overclocker types that Apple's ACPI[1] tables are the best and most of the PC motherboard manufacturers' are complete garbage. Including "WTF, how does this hardware even function" moments.

[1] https://en.m.wikipedia.org/wiki/ACPI


I assume things are better these days, but back in 2005 the thing that led me to switch to Mac was seeing a friend with a PowerBook which had a sleep mode that actually worked.


I grabbed my (sleeping, lid closed) macbook from my desk this morning. When I pulled the charger to put it into my bag, I started hearing the faint audio of the youtube video I was playing when I closed the lid last night.

Macs today sleep like Windows machines a decade ago


On a tangential note, that reminds me of my mom's former (Asus) laptop - it advertised booting up in 2 seconds, and actually was pretty close! I didn't think much beyond "must be a fast SSD with skipping the bios screen"... until while debugging some storage issues I realized it was an HDD XD

I guess when manufacturers want to do something well they really can integrate everything.


I feel the same way about TV. I miss the ability to quickly flip through channels.

When I was younger and my family had our first internet connection my dad said "One day internet webpages will load like changing channels on the TV - click click click. instant page loads"

Today TV is so much slower and webpages are... well... you know.


I played around with the `dillo` web browser a couple years ago. It doesn't support javascript, but it was oh so amazing to experience it as you describe.


I still love using Dillo or w3m to browse a simpler web at times.


Not sure if you're familiar (it's made the rounds a few times), but the Gemini Protocol (an HTTPS replacement) and the corresponding Gemtext format (basically Markdown, not even HTML really) is a very cool internet space. https://gemini.circumlunar.space/


So the prediction came true - webpages and tv channels load at the same speed.


Monkey's paw curls.


I have a long break over the holiday and am thinking of putting a software project out.

Do you think I should consider making it a priority to code in a fairly low-level programming language (e.g. Rust) without overhead, and count cycles so that most tasks are done within a single screen refresh?

I can't make the rest of users' systems more responsive but I could make my own software as fast and efficient as possible.

The other alternative is Python, which would make it easier to ship large features but would come with a large overhead (full interpreter) since it's an interpreted language.


This is usually the wrong approach. Make sure you don’t do computations that block the UI, avoid accidentally quadratic behavior, and make sure you process UI events promptly. You can do this is pretty much any language, with minor caveats:

Some languages (e.g. Haskell) make it easy to write code that does more computation than intended. If you use one of these languages, make sure you know what you’re doing.

If you use a language with a truly horrible GC, you might experience excessively long pauses. Similarly, if you produce too much garbage, you might have issues.

If you use a language that can’t multithread properly (sigh, Python), moving tasks off thread is a mess.

Otherwise, one can write perfectly responsive software in just about any language.
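As a rough illustration of "don't block the UI thread" in plain C with pthreads (the "UI loop" and the slow job here are simulated; a real app would hand results back through its event loop):

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /* A long-running task that would freeze the UI if run on the
       event-loop thread. */
    static void *slow_work(void *arg) {
        (void)arg;
        sleep(3);                      /* stand-in for network I/O, indexing, ... */
        printf("background work finished\n");
        return NULL;
    }

    int main(void) {
        pthread_t worker;

        /* Kick the slow job off to another thread... */
        pthread_create(&worker, NULL, slow_work, NULL);

        /* ...while the "UI thread" keeps servicing events promptly. */
        for (int tick = 0; tick < 6; tick++) {
            printf("UI still responsive (tick %d)\n", tick);
            usleep(500 * 1000);        /* pretend to handle input and redraw */
        }

        pthread_join(worker, NULL);    /* a real app would signal completion
                                          back to the UI thread instead */
        return 0;
    }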


I'd do it in C, but that's just me. If this is a one-person project you can hold your code to a high standard. Clean standard C11 with all warnings turned on (-Wall -Wextra -Wpedantic -Wconversion). Write unit tests. Run them with Valgrind/sanitizers. Use clang-format. Build with multiple compilers (GCC/Clang/MSVC). ...

EDIT Not many hackers on hackernews apparently.


I’m curious as to the upside of C, is it mostly just a familiarity thing?

Holding your own code to a high standard is great, but wouldn’t it be nicer if you could offload more of that into the tooling and spend more time on the problem or making the code even cleaner?

Also as a side note: I always find it funny that a flag that represents all warnings doesn’t actually turn on all warnings.


For my own projects I use C because I know it, but also because I know and trust the ecosystem. I seem to have developed something of a software-survivalist mentality, so I like to know that I can build my project five years from now without worrying about whether some remotely-hosted dependency isn't there any more. (I'm not claiming that newer, more trendy languages necessarily fail this test - just that I don't know and trust their ecosystems well enough to be sure.)

C also has the advantage that there are many, many different compilers targeting many, many, /many/ different architectures. My own favourite is VBCC, which is lightweight enough that I was able to write my own backend for my own toy CPU project, and even build the entire toolchain - assembler, linker and compiler - under AmigaOS.


I think the remotely hosted dependency problem and ecosystem thing is not that big of a deal all the time, but if the project involves atypical OS and architectures, then C makes sense. I think it’s still possible to do these in other languages, but that’s just another familiarity gap to add to the existing set.

For the remotely hosted dependency thing, I think it’s pretty easy to vendor dependencies in most realistic C contenders, and a really simple litmus test is just yanking your network cable and doing a fresh build.

The ecosystem thing can be a bigger deal, but again it really depends on the problem domain. There are too many high-quality non-C libraries for C to always be a clear winner. I think it's more important to take a step back and make sure whatever language you pick is well suited to the task, rather than assuming any individual one will always be. Knowing multiple languages is handy for this, since that's more ecosystems you can pick from, rather than tying yourself to a single one.


> The ecosystem thing can be a bigger deal, but again it really depends on the problem domain.

Yes, absolutely - and of course C isn't immune to ecosystem problems, either. I remember the pain of working with GNU autotools back in the mid 2000s - in fact it's probably that experience (plus trying to use bleeding-edge tools written in Python!) that left me so cautious about external dependencies today.


Out of curiosity, when you considered the compilers to use for your CPU project, did you look at pcc? And if so, how would you say vbcc compares, in terms of ease of porting to a new arch?


I looked at a number of different options (http://retroramblings.net/?p=1277) but didn't spend a great deal of time looking at pcc. I was amazed to discover how many different options there were, actually!

VBCC's backend interface is well documented, which helps a lot - and there's a skeleton "generic RISC" backend which is trivial to copy and use as a starting point - I found it very useful to be able to tweak a working backend and observe how the generated code changes, while I was getting a feel for how it all hangs together.

VBCC does have an unusual license, however - commercial usage requires permission from the author.


I found the opposite in a very real-world analysis of my hobby project to build a digital dashboard for my DeLorean. I spent 5 years making very slow progress with C++, and then switched to perl, started mostly from scratch, and finished in a year. I gave a talk about this exact topic at YAPC 2014: https://youtu.be/SERH3_gZOTo?t=1018 The CPU usage on the embedded PC went from 15% to 40%, but in the grand scheme I'd rather have it finished and pay a little more for the hardware.


> Do you think I should consider making it a priority to code in a fairly low-level programming language (e.g. Rust)

Maybe. Premature optimization is said to be the root of all evil...

But at the same time - the assumption that everything will be easier and faster in Python rather than Rust or C++ is often invalid. Sure, for smaller scripts it almost always is like that, but once your app grows, this may stop being the case.

Start with a language that's convenient for you and with which you can release an initial version. Then get an understanding how it behaves in terms of performance, and draw your conclusions.

> as fast and efficient as possible.

Responsiveness is not the same as speed or efficiency. Of course it's important to be fast and efficient, but it is even more important to not just start crunching numbers and ignore the user and the rest of the system.

Do your hard lifting asynchronously and have a thread attending to user input and your UI (or even different threads for these two tasks). And this is easier said than done!

Also remember you'll have to try and work around delays and slowdowns due to other apps and the (non-realtime) OS. Specifically, you might have to play with thread scheduling and I/O priority (although - that's usually the user's rather than the app's job).

Additional notes:

* Also consider C++; it has some advantages and disadvantages relative to Rust (which I obviously will not get into), but it has seen a whole lot of progress in recent years, in particular w.r.t. the ease of doing many things which used to be painful.

* If you think of Rust as low-level, then your head must be in the clouds... :-P


> Maybe. Premature optimization is said to be the root of all evil...

The full quote, because it always gets butchered to "premature optimization is the root of all evil":

Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.


I think this quote is pretty funny in hindsight. Since it was said, we've had an entire generation of OOP muppets, and now a second one of modern web developers going that way, that add layers of pointless abstraction and complete bullshit on top of everything they write, just for fun.

In the 70s you might've needed a reason like optimisation to commit evil. These days apathy and cargo culting do it perfectly fine without optimisation even coming into the picture.


Cycle-counting is useless on modern workstations, since a different processor is used in each one. I don't see how the choice of language is significant either, as long as it meets your goals for responsiveness. A better goal would be to limit latency where possible (measuring time from click to completed action is easy with a framework), or to communicate to the user when an action will take a noticeable amount of time.
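Measuring that click-to-done latency really is cheap; a minimal sketch (the handler is a placeholder for whatever the action actually does):

    #include <stdio.h>
    #include <time.h>

    /* Placeholder for the user-visible action being measured. */
    static void handle_click(void) {
        struct timespec work = {0, 20 * 1000 * 1000};  /* pretend it takes ~20 ms */
        nanosleep(&work, NULL);
    }

    int main(void) {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        handle_click();
        clock_gettime(CLOCK_MONOTONIC, &end);

        double ms = (end.tv_sec - start.tv_sec) * 1000.0 +
                    (end.tv_nsec - start.tv_nsec) / 1e6;
        printf("click-to-done latency: %.2f ms\n", ms);   /* flag anything over a frame */
        return 0;
    }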


Agreed. Optimizing the two choke points at the algorithm level recovers the loss from choosing an otherwise better language/environment for productivity.

Sometimes not, though - startup time, for example - but there aren't many use cases where this counts; e.g. 1Password starts slower with each release.


Can still be done if all you want is a command-line to come up and be responsive.

Most folks want web, email, images, and even a bit of security from their computer today. Could always be faster, but I think the days of "bam!" ready are past due to those requirements. Unless the computer is only sleeping, like an iphone for example.


Those things don't preclude responsiveness though. Modern software is sitting on top of mountains of inefficiencies and legacy baggage. This was found just months ago https://arstechnica.com/gadgets/2022/09/20-year-old-linux-wo... "A bit of security" is also quite apt, since there's huge room for improvement there too. I wish I could find a way to get paid to improve all this. Currently I'm only doing it for mobile apps.


> Most folks want web, email, images, and even a bit of security from their computer today. Could always be faster, but I think the days of "bam!" ready are past due to those requirements. Unless the computer is only sleeping, like an iphone for example.

Sure, but a modern computer also has orders of magnitude more processing power than devices that only ran a command line. There is no reason that we can't have both, except incompetence and/or bad economic incentives.


While I agree that incompetence and bad incentives are ever-present, and that all software can be optimized by application of a little elbow grease, I disagree that these are the only reasons.

For example, properly displaying text now requires having a copy of at least the most important parts of the Unicode standard in memory. Things like knowing whether a character occupies one, two, or many character cells are very important for even a simple text terminal. Word splitting, kerning, shaping, bi-directional text display - the number of possible refinements grows without bound, and all of them need metadata about each and every character. Your average web browser has megabytes of the stuff just sitting around in memory so that it is ready as soon as a character has to go up on the screen. Older computers just didn't have that kind of memory to spare.

If you want the old-school experience you can still boot straight into the Linux console, which still thinks that there are only 256 characters. It seems to be reasonably snappy.


You’ll get the exact same snappiness with xterm running on xserver (with no desktop or WM), despite all the Unicode and font rendering support.


Folks will pay for simple and ready immediately, or complicated, multi-user, and rich in a minute. (Plus significant backwards compat.) There is no big need for complicated and ready now. Pulling in the long tail of functionality often desired takes time, and it is mostly IO-bound not CPU bound, which hasn't increased as fast.

I suppose you could put a ton of OS and libraries in a (modern equiv of EEP)ROM for immediate access. The original Macintosh was kinda like that. But things went the other way when disk storage dropped in price a lot faster than chip storage.

Maybe we're back to a point where the former is economically viable, but there would be a lot of historic baggage to overcome.


web, email, images and a bit of security are not the reason windows UI is laggy as fuck


I agree. Around 2008, when I got my first SSD installed, Windows 7 and PCLinuxOS 2007 would open apps like Word instantly. You'd click and it'd be there within 100ms.

Now, with Windows 10 on a much faster PC, apps like Firefox, the Start menu, and Word take seconds to load.

I recently fired up my Windows 7 VM to mod an old game console and it was exactly how I remembered. You click something and it's instantly opened.


Thankfully, you are welcome to use something else that puts your needs before that of big brother(s).


The average user doesn't even boot the computer. It's just always on and wakes pretty much instantly.


Not a C64 user, but I did use Apples, Ataris, and IBMs. Booting to BASIC or a cartridge was fast. Booting to a floppy not as much.

Reading a tape? That took several minutes.


The boot was fast on the c64. Essentially instant.

It just dumped you into a prompt though, and loading any program was really quite painfully slow.


It could be just as fast if the bios on a PC did absolutely nothing other than initialize the motherboard and a small basic interpreter.

A modern bios goes through more cycles during the boot phase of a typical machine than a C64 would see in its entire lifetime.


On early IBM PCs (up until around the introduction of the PS/2 anyway) there was a key sequence to boot direct to BASICA, and it was always about as fast as the 8bits were - except that most of the 8bits didn't have the slow memory check that IBM insisted on.


It will be faster actually.


Every part of a modern computer system introduces unavoidable latency: the monitor adds ~50ms, a Bluetooth keyboard ~70ms. Even the desktop compositor adds latency (~10ms). Then there is more latency in the application layer, typically from slow text rendering stacks like GDI or Cairo (~50ms). It's basically impossible to escape these in a modern system these days.


Tangent: Moving the mouse made Windows 95 faster

https://www.pcgamer.com/this-theory-suggests-wiggling-the-mo...


So it wasn't just my imagination that installers became faster when doing that...


At 40, this behavior is still ingrained in me and influences how I use a computer to this day.


Back in the early 00s my cousin had a modem that would only transfer data if either the mouse or the keyboard was in use.


Wow, I've never heard anyone else describe that.. but I had the same on Win95 (iirc) after messing about with the COM port setup too much. You could try to load a website and see there was no transmission going on whatsoever.. but then if you grabbed the mouse and moved it around you would instantly see data transfer happening again and the page would start to load.


In the follow-up blog post (https://randomascii.wordpress.com/2017/07/27/what-is-windows...) he uses a sampler to try to find the bottleneck by looking at CPU instructions, but concludes:

> But the main thing I always realize when using this technique is that modern CPUs are weird and confusing. Because CPUs are massively out-of-order and super-scalar it is not at all clear what it means for a sampling interrupt to hit “on” a particular instruction. If an instruction is particularly expensive then samples are more likely to hit “near there” but I’m not sure where “near there” actually is:

> If there are three instructions executing simultaneously when the interrupt fires then which one “gets the blame”?

> If a load instruction misses in the cache and forces the CPU to wait then will the samples show up on the load instruction, or on the first use of the data? Both seem to happen.

> If a branch is mispredicted then will the samples show up on the branch instruction or on the branch target?

> What’s going on with the expensive cmp instruction on line 24 of the spreadsheet?

> If anyone has a good model for what happens to the CPU pipelines when a sampling interrupt happens I would appreciate that. Ideally that would explain the relationship between clusters of samples and expensive instructions.

I have a distinct memory of reading a blog post probably 5-ish years ago where someone did just that. The author started with some microbenchmarks of very tight loops, and used some sort of profiler/perf counter tool that measured hit counts for each instruction of the loop. Then, they went into a deep dive into analyzing the instruction throughputs and latencies and dependency chains to demonstrate how bottlenecks at the CPU level manifested as clusters of samples, and how to use this information to optimize the loop.

Does anybody else remember this post, and possibly where I can find it? I’ve been in a couple situations where it would have been tremendously helpful, but I just haven’t been able to dig it up.
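
Not sure which post that was, but the core technique is easy to reproduce with Linux perf: write a tight loop, record cycle samples, and look at which instructions they cluster on. A minimal sketch (the loop and the commands are illustrative, not from the post being described):

    /* tightloop.c -- toy kernel for per-instruction sampling.
     * Build:    gcc -O2 -g -o tightloop tightloop.c
     * Record:   perf record ./tightloop     (samples cycles by default)
     * Inspect:  perf annotate               (per-instruction sample counts)
     */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define N (1 << 20)

    int main(void)
    {
        uint64_t *a = malloc(N * sizeof *a);
        if (!a) return 1;
        for (size_t i = 0; i < N; i++)
            a[i] = i * 2654435761u;              /* arbitrary fill */

        /* Serial dependency chain: every iteration needs the previous
         * sum, so samples tend to pile up around the load and the xor. */
        uint64_t sum = 0;
        for (int rep = 0; rep < 1000; rep++)
            for (size_t i = 0; i < N; i++)
                sum += a[i] ^ sum;

        printf("%llu\n", (unsigned long long)sum);
        free(a);
        return 0;
    }

Where the samples land relative to the real bottleneck (the load vs. the dependent xor/add) is exactly the ambiguity the quoted post is talking about.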



That’s it, thank you so much!


Sounds like the work of Agner Fog.


Ctrl-F in a 10MB PDF in Adobe DC lags when typing while (I guess) it builds the search suggestion index. On an i9 11th gen with over 20000 CPUmark, and 64 GB of RAM. Madness.


The only reason I would use Adobe over SumatraPDF is the need to open a PDF that uses JavaScript, has 3D content inside, or has to be digitally signed.

SumatraPDF is probably a lot faster because it can't do all of those fancy things, but for reading a document and looking for some keywords it's absolutely fantastic, and I don't need anything better 99% of the time.


Moving the mouse is the worst-case scenario for a deep multicore processor because (1) it is a single-threaded, latency-sensitive task, and (2) there are 23 other cores that can grab a lock and prevent the 1 core that matters from doing its job in a timely manner.


With regard to latency sensitivity, with so many cores available, a single core could be dedicated to processing mouse input. Of course, if a lock it needs is held elsewhere, that's still a problem.
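
For what it's worth, you can approximate the "dedicated input core" idea from user space today. A minimal Win32 sketch (the choice of core 0 and the priority level are arbitrary assumptions, and it only helps with scheduling, not with contended locks):

    /* Pin an input-handling thread to one core and raise its priority.
     * This keeps the scheduler from starving it of CPU time; it does
     * nothing about another thread holding a lock the input path needs. */
    #include <windows.h>

    static DWORD WINAPI InputThread(LPVOID arg)
    {
        (void)arg;
        /* Run this thread only on logical core 0 (affinity mask bit 0). */
        SetThreadAffinityMask(GetCurrentThread(), 1);
        /* Highest thread priority a normal-priority process gets. */
        SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);

        for (;;) {
            /* ...wait for and process input events here... */
            Sleep(1);
        }
    }

    int main(void)
    {
        HANDLE h = CreateThread(NULL, 0, InputThread, NULL, 0, NULL);
        WaitForSingleObject(h, INFINITE);
        return 0;
    }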


For a modern system like Wayland or DWM, the display compositor gets rectangles of pixels from the applications and copies those into a big rectangle that it sends to the rasterizer. Just before it does that, it draws the mouse cursor on top. It also has to communicate with the applications about what image to draw, so you have to deal with multiple threads no matter what.

(Unless you go back to the late 8-bit era, where the cursor might be just a hardware sprite that can be moved around by writing a few bytes.)
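
As a rough mental model of that flow, here is a toy CPU-side sketch (no clipping, damage tracking, or alpha blending; real compositors do this on the GPU and usually hand the cursor to a hardware overlay plane):

    /* Toy model of one compositor frame: blit each client surface into
     * the framebuffer, then stamp the cursor on top. */
    #include <stdint.h>
    #include <string.h>

    typedef struct { int x, y, w, h; const uint32_t *pixels; } Surface;

    static void blit(uint32_t *fb, int fb_w, const Surface *s)
    {
        for (int row = 0; row < s->h; row++)
            memcpy(fb + (size_t)(s->y + row) * fb_w + s->x,
                   s->pixels + (size_t)row * s->w,
                   (size_t)s->w * sizeof(uint32_t));
    }

    void compose_frame(uint32_t *fb, int fb_w,
                       const Surface *clients, int nclients,
                       const Surface *cursor)
    {
        for (int i = 0; i < nclients; i++)   /* client windows first */
            blit(fb, fb_w, &clients[i]);
        blit(fb, fb_w, cursor);              /* cursor last, on top */
        /* ...then hand `fb` to the scanout hardware... */
    }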


I believe the mouse cursor is still often a hardware sprite today.


Almost 100% guaranteed, but it is still possible for the OS to be delayed in sending it the coordinates. Of course that means that the OS is badly written.


By these standards, the only well-written operating systems I've ever seen are BeOS and QNX/Photon. Probably iOS (at least, older versions of it) deserves to be on the list, despite mouse use being atypical.

Which... yeah, that actually might be true.


iOS is the odd one out, because user input has first priority. IIRC, it cannot even do network I/O if you are holding your finger on the screen; all interrupts except those from the touchscreen are disabled for the duration.


I can say for sure this is true on sway (Wayland), because while messing around to see if the proprietary NVIDIA driver finally works yet (when will I learn... for anyone curious, NVIDIA is still NVIDIA), one of the things I had to do was manually disable hardware cursors.


Sure, no problem, negotiate with the application, but also act as a responsible shepherd of the desktop: if some timeout is exceeded, draw the "application is busy" cursor.

This already happens, just after some laughably long chain of unprocessed input events.


Or if it needs code that’s paged out or all cores are thermal limited or the GPU hung. CPU time usually isn’t the limiting factor.


So the 23 cores are there only to starve the 24th? Sounds almost human.


Where are the mouse coprocessor equipped computers?


Hardware sprites (and presumably interrupts) made the Amiga mouse pointer move fluidly and responsively even when the CPU was busy. No complex graphics pipelines or laggy LCD screens to add further latency back then either.


Indeed - it's no deep magic, "all" you have to do is update the sprite's position from within the very same interrupt which reads the mouse counters. Oh, and make sure that nothing involved in that process can be paged out, ever. (The Amiga made sure of that by not supporting virtual memory!)


I do miss aiming a particle accelerator at my brain as an HCI method.


It was just a form of 2D acceleration.

We even see it now, sometimes called "hardware cursor" in various game settings, although the pipeline is much longer.

It's just that old, small hardware had little to no memory protection and very tight integration, so stuff like that could be done directly instead of going through many layers of abstraction.


Not sure if you missed the point or I'm missing yours, but the post you replied to points out that CRT displays sound insane in a certain light...


You can do it with very little. Prime example: https://www.folklore.org/StoryView.py?project=Macintosh&stor... drawing a mouse pointer on an almost stock Apple II, flicker-free because drawing is synced to the vertical blank interval, even though the hardware cannot see when that happens (that part I don’t quite understand; I can see them detecting ‘end of screen’ on an almost blank screen and programming the timer to generate a periodic interrupt at about the screen redraw frequency, but wouldn’t the VBL and the 6522 interrupts drift away from each other over time?)


Mouse cursors are mostly handled in hardware. GPUs composite the cursor during scanout, so all the OS has to do is calculate the new coordinates of the cursor and tell the GPU about them.
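
On Linux with KMS this is exposed almost directly: the compositor uploads a small cursor image once, then just tells the CRTC where to put it whenever the mouse moves. A rough sketch using the legacy libdrm cursor calls (buffer setup and error handling omitted; `fd`, `crtc_id` and the cursor buffer are assumed to already exist):

    /* Hardware cursor via the legacy KMS cursor API (libdrm). */
    #include <stdint.h>
    #include <xf86drm.h>
    #include <xf86drmMode.h>

    void show_cursor(int fd, uint32_t crtc_id, uint32_t cursor_bo)
    {
        /* Attach a 64x64 ARGB cursor image to the CRTC's cursor plane once. */
        drmModeSetCursor(fd, crtc_id, cursor_bo, 64, 64);
    }

    void on_mouse_moved(int fd, uint32_t crtc_id, int x, int y)
    {
        /* Per-movement cost: one tiny ioctl, no recomposition of the frame. */
        drmModeMoveCursor(fd, crtc_id, x, y);
    }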


> Mouse cursors are mostly handled in hardware. GPUs composite the cursor during scanout, so all the OS has to do is calculate the new coordinates of the cursor and tell the GPU about them.

Is this true on modern Linux DEs (e.g. on KDE Plasma)?

Is it also true on Windows and macOS?


I don't know about other platforms, but in Windows, it had been true >20 years ago, before the desktop was even composited. Some fullscreen games also relied on this functionality, and sometimes you had an option in game settings to switch between hardware-accelerated cursor and manually drawn one (because the former could be buggy with some video drivers).


This is fairly basic functionality. Windows and wlroots-based compositors have it at least. I'm reasonably certain that other major compositors have it too.


Did you check?

I tell my QA people all the time: If you come to me with "I think" or "I believe", it sounds like you've got some reading to do.


I didn't write "I think" or "I believe".

If you want to know my epistemic status: I know about windows based on observable surface behavior and bugs related to accelerated cursors, but I haven't looked at the source. I know about wlroots because I saw the pull requests related to that and a flag to disable it in sway. The last statement regarding other platforms was an educated guess based on gfx card history: accelerated overlays are an ancient feature present in a lot of hardware, not some newfangled niche feature.

And I'm not one of your QA people.


You can actually test this pretty easily by monitoring GPU usage. When nothing on the screen changes, GPU usage should be pretty much zero, even at the lowest P-state (because the compositor isn't rendering new frames unless something changed). You can move the cursor around and GPU usage will stay at the same level, but do something that actually requires new frames to be rendered, like dragging a selection rectangle on the desktop, and you'll see GPU usage go up a bit. (And if you do the same in applications using GPU-accelerated frameworks you'll often see the GPU rev up a few P-states; for example, scrolling in the Steam library list is enough to cause a transition to P2 for me, and likewise many websites kick it up to P2 when you move the mouse or scroll.)

Another way to see that the cursor is not using the regular rendering pipeline is to move a window around: you'll invariably see that the window lags behind the cursor by 1-3 frames. It also becomes noticeable when the graphics stack breaks down but the cursor still works.


Just an FYI-- I tried to get ChatGPT to generate a question as condescending as what you just responded to. It now generally refuses to write anything impolite.

The most I could get was to ask it to take an avuncular tone, at which point it did ask you to "spill the beans, kiddo" about these compositors. :)


Optical mice themselves can have quite a complicated processor onboard. Luckily the image processing isn't done on the CPU/GPU.

I wonder if anyone ever sold a "software optical mouse".


> I wonder if anyone ever sold a "software optical mouse".

You could if you really want to...

https://8051enthusiast.github.io/2020/04/14/003-Stream_Video...



Discussed at the time:

24-core CPU and I can’t move my mouse - https://news.ycombinator.com/item?id=14733829 - July 2017 (499 comments)


From reading it over, the root cause was in a massively multiprocess application. In general, this is the usual problem of "on Windows, use threads, not processes, for ephemeral parallel tasks", right?


There are also just some workloads that will saturate things. If you hit the mmap infrastructure hard enough on Windows you can saturate the kernel and your mouse cursor will stop moving; I managed to do it with a custom key/value store written in C# once I spun up enough threads hitting mapped files at once.

I would assume that on win10/11 (this was a while ago) it is harder to saturate a modern machine, though, especially if you lower the i/o priority of your process.


Ask me about how my couple-generations-ago ipad pro (just before the M1) sometimes randomly freezes when I try to scroll a pdf...


Sounds like Windows needs this 200-line kernel patch that sped things up for Linux in 2010 [1]

"The patch being talked about is designed to automatically create task groups per TTY in an effort to improve the desktop interactivity under system strain."

"Tests done by Mike show the maximum latency dropping by over ten times and the average latency of the desktop by about 60 times."

[1] https://www.phoronix.com/review/linux_2637_video


I fell off my chair when I read the headline. I get that software runs insanely fast on modern hardware and that often, the naive solution works with acceptable performance. But boy -- can we not set the bar as low as "barely responsive"? If your thing doesn't run on a Raspberry Pi, it's probably garbage.


The trouble is: no matter how much fast RAM you have, what a beast your GPU is, or how many threads your CPU has, the UX is as hangy as the laggiest component.

That means if the Windows programmers decided it needs to wait for every HDD to spin up before showing you the contents of your Start menu, you're gonna have a bad time.


If I recall, this is mainly caused by the shift to USB as the connector du jour. Since it's a poll-centric protocol, system responsiveness decays under heavy load.

Back in the days of PS/2, where input was interrupt-driven, background tasks would be interrupted by input coming in, ensuring the user could always move the mouse around. Whether the outcome of a click would register promptly is a different story: the click itself would, but there'd be no guarantee the click handler would execute before background processing resumed, because it'd be treated as an event scheduled after whatever is currently starving the CPU.


Sure, on the wire USB is a polling protocol. But the polling is done by the host controller hardware, which will raise an interrupt just like PS/2 whenever the device responds to the polling as having data available.


The polling by the host controller is still limited to happening once every bInterval period as specified by the device, which for a mouse or keyboard will be at least a few milliseconds, vs. old-school PS/2 mice which would send hardware interrupts straight to the processor.
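
Concretely, the polling rate comes from the device's interrupt endpoint descriptor. A sketch of what a hypothetical full-speed mouse advertises (field values are illustrative):

    /* Interrupt IN endpoint descriptor for a hypothetical full-speed mouse.
     * For low/full-speed devices, bInterval is the polling period in ms;
     * for high-speed devices the period is 2^(bInterval-1) microframes
     * of 125 us each. */
    #include <stdint.h>

    struct usb_endpoint_descriptor {
        uint8_t  bLength;          /* 7 */
        uint8_t  bDescriptorType;  /* 0x05 = ENDPOINT */
        uint8_t  bEndpointAddress; /* 0x81 = endpoint 1, IN */
        uint8_t  bmAttributes;     /* 0x03 = interrupt transfer */
        uint16_t wMaxPacketSize;   /* e.g. 8-byte HID reports */
        uint8_t  bInterval;        /* 8 -> host controller polls every 8 ms */
    } __attribute__((packed));

    static const struct usb_endpoint_descriptor mouse_int_ep = {
        .bLength = 7, .bDescriptorType = 0x05, .bEndpointAddress = 0x81,
        .bmAttributes = 0x03, .wMaxPacketSize = 8, .bInterval = 8,
    };

This is also why "1000 Hz" gaming mice exist: they just advertise bInterval = 1 so the host polls every millisecond.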


Yes, but that sort of 'delay' will be there all the time, with no relation to the system being loaded (like the top commenter is claiming).


As I understand it, the polling is done by the USB controller, not the operating system or anything happening in the system's CPU.


The article is about a bug in process destruction, if I'm not mistaken. The mouse bit is a bit of an aside. More of an obvious symptom that something is wrong.

Not that interrupt-based input processing didn't have some advantages; it's just that this shouldn't be one of them.


On the MCU (device) side, you can command the poll in a data-received interrupt. Does it not work that way on the PC side? Ie, you never have to really block on the MCU.


It's the same for the host controller hardware, it exposes different interrupts which the host controller driver can make use of so you don't have to busy-wait. The details depend on the host controller interface the hardware provides though.


Hyperthreads aren't real CPU cores; saying that your 48 hyperthreads were "only" 50% utilized means all the real cores under the hood were fully utilized.


> Hyperthreads aren't real CPU cores

I can't think of a good principled reason to say SMT cores aren't "real".

I'm assuming the answer isn't "because they share some computing stuff". What I think you'd call "real" cores also share resources like L2/L3 caches, sometimes DMA engines, etc.

And IIRC, each Intel SMT (hyper-thread) unit has its own instruction pointer and (non-SIMD?) register set.


> And IIRC, each Intel SMT (hyper-thread) unit has its own instruction pointer and (non-SIMD?) register set.

I believe the SMT siblings share the register rename storage, which I'd say is the register set more than the 'architectural registers' are.

But I'd say the reason they're not real is that they don't increase the maximum instructions per clock. With many loads they do increase the average instructions per clock, but they don't let you do any more work if you've got a fully tuned load that uses all the computing resources.


In reality you never have a finely tuned load that uses all the resources all of the time. Processors spend a lot of time waiting for memory, so hyperthreading allows for better utilization of compute. It’s rare to have a workload that benefits from turning it off, and in those cases it’s usually because the hyperthread is hurting the cache hit rate enough to offset the gains.


I recently checked whether an ugly hack I implemented years ago for 30% performance gains ("halve the default OpenMP thread count at start of computation and restore it afterwards" -> effectively, disable hyperthreading) was still necessary. It's now apparently a 3x performance gain... All because I'm saturating memory bandwidth. I don't know whether to be happy or sad...
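
For anyone curious, the hack really is tiny. A sketch of the kind of thing described, assuming the OpenMP default thread count equals the logical (hyperthreaded) CPU count:

    /* Run a memory-bound OpenMP region on half the logical CPUs
     * (roughly one thread per physical core), then restore the default.
     * Build with e.g. gcc -O2 -fopenmp. */
    #include <omp.h>

    void bandwidth_bound_kernel(double *dst, const double *src, long n)
    {
        int default_threads = omp_get_max_threads();
        omp_set_num_threads(default_threads > 1 ? default_threads / 2 : 1);

        #pragma omp parallel for
        for (long i = 0; i < n; i++)
            dst[i] = 2.0 * src[i];             /* stand-in for the real work */

        omp_set_num_threads(default_threads);  /* put things back */
    }

(A num_threads(default_threads / 2) clause on the pragma would do the same thing without the global set/restore.)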


Ouch, looks like they’re clobbering each other’s cache. Also, well done. It looks like you optimized the hell out of your code.


Both threads are waiting on the same memory pipeline. The actual performance has always lagged behind the theory.


If one thread has to hit main memory but the other one can get what it needs from L1 cache they aren't competing for the same resources.


But not the same memory operations. Unless your code fully saturates the memory bandwidth, which is rare, you get some gains here.


It is almost never about saturating the memory bandwidth, but about waiting to load instructions and data from memory. That wait time, counted in CPU cycles, is huge.


Yes, that’s precisely why hyperthreading is such a good deal.


“Such” a good deal is 2-30% in benchmarks (and they don’t say if that’s with cache leak protections turned on or off). In previous generations it was more like -20% to +20%. If one thread is having issues with L1 and L2 cache, splitting that evenly with a completely different workload isn’t going to help.

If cache contention weren’t a problem, and it was just a matter of jumping into previously unseen instructions and data (cold cache), you’d expect to see 50-300% numbers from hyperthreading, precisely because of how long the stalls are.


Many years ago a customer on a single-core computer had problems with heavy video stutter in my product. It was multithreaded and ran, as I recall, 5 threads (GUI, video decoding, 3D pipeline, device control and computation). Others did not have this problem. After investigation it turned out that said customer had hyperthreading disabled in the BIOS. Enabling it fixed the problem instantly.

So, "real" or not, in my experience HT does work to your benefit.


I don’t know how many hardware revisions Intel went through where every benchmark said hyperthreading was slower than turning it off, but it was a lot. It became Lucy’s football at some point.

And in these days of post-Dennard scaling, you have thermal throttling, so idle cycles aren’t actually idle; they’re allowing the heat sink to catch up with heat production.


I agree with this stance. It's the same as branch prediction, out of order execution, or anything else about computing efficiently.


> I can't think of a good principled reason to say SMT cores aren't "real".

It really depends on your definition of "real". Yes, you can treat them like "real" independent cores, but that's not ideal for performance because under the hood they're not actually independent. Your operating system is aware of this and will often avoid scheduling two tasks onto the same physical core unless it has to. If you have 20 logical cores (10 physical ones with hyperthreading) and 10 tasks to execute, the OS scheduler will usually allocate one task to each physical core and leave the other logical core idle.


I did an experiment and effectively proved this to myself many years ago.

I had just upgraded to an i7-3770K (8 threads, 4 cores) from a Core 2 Quad (4 threads, 4 cores). I did a POV-Ray render several times using 1, 2, 4, and 8 threads. 2 was nearly double the speed of 1, 4 was nearly double the speed of 2, but 8 was only about 15% faster than 4.

To ensure I wasn't bottlenecking RAM at that level, I tried again with 2 threads, but forced both threads onto a single core, and it was only 15% faster than 1 thread.

That was all the proof I needed of how you really can't treat two CPU threads as two cores.

That said, I've never personally found an instance where allowing a process to span all CPU threads actually reduced performance, and I'm not sure I've ever seen a real-world case where it does. It's usually something contrived.
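
If anyone wants to repeat the "two threads forced onto one physical core" part of that experiment on Windows, a sketch (the assumption that logical CPUs 0 and 1 are SMT siblings of the same core holds on most Intel machines, but check the topology first):

    /* Restrict the process to logical CPUs 0 and 1 before spawning the
     * two worker threads. On most Intel machines those are the two
     * hyperthreads of the same physical core, but that mapping is not
     * guaranteed -- verify it against the reported CPU topology. */
    #include <windows.h>

    int main(void)
    {
        SetProcessAffinityMask(GetCurrentProcess(), 0x3); /* CPUs 0 and 1 */
        /* ...now run the 2-thread render/benchmark as usual... */
        return 0;
    }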


That's one use case. SMT is not designed to address it. POV-Ray will saturate your execution resources in the core and adding another frontend doesn't change the facts.

SMT is designed to hide memory load latency. In this use case it is completely brilliant. You will get 2x speedups with hyperthreading when randomly accessing memory.


> You will get 2x speedups with hyperthreading when randomly accessing memory.

...unless you access so much memory (while doing relatively little compute on each piece) that hyperthreading just causes more cache invalidation.


When I bought my current desktop the two CPUs I was looking at were the i7-9700K and i9-9900K. This was the generation where intel went from 6 cores with HT to 8 cores w/o HT for the i7. The i9 has 8 cores with HT.

I liked the idea of the i7 because without HT you have a real picture of how utilized your CPU is. If it says 100% on a core it's 100%. I ended up going with the i9 though when a promotion was too good to pass up.

When I got it I ran Cinebench both with and without HT enabled in the BIOS and it made a decent impact so I left it on.


An underappreciated consequence of this is that if doubling the number of cores doubles performance (meaning the multi-threading is good), but going on to the hyperthreads only gives a 15% speedup, then in the hyperthreaded case each hyperthread must be running at 57.5% of normal speed (the two threads together deliver 1.15x the work of one, so each averages 1.15 / 2 = 0.575x) - barely more than half as fast. That's just math.

This slowdown of each thread may not matter, or it may lead to increased latency.

TL;DR - when both hyperthreads in a core are in use it is highly likely that each one is running significantly slower than if only one was in use.


On AMD, yes, but Intel Hyperthreading is not real SMT.


Nope, it all depends on the type of load.

HT is an optimization over context switching, but if the code running on both utilizes different parts of the core (say one loads some memory while the other does some math), you can get a speedup beyond just the savings from not having to context switch.

But early on it wasn't really that well optimized, so you might've had the OS put 2 threads on the two hyperthreads of one core while another core sat idle.


4 wheel drive is also not real, because under the hood they share the same engine.


Only on those ancient internal combustion computers.


Most of my freezes on Ubuntu are the complete grind-to-a-halt kind, and they happen because I’ve run out of memory. Then I buy more memory.


On the other hand, I just installed ChromeOS Flex on a 2011 HP laptop with 4 GB of memory and am in shock at how responsive it is.


Thermal throttling, dirty fans, poor process isolation and scheduling, and so forth.


Why don’t computer architectures have a separate CPU dedicated to the OS/GUI?


This will still not help when the UI CPU is busy-waiting for the application CPU to generate and send the contents of the menu after you click.

And if you can coordinate separate CPUs asynchronously, you can do the same with separate threads on one CPU.


A 24-core CPU and the guy is still on a wordpress.com domain?


Windows?



