More

pixelesque · 2026-04-18T21:30:05 1776547805

> Even the latest CPUs have a 2:1 fp64:fp32 performance ratio

Not completely - for basic operations (and ignoring byte size for things like cache hit ratios and memory bandwidth) if you look at (say Agner Fog's optimisation PDFs of instruction latency) the basic SSE/AVX latency for basic add/sub/mult/div (yes, even divides these days), the latency between float and double is almost always the same on the most recent AMD/Intel CPUs (and normally execution ports can do both now).

Where it differs is gather/scatter and some shuffle instructions (larger size to work on), and maths routines like transcendentals - sqrt(), sin(), etc, where the backing algorithms (whether on the processor in some cases or in libm or equivalent) obviously have to do more work (often more iterations of refinement) to calculate the value to greater precision for f64.

omoikane · 2026-04-18T23:50:00 1776556200

> the latency between float and double is almost always the same on the most recent AMD/Intel CPUs

If you are developing for ARM, some systems have hardware support for FP32 but use software emulation for FP64, with noticeable performance difference.

https://gcc.godbolt.org/z/7155YKTrK

kimixa · 2026-04-18T23:09:27 1776553767

> ... if you look at (say Agner Fog's optimisation PDFs of instruction latency) ...

That.... doesn't seem true? At least for most architectures I looked at?

While true the latency for ADDPS and ADDPD are the same latency, using the zen4 example at least, the double variant only calculates 4 fp64 values compared to the single-precision's 8 fp32. Which was my point? If each double precision instruction processes a smaller number of inputs, it needs to be lower latency to keep the same operation rate.

And DIV also has a significntly lower throughput for fp32 vs fp64 on zen4, 5clk/op vs 3, while also processing half the values?

Sure, if you're doing scalar fp32/fp64 instructions it's not much of a difference (though DIV still has a lower throughput) - but then you're already leaving significant peak flops on the table I'm not sure it's a particularly useful comparison. It's just the truism of "if you're not performance limited you don't need to think about performance" - which has always been the case.

So yes, they do at least have a 2:1 difference in throughput on zen4 - even higher for DIV.

pixelesque · 2026-04-19T10:19:05 1776593945

Well, maybe not all admittedly, and I didn't look at AVX2/512, but it looks like `_mm_div_ps` and `_mm_div_pd` are identical for divide, at the 4-wide level for the basics.

Obviously, the wider you go, the more constrained you are on infrastructure and how many ports there are.

My point was more it's very often the expensive transcendentals where the performance difference is felt between f32 and f64.

adgjlsfhk1 · 2026-04-19T00:01:22 1776556882

This depends largely on your operations. There is lots of performance critical code that doesn't vectorize smoothly, and for those operations, 64 bit is just as fast.

kimixa · 2026-04-19T02:15:54 1776564954

Yes, if you're not FP ALU limited (which is likely the case if not vectorized), or data cache/bandwidth/thermally limited from the increased cost of fp64, then it doesn't matter - but as I said that's true for every performance aspect that "doesn't matter".

That doesn't mean that there are no situations where it does matter today - which is what I feel is implied by calling it "Ancient".

pixelesque · 2026-04-12T10:12:02 1775988722

No it doesn't - it allows a single writer and concurrent READs at the same time.

InfraScaler · 2026-04-12T10:20:08 1775989208

Thanks! even I run a sqlite in "production" (is it production if you have no visitors?) and WAL mode is enabled, but I had to work around concurrent writes, so I was really confused. I may have misunderstood the comments.

yomismoaqui · 2026-04-12T10:45:45 1775990745

Writes are super fast in SQLite even if they are not concurrent.

If you were seeing errors due to concurrent writes you must adjust BUSY_TIMEOUT

InfraScaler · 2026-04-12T14:36:00 1776004560

Thanks I'll have a look. For now I just had a sane retry strategy. Not that I have any traffic, mind you :-)))

pixelesque · 2026-04-11T15:11:57 1775920317

One reason is Low Probability of Intercept radars (and transmitters / datalinks) do exist, and are very difficult (but not impossible) to identify and locate.

pixelesque · 2026-04-08T06:19:00 1775629140

$2 split between Iran and Oman...

pixelesque · 2026-04-06T17:35:46 1775496946

They're discussing how to manage SD cards, and Houston wants them to sign in and out (by initialing in MS OneNote!) every time they change windows.

pixelesque · 2026-04-02T07:14:34 1775114074

Exactly the same situation with me in terms of gmail address (although my names are less common).

I get so many other $MY_NAME emails, including bills (including multiple credit cards and things like Afterpay), deliveries, medical details/reports, family communications, etc, etc.

And it's very clear that quite a few online services blatantly don't verify email addresses, they just assume the email is valid and allow the person to start using it.

pixelesque · 2026-03-24T21:00:39 1774386039

Possibly a combination of moving infrastructure to Azure, and also a significant increase in the number of PRs and commits due to Vibe-coding?

cyanydeez · 2026-03-24T21:40:57 1774388457

Perhaps staff cuts having longtails? https://www.itpro.com/software/microsoft/microsoft-layoffs-h...

pixelesque · 2026-03-12T10:14:34 1773310474

It comes down to how "coherent" the rays are, and how much effort (compute) you want to put into sorting them into batches of rays.

With "primary" ray-tracing (i.e. camera rays, rays from surfaces to area lights), it's quite easy to batch them up and run SIMD operations on them.

But once you start doing global illumination, with rays bouncing off surfaces in all directions (and with complex materials, with multiple BSDF lobes, where lobes can be chosen stochastically), you start having to put a LOT of effort into sorting and batching rays such that they all (within a batch) hit the same objects or are going in roughly the same direction.

pixelesque · 2026-03-03T14:17:31 1772547451

Interesting that they're showing VFX/CG software (Autodesk MAYA and Foundry Nuke) so prominently - obviously people using "Pro" machines are the target audience for this, but both of those apps (any many others in the industry) use Qt for the interface, rather than being totally platform-native.

trymas · 2026-03-03T15:05:06 1772550306

Similar thoughts with first image of Capture One, when apple bought Pixelmator/Photomator a year ago.

I think I read somewhere long time ago that Capture One is also using Qt for GUI, though cannot find this anymore, so probably not true.

klabb3 · 2026-03-03T14:51:37 1772549497

Contrary to HN popular belief, there are neither incentives nor benefits to building native ui apps, for neither consumer nor professional apps. The exception is apps that only make sense on a single platform, such as window management and other deep integration. On iOS/macos you have a segment of indie/smaller apps that capture a niche market of powerusers for things like productivity apps. But the point is it makes no sense for anything from Slack, VSCode, Maya, DaVinci Resolve, and so on, to build native UIs. Even if they wanted to build and maintained 3 versions, advanced features aren’t always available in these frameworks. In the case of Windows, even MS has given up on their own tech, and have opted to launch webview based apps. Apple is slightly more principled.

dagmx · 2026-03-03T17:33:02 1772559182

Qt delegates to native UI in a lot of cases. I think a lot of people who rail against native UI fail to delineate between native UI and first party frameworks. Using third party frameworks, even cross platform ones, does not mean you lose out on native UI elements.

trymas · 2026-03-03T15:14:37 1772550877

I am not an apple framework expert, but some things in apple ecosystem are nice.

CoreImage - GPU accelerated image processing out of the box;

ML/GPU frameworks - you can get built-in, on device's GPU running ML algorithms or do computations on GPU;

Accelerate - CPU vector computations;

Doing such things probably will force you to have platform specific implementations anyway. Though as you said - makes sense only in some niches.

NetMageSCW · 2026-03-03T15:37:17 1772552237

Strong disagree. I think Microsoft’s decision to wrap web apps for the desktop is one of the stupidest they have ever made. It provides poor user experience, uses more battery power and needs more memory and CPU to be performant and creates inconsistencies and wierd errors compared to native apps.

cosmic_cheese · 2026-03-03T17:37:29 1772559449

The increased adoption of webviews has resulted in a death by a thousand cuts effect on Windows 11 performance. The speed bump that comes from going from an up to date Windows 11 install to a up to date Windows 10 install on the same machine is stunning… W10 is much more snappy in every regard despite being nearly identical functionally speaking.

I won’t try to claim that Electron and friends have no place is software development but we absolutely should be pushing back harder against stuffing it everywhere it possibly can be.

bigyabai · 2026-03-03T17:50:27 1772560227

> but we absolutely should be pushing back

Every modern desktop uses webviews in some capacity. macOS renders many apps with webviews, GNOME uses gjs to script half the desktop. The time to push back was 10-20 years ago, it's too late to revert now.

cosmic_cheese · 2026-03-03T20:35:56 1772570156

They’re still fairly uncommon in macOS, mostly being used in places related to cloud service settings. SwiftUI and Catalyst (iOS bridge) are both much more common than webviews, and AppKit remains ubiquitous.

Meanwhile on Windows major features like the Start menu are written in React.

Worth noting that WebKit webviews also tend to be more lightweight than their Chromium brethren.

jech · 2026-03-03T23:34:18 1772580858

> GNOME uses gjs

I don't think gjs is a webview. It uses JavaScript, granted, but binds to a native toolkit, not to DOM and CSS.

pixelesque · 2026-03-03T13:28:40 1772544520

Yeah...

I can't use Mullvad for several banks in the UK with IPv4 - if I switch to IPv6 in the app settings I sometimes can, but often I have to just disable it completely...

I can't use Youtube anonymously (i.e. without logging in) within the last month or so either, as Youtube very often won't play content due to my IP as well...