> the speed is close to native

The numbers I've seen suggest wasm is 3-5x slower th...

azakai · on July 13, 2022

That estimate is way off.

Wasm overhead is somewhere in the range of 50% (1.5x slower) to 14% (1.14x slower). Sources:

* https://www.usenix.org/system/files/atc19-jangda.pdf

* https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html

It is true you can find a specific benchmark where wasm is 3-5x slower, say if the original uses highly-tuned x86 SIMD or relaxed atomics (or if the wasm version has SIMD or threads turned off entirely). But in general, the overhead is much lower.

SotCodeLaureate · on July 13, 2022

> if the original uses highly-tuned x86 SIMD

BTW, if there is a problem with the current WASM performance, it's probably SIMD.

Again, based solely on my own benchmarks, exclusively in the area I'm interested in (game-type workload, so, say, multiplying many small matrices, performing geometric tests), there is little to no speed-up from wasm-simd128 (where it is supported), whereas native code compiled from the same sources, by the same compiler, profiled on the same machine, seems to be running a bit faster when vectorized.

ArrayBoundCheck · on July 13, 2022

I'm not sure which implementation browsers use, but in this benchmark only one engine was, on average, less than 3x slower: https://medium.com/wasmer/benchmarking-webassembly-runtimes-...

azakai · on July 13, 2022

(As I said in another comment, those are 4 microbenchmarks. The links I provided contain both real-world codebases and also large sets of varied benchmarks.)

ArrayBoundCheck · on July 13, 2022

Varied? You're joking; they're all math.

This guy found sqlite to be 19x slower. https://ricomariani.medium.com/wasm-sqlite-for-web-products-...

azakai · on July 13, 2022

SQLite is one of the many benchmarks covered in one of my 2 links. The slowdown there is just under 50%.

(SQLite stresses not just wasm but also other Web APIs, like storage, which might explain different results in different types of benchmarks.)

ArrayBoundCheck · on July 13, 2022

Curiosity is getting to me. Why on earth would you use wasm outside of a browser instead of a container like most people?

When you said I was wrong I knew right away you either never actually tested or you're talking about outside of the browser. If it wasn't clear, I was talking about wasm in Chrome/Safari/Firefox this entire time.

azakai · on July 13, 2022

> When you said I was wrong I knew right away you either never actually tested or you're talking about outside of the browser.

No, I actually was talking about the browser.

And I have actually tested. You'll find my name on a bunch of public benchmark results about wasm and its predecessor asm.js, all in a Web context:

* Various benchmark posts on Mozilla Hacks: https://hacks.mozilla.org/author/azakaimozilla-com/

* The original wasm paper: https://people.mpi-sws.org/~rossberg/papers/Haas,%20Rossberg...

* The WasmBoxC post: https://kripken.github.io/blog/wasm/2020/07/27/wasmboxc.html

And specifically SQLite is a benchmark that I've looked at many times over the years. It's an important codebase. That's why I added it to the Emscripten benchmark suite, and why it is measured in that last link.

(Fyi, I am one of the co-creators of WebAssembly, I created Emscripten, and I have been working in the compile-stuff-to-the-Web space for over a decade.)

ArrayBoundCheck · on July 13, 2022

There are two things I've always wanted to know:

1. Why wasn't realloc part of the spec? When I looked at wasm and Emscripten, it appeared that realloc was essentially malloc+memcpy. That'd be painful when arrays are bigger than the L2 or L3 cache. I think I heard ARM CPUs all standardized on having virtual memory in 2005 or 2010, so I don't think ARM hardware was an issue?

2. It seems like 100% of system calls would have to be implemented through JavaScript? Why weren't there some things wasm implemented itself, like a way to get the time (or rdtsc), realloc, and other very common operations?
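
To make the concern in point 1 concrete, here is a minimal sketch of a grow-by-copy realloc. The name `naive_realloc` and its explicit `old_size` parameter are hypothetical, chosen for illustration; this is not Emscripten's or any libc's actual code, only the malloc+memcpy pattern described above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration only: resize a block by allocating
 * a new one, copying the contents, and freeing the old one. The caller
 * passes old_size because this sketch has no allocator metadata to query. */
void *naive_realloc(void *old_ptr, size_t old_size, size_t new_size) {
    if (new_size == 0) {
        free(old_ptr);              /* shrinking to zero just releases the block */
        return NULL;
    }
    void *new_ptr = malloc(new_size);
    if (new_ptr == NULL) {
        return NULL;                /* on failure the old block stays valid */
    }
    if (old_ptr != NULL) {
        /* Copy only the bytes that exist in both the old and new blocks. */
        size_t n = old_size < new_size ? old_size : new_size;
        memcpy(new_ptr, old_ptr, n);  /* touches every live byte of the array */
        free(old_ptr);
    }
    return new_ptr;
}
```

The memcpy is the expensive part: its cost grows with the size of the array. A native allocator backed by virtual memory can sometimes avoid that copy by growing in place or remapping pages, which is presumably the capability the question is asking about.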

azakai · on July 13, 2022

In both cases the idea was to start simple in wasm 1.0 and leave those things for later. Both are being discussed now.

For the first see e.g. https://github.com/WebAssembly/design/issues/1397

For the second, WASI is doing that on the server, while on the Web we just haven't found a way that is better/faster than calling through JavaScript (we tried to do direct WebIDL bindings among other things).

SotCodeLaureate · on July 13, 2022

> sqlite ... 19x slower

But is this a failing of wasm or the system glue layer, that for something like sqlite/web has to go through js/browser/etc?

Yes, I understand the importance of real-life tests like this, but how is one expected to go about testing wasm performance as such, other than by using benchmarks that are mostly arithmetic/logic and not syscall-type code?

ArrayBoundCheck · on July 13, 2022

How you use it in real life has to be taken into consideration. From memory, system-like calls go through JS and there's no realloc. I didn't want to say it in my first comment because I've noticed that the more someone writes, the more likely others are to argue. I'd rather someone else had posted the SQLite benchmark, but apparently everyone drank the Kool-Aid and had no idea wasm isn't that fast. I certainly would rather use JS than write C++/Rust. I wouldn't be surprised if the companies on that list used wasm for only one or two large projects and then stuck to JS because of how much work wasm is.

SotCodeLaureate · on July 13, 2022

> 3-5x slower than native

Depends on workload of course, but seems too pessimistic with the current state of WASM support. At least this is not what I'm seeing with the things I'm working on (graphics-related), when comparing to native builds.

See this test for example, native seems to be only 50% faster: https://old.reddit.com/r/WebAssembly/comments/vjxtv4/webasse...

cdelsolar · on July 15, 2022

I have run benchmarks with the actual codebase running natively on my computer and then with the WASM version, and the WASM runs at about 70-80% of the native speed.

Sadly, when the analyzer was written in Go, I was getting around 20-25% of the speed after compiling to WASM. I believe it is largely due to that code being very inefficient about allocations, and I think WASM doesn't like allocations/deallocations very much.
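
The figures in this thread are quoted in three different forms: "N% overhead", "Nx slower", and "N% of native speed". A small sketch of the arithmetic for converting between them, assuming all three describe the same wall-clock ratio; the function names are made up for illustration.

```c
/* Hypothetical helpers for illustration; "slowdown" means
 * wasm_time / native_time, so 1.0 is parity with native. */

/* "50% overhead" -> 1.5x slower */
double slowdown_from_overhead_pct(double overhead_pct) {
    return 1.0 + overhead_pct / 100.0;
}

/* "80% of native speed" -> 1.25x slower */
double slowdown_from_speed_pct(double speed_pct) {
    return 100.0 / speed_pct;
}

/* "4x slower" -> 25% of native speed */
double speed_pct_from_slowdown(double slowdown) {
    return 100.0 / slowdown;
}
```

Under that reading, 70-80% of native speed is roughly 1.25x to 1.43x slower, and 20-25% of native speed is 4x to 5x slower.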

Kranar · on July 13, 2022

The numbers you've seen suggest you're making things up.

ArrayBoundCheck · on July 13, 2022

Why did mods delete the post with the link showing this commenter is the one in fact making things up?

ArrayBoundCheck · on July 13, 2022

[flagged]

azakai · on July 13, 2022

Those are 4 microbenchmarks. There are much larger and more realistic workloads benchmarked in other places (see https://news.ycombinator.com/item?id=32084768 for links), with very different results.