Hey all, Nick from Elementary Audio here. I spent some time in the discussion below yesterday, and coming back to it now I can see it's continued to get far more attention than I expected! Thanks for all of your thoughts.
There's a lot of discussion here, and although I'd love to address every comment individually, I think it makes more sense to speak in summary to a few of the major themes/underlying questions. In general, the discussion centers on totally valid concerns that I'll try to cover, and I understand after this thread that my messaging needs some serious work to speak to those concerns.
Before I get to those questions, I want to start with some of the things that Elementary is _not_ well suited for, or some areas where Elementary currently has some limitations.
* It's a young project; there are plenty of standard library features that are not implemented yet (oversampling, several more filter topologies, iFFT, etc). That might mean Elementary is currently ill-suited for your project
* It uses a generalized, block-based graph rendering algorithm. If you have hard performance requirements or need highly specific, hand-tuned DSP optimizations, Elementary is currently not ready for you.
* Because of the current rendering algorithm, you cannot specify signal flow graphs that have 1-sample feedback loops (see the sketch below).
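To make that feedback limitation concrete, here's a hypothetical sketch of a block-based render loop (illustrative only; this is not Elementary's actual engine code): because every node processes a whole block at once, the earliest a node's output can feed back into the graph is on the next block.

```js
// Hypothetical sketch of block-based rendering (not Elementary's engine).
// Every node consumes and produces a whole block at a time, so a signal
// can only re-enter the graph with at least one block of delay; feedback
// shorter than `blockSize` samples simply can't be expressed.
const blockSize = 512;

function renderBlock(nodes, feedbackBus) {
  let block = feedbackBus; // previous block's output: one full block late
  for (const node of nodes) {
    block = node.process(block); // whole-block transform (gain, filter, ...)
  }
  return block; // becomes feedbackBus for the next call
}

// Trivial example: a single gain node fed by its own (block-delayed) output.
let feedback = new Float32Array(blockSize);
feedback = renderBlock([{ process: (b) => b.map((x) => x * 0.5) }], feedback);
```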
With that out of the way, let's talk through some of the questions here that seem to underlie many of the comments.
1. Why JavaScript?
* It's inarguably one of the most widely used languages in the world
* It's more than fast enough for the role that it plays in Elementary
* It's the language of the web, which isn't going anywhere any time soon, and the industry is showing clear demand for audio software that runs on the web.
* JavaScript/web tech has pioneered the way that we write app UIs over the last decade, and continues to do so
* JavaScript/web tech has pioneered one of the best developer experiences/workflows in software development
2. How can JavaScript possibly be fast enough for this?
* Elementary uses JavaScript only to _describe_ audio processes, not execute them
* All of the actual audio processing is handled in the underlying C++ audio engine
* This engine is compiled to native, platform-specific targets in the appropriate cases (e.g. plugins, CLI), and only compiled to WASM when appropriate for running in the browser
* The description of the audio process is done with an extremely lightweight virtual representation for which JavaScript is more than fast enough, and for which garbage collection and seamless integration with application logic and user interface programming are desirable (see the sketch below)
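As a rough illustration of what "describe, don't execute" means (a minimal sketch assuming the published @elemaudio/core package; check the docs for the exact imports), the JavaScript side only builds a lightweight value representing the graph:

```js
import { el } from '@elemaudio/core';

// This builds a small tree of plain JavaScript objects describing a
// 440 Hz sine attenuated by 0.5; no audio samples are computed here.
const tone = el.mul(0.5, el.cycle(440));

// Handing the description to a renderer instance (native or wasm) is
// where the engine reconciles it against the running graph:
// core.render(tone, tone); // stereo: same signal left and right
```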
3. What about latency?
* The core audio engine adds no audio latency overhead whatsoever; if you're running a platform/native target, you're running with the same latency that your drivers/plugin host deliver to Elementary. Unless, of course, you describe an audio graph that itself introduces latency; that's on you.
* When compiled to WASM to run in the browser, there are of course additional implications, but that's a reality of targeting the web in the first place. Elementary doesn't force you onto the web or into wasm; if you want to run in the browser and accept the latency implications for your own apps, Elementary can go there with you
4. What about bottom-line performance? Professional, commercial software?
* Elementary will not make the claim that for executing a static graph it will outperform native C/C++ hand-tailored to the application demands. That would be crazy; it's probably fair to say that highly specialized code can always outperform generalized algorithms.
* Elementary will make the claim that for plenty of applications, generalized graph processing is surely fast enough.
* The generalized graph processing algorithm is fast enough because the audio processing is purely native (or compiled to wasm); no JavaScript executes on the audio processing thread
5. What about Faust/SuperCollider/PureData/ChucK/Max...?
* I have direct experience with many of these languages, and have reviewed almost all of the others in detail. I think they're excellent pieces of software for prototyping and exploring. I also think they're generally weaker for shipping complete applications.
* Of course you can use Faust/Max Gen or similar to generate C++ and then bolt on your own application harness, and I'd encourage you to try. For non-trivial applications, interfacing between the UI and the DSP becomes cumbersome. Elementary aims to provide a better experience here; see the sketch below.
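To give a concrete sense of the UI/DSP interface I have in mind (a rough sketch against the @elemaudio/core API; treat the specific signatures as illustrative), a parameter change from the UI is just a re-render of the same description, and keyed nodes let the engine apply it as a property update rather than a graph rebuild:

```js
import { el } from '@elemaudio/core';

// Keying the const node lets the engine recognize it across renders and
// update its value in place rather than rebuilding the graph.
function voice(cutoffHz) {
  return el.lowpass(
    el.const({ key: 'cutoff', value: cutoffHz }), // cutoff frequency
    1.0,          // Q
    el.saw(110),  // source signal
  );
}

// UI event handler: re-render the description with the new slider value.
function onCutoffChange(core, value) {
  const out = voice(value);
  core.render(out, out);
}
```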
6. Why do any of this at all?
The current dev process and workflow for writing audio software has stagnated; it takes far too long to turn a prototype into a product, and we routinely accept complexities and limitations that, for many types of apps, we no longer need to. Elementary offers a new way of prototyping and shipping those apps more easily and more quickly.
Thanks again for all the discussion, I hope this clears up some of the details.
I couldn't agree with this more. Even beyond the web audio API, we have similar graph APIs in the low level/native C++ world that become unmanageable in their complexity simply due to an inability to compose and manage transitions with ease.
This is exactly one of the difficulties that Elementary aims to address, whether you're running in the browser (where indeed, Elementary uses almost none of the actual web audio api), or natively.
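To make the composition point concrete, a simplified sketch: with the raw Web Audio API you allocate stateful node objects and wire them imperatively, whereas the functional style treats the graph as a value you can compose and transform:

```js
// Raw Web Audio: stateful nodes, imperative wiring, manual lifecycle.
const ctx = new AudioContext();
const osc = ctx.createOscillator();
const gain = ctx.createGain();
osc.frequency.value = 440;
gain.gain.value = 0.5;
osc.connect(gain).connect(ctx.destination);
osc.start();

// Functional style (Elementary-like): the same signal as an expression.
// Transitions are just new expressions handed to the renderer.
// const tone = el.mul(0.5, el.cycle(440));
```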
Also, as another user here has already pointed out, the "plugin development kit" is still macOS only, though I anticipate releasing the Windows version within a week or two.
You're totally right that in this domain you have to be extremely careful with performance and latency. Elementary takes great care to deliver on both; it presents a JavaScript "frontend" API, but all of the actual audio handling is done natively under realtime constraints. The core engine is all native C/C++, and we run in the web by compiling to wasm and running inside a web worker. For delivering a desktop app or audio plugin, the native engine is compiled directly into the target binary.
So, while JavaScript and garbage collectors can bring a performance concern, we're only using JavaScript there for handling the lightweight virtual representation of the underlying engine state, and for that role JavaScript is plenty fast enough and the garbage collector quite helpful!
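Concretely, the browser wiring looks roughly like this (a sketch assuming the @elemaudio/web-renderer package; the initialization options shown are illustrative):

```js
import WebRenderer from '@elemaudio/web-renderer';
import { el } from '@elemaudio/core';

const ctx = new AudioContext();
const core = new WebRenderer();

// initialize() loads the wasm engine off the main thread and resolves to
// a standard Web Audio node we can patch into the page's audio graph.
const node = await core.initialize(ctx, {
  numberOfInputs: 0,
  numberOfOutputs: 1,
  outputChannelCount: [2],
});

node.connect(ctx.destination);

// From here, rendering is the same as in the native targets:
core.render(el.cycle(440), el.cycle(441)); // slow stereo beating
```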
That's great on paper, but how do I actually use it? If I want to play back a wavetable, you have el.table(), but it's missing the oversampling and decimation tools required for antialiasing. el.lowpass() is a biquad, which is not well suited for modulation. How can this compete with JUCE when the documentation and features are so sparse?
You should really fix the initial impression the site makes to get that front and centre, though. Experienced audio devs are going to dismiss this unless you make it clear very quickly.
Ah cool, after I posted I was wondering if this was the case. I was just thinking "maybe the renderer is in WASM"?
That's pretty cool, it will be interesting to watch. I do something very vaguely similar in my computer music work where Scheme orchestrates C level objects.
Personally, I wouldn't want to use JS as the high level orchestrator myself as it's such a dog's breakfast of a language, but I can see that many might.
This is my new favorite comment for illustrating the perilous future of general computing and how it needs to be taken away from JavaScript if we have any chance of survival. Electron, Webassembly fetishism, the pursuit of language synchrony at the expense of common sense, it all gets you this. This comment. Right here. This is the future of software and it should scare the shit out of you.
Let me get this straight: you realized latency was a concern, so you wrote in C/C++ (which, exactly?), then turned it into wasm so you can run it in a Web worker? What the hell was the point of that? That’s like saying you bought an M6 and converted it to a foot-powered bicycle. What exactly do you think wasm does? You seem to be implying that you think the native engineering you invested in continues to matter, in the same way, after you do that. You also imply heavily that you understand the wasm you’re executing to still be native. Do you think that? Do you understand what you’re giving up in the Web worker? As in, directly tied to latency and real-time requirements, your whole reason to go native in the first place?
Whatever your response is, deep down, you and I both know it'll be justification to do Web things for Web's sake. I know this because everyone I've had this discussion with has played the same notes (imagine the deployment model!) while failing to understand that they're justifying their preference. The only people who build Web stuff want to build Web stuff. In the high performance sphere, of which DSP is a very mature practice, this looseness with the fundamentals of software is going to put you out of the game before you've even started.
I'm a web person defending web things, but providing something on the web has a significant advantage.
I know you've heard this again and again, but I can't emphasize it enough.
You can use the site from most platforms, including PCs and mobiles.
You don't have to install software, a single click is enough.
Of course, browsers have considerable limitations, and serious users will eventually choose other tools, but providing such accessible software is a really huge advantage for me.
> You can use the site from most platforms, including PCs and mobiles. You don't have to install software, a single click is enough. Of course, browsers have considerable limitations, and serious users will eventually choose other tools, but providing such accessible software is a really huge advantage for me.
In the audio world that is cool for toys.
There are a lot of really cool things to play with in the browser/multi media space, maybe this is another.
But when it comes to getting shit done, writing plugins and transports, creating instruments and rigs for live use (my hobby these days) the quite demeaning comments here are on the mark.
This "Elementary" is prioritising the wrong things. And there are a lot of frame works that do not require you know C or C++
Pure Data, and Sonic Pi are two I have played with (the former much more than the latter).
Platform independence is simply not an issue when building these systems. Installing software is not an important issue.
Sorry. This is, on the face of it, a waste of time. I hate saying that, but if it were me doing this I would pitch it as a fun toy.
Do you understand how WASM and Web workers work? Do you understand that low-enough-latency audio doesn't take a super computer anymore? Yeah, if you were working on DSP stuff in the 1990s, you were a hot shit programmer. Nowadays, it doesn't really say much at all. And it certainly doesn't justify talking about it as if it were a moral failure to not treat DSP with respect.
> Do you understand that low-enough-latency audio doesn't take a super computer anymore
It never did. Low latency audio has almost nothing to do with CPU power. Here's a summary of some of the issues faced on modern general purpose computers:
I know how WASM and Web workers work. Since nothing you can do in WASM or a web worker has anything to do with either (a) realtime scheduling priority or (b) actual audio hardware I/O, they don't have much to do with solving the hard parts of this. Browsers in general do not solve it: they rely on relatively large amounts of buffering between them and the platform audio API. Actual music creation/pro-audio software requires (sometimes, not always) much lower latencies than you can get routing audio out of a browser.
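To put rough numbers on the buffering point (back-of-the-envelope only; real figures vary by browser, OS, and device): buffering latency is just buffer length divided by sample rate, and a browser typically stacks several such buffers between your code and the hardware.

```js
// Back-of-the-envelope buffering latency: frames / sampleRate.
const ms = (frames, sampleRate) => (1000 * frames) / sampleRate;

console.log(ms(128, 48000));  // ~2.67 ms: one Web Audio render quantum
console.log(ms(1024, 48000)); // ~21.3 ms: an output buffer a browser might add
console.log(ms(64, 48000));   // ~1.33 ms: buffers pro-audio users run natively
```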
And even when we set it, we don't get it, because we blithely read a "latency" label in a GUI instead of measuring the round-trip latency on the specific device in question.
That wouldn't be correct either, at least half the time. The problem is that "latency" is used with at least two different meanings:
1. time from an acoustic pressure wave reaching a transducer (microphone), being converted to a digital representation, being processed by a computer, being converted back to an analog representation and finally causing a new acoustic pressure wave care of another transducer (speaker).
2. time between when someone uses some kind of physical control (mouse, MIDI keyboard, touch surface, many others) to indicate that they would like something to happen (a new note, a change in a parameter) and an acoustic pressure wave emerging somewhere that reflects that change.
The first one is "roundtrip" latency; the second one is playback latency.
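A rough worked example of the difference, with purely illustrative numbers:

```js
// Illustrative numbers only; real values depend on hardware and drivers.
const inputBuffer  = 2.7; // ms: capture buffering + ADC
const processing   = 2.7; // ms: one processing block
const outputBuffer = 2.7; // ms: output buffering + DAC

// Meaning 1: roundtrip (mic -> speaker) includes the audio input path.
const roundtrip = inputBuffer + processing + outputBuffer; // ~8.1 ms

// Meaning 2: playback (gesture -> speaker) skips audio capture, but pays
// for input-event delivery (MIDI, mouse, touch) instead.
const eventDelivery = 5; // ms: event reaching the application
const playback = eventDelivery + processing + outputBuffer; // ~10.4 ms
```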
I have indeed! I've used Faust quite a bit over the past few years, and it was very much an inspiration for Elementary. In general I wanted a tool that let me think and work on my DSP the same way I think and work on my UI, and in the same language, even.
Cheers, Nick