Show HN: Improved freemusicdemixer – AI music demixing in the browser (freemusicdemixer.com)
160 points by sevagh on Sept 14, 2023 | 40 comments
Hi HN,

Last time I showed free-music-demixer, which people seemed to enjoy. It was a static website with a JavaScript + WASM module that performs music demixing (or music source separation) using the AI model UMX-L (Open-Unmix), running client-side in the browser.

Since then, I have overhauled the project and made several improvements:

- The demixing/separation quality is higher now, since I implemented the missing post-processing step

- Memory usage is lower now thanks to a custom segmented inference with a streaming LSTM, which should allow larger tracks (or, dare I say, arbitrarily large tracks)

- There is a batch upload feature now to demix an entire folder of songs (and provide zip files of the stems)

- There are now dev logs printed to the website to show the progress better




Man, this is fucking awesome. Thank you for building this!

Not just free, but local, without installation, and generalized to common use-cases too? This is a standard of development that I aspire to, so thank you for being a great example!

Dev appreciation aside, I also record music with my long-distance buddy, so we often find ourselves trying to use midi recreations in order to get at least a passable version of timing and range that we can both practice from until the demo compositions have been laid down. It's often pretty far off the mark from the original track, so we will make fantastic use of this utility. Again, thanks so much!


"Runs locally in your browser!

Unlike similar products, it’s free to use and doesn’t store your data. All processing is done in your browser, and your files are never uploaded anywhere. It runs well on computers and very slowly on smartphones; user beware."

I'm impressed that this was the choice made. I can only imagine it also helps keep the operational costs down, as well as liability for copyright and whatnot, since they never come into possession of the content. However, it also means they "lose out" on a possible continuous source of training data, which other, less ethical evilCorp-type companies would not pass up.


I'll be honest with you - I wouldn't even know what to do with user data if I collected it. I'm not great at training neural networks or understanding how to augment training with additional data.

Basically, the _only_ special thing about freemusicdemixer is that it runs client-side, because I'm better at writing C++ for a pre-trained network than I am at training new neural networks. However, it's a cool advantage so I'll keep promoting it as the distinguisher of my "product" (since I didn't create or train the underlying neural network, I just consider it a clever WASM frontend).

The operational costs are precisely $0 since it's a static website (aside from the domain registry).


Keep that mindset. Please. User data shouldn’t be your product. This is perfectly acceptable and thank you for doing this.


It is very nice to be able to compute this in our browsers without limits or registration. The results are not always perfect, but they already allow having a good time testing remixes or doing karaoke with friends! Thanks for sharing your work.


I envisioned my site would be useful for quick and dirty prototyping - especially since the outputs are not licensed for commercial use.

Then, once the project idea is validated, one could use a paid stem separation service with higher quality and commercial licensing.

That's why I added some sentences on the website to solicit partnerships for targeted advertisements, to see if any pro demixing companies were interested.


That's an interesting idea. Scratch track stems while we're waiting for the approvals and delivery of final assets.


For those interested, Facebook's Demucs page (https://github.com/facebookresearch/demucs) gives performance comparison for several models including open-unmix.

See also: https://www.stemroller.com This runs as a local app on Windows and Mac.


My dream (and next major project) is to get Demucs (v4, Hybrid Transformer) running in WASM, similar to what I've done with Eigen/C++ and Open-Unmix.


Oh yes, that would be amazing! Good luck!


By my own little comparison in which I've given the same song to both, Demucs appears much better (while requiring an app and consuming far more CPU).

Thanks for sharing! I had no idea this was a thing.


demucs works far better than anything else I've used, especially with more esoteric kinds of music in my experience. plus you can run it with GPU support as well!


Demucs is awesome no doubt.

>plus you can run it with GPU support as well!

Open-Unmix also originally runs on the GPU like it was intended for, since it uses PyTorch just like Demucs.

I'm curious about using WASM + WebGPU to add GPU acceleration to this project, though.


Oh my god, stemroller.app is 1.85 GB (the weights are only 350 MB): Electron, Python, PyTorch, ffmpeg, and a 10 MB web app. This is the state of modern app development. But it works, nonetheless.


freemusicdemixer would endlessly try to download files and never actually load for multiple minutes, though it might be my Wi-Fi. demucs seems to work acceptably on music, but transients tend to get lost unless you unmute the drums channel (which is probably unavoidable whenever drum and tonal transients coincide in time and occupy similar frequencies).


Sorry about that - could you put the contents of the developer console or the symptoms in a github issue (or describe it here?)


after finally getting some free time to play with this, i did. i've used other non-AI based programs to remove vocals. the sound is totally different and obviously beyond because they are stems. There's the tell-tale sound that I've heard from all of these generative AI audio things that sounds like a very heavily compressed version of the original. To describe it visually, it looks just like taking an image and removing part of the image and then using a generative fill. Compare how the fill looks to the rest of the image, and that's what this sounds like. Similar, but just smeared and not clean. Maybe it's not noticeable to people that don't work with audio, but it is one of those things that once you're trained to hear it, you can never not hear it.

it is probably the most useful application of the AI things I have seen AND it does as advertised on the box. nice work on the project.


Thanks for the kind words!


Maybe I can ask HN out loud how I can start figuring out how to get this website on Google search results -

e.g. I want it to show up when I search relevant terms like "free stem separation" "free music separation" "free music isolation" "music stem separation" "music source separation"


Add meta keyword and description tags and detailed text (HTML tags) to each page you want indexed, and post your app page URLs across social media sites. Then give it a little time: indexing, especially among lots of competition, won't happen instantly. :/


While this might seem to be intended for DJ remixes a demixer also lets you jam along to your favorite songs. For instance you can extract only the voice, bass, and drums and play along with your guitar like you were a part of the band.


> While this might seem to be intended for DJ remixes a demixer also lets you jam along to your favorite songs.

Not to mention arranging or just notating songs.


I frequently use a similar online demixer and am always worried that they will eventually remove the free option, so I was excited to try this one out. When I uploaded a couple of different 2-minute song MP3s, it didn't work and would always produce 5 identical WAVs that just contain some audio pop/click sounds. It worked when I cut out a 30s section of the song. So some indicator of whether the process failed, or of what the length limits are, would be appreciated. Still, nice work. I will be keeping an eye out for any updates to this, keep going!


It didn't load weights for me, so I tried umx.cpp, but it failed to compile on Mac.

I would recommend polishing umx.cpp and putting converted weights on Hugging Face, to make it as effortless to use as whisper.cpp etc.


Can you create a GitHub issue? Did anything show in the developer console?

I could (probably) improve umx.cpp but freemusicdemixer.com is supposed to be the "effortless" compile-free version of it.


This is great, thank you. I've been using another site to do this, but it requires upload to the cloud.

I use it for removing background music from movie clips so I can remix them and add alternate background.


Just tried this and it seems to get stuck at the following step (for 20 mins so far)

[WASM/C++ 18:07:24] Getting waveforms from istft
[WASM/C++ 18:07:25] Copying waveforms


Hmm - that should be the end of it, at which point download links will appear to either the Single or Batch apps, depending on which one you used:

- bass, drums, vocals, other, karaoke.wav in the Single track window (at the bottom: Demixing outputs...)

- song_1.zip, song_2.zip, ... in the Batch window (at the bottom: Batch outputs...)

Like so:

```
Demixing outputs...
bass.wav
drums.wav
other.wav
vocals.wav
karaoke.wav
```

There should also be a Javascript message on the left like "Preparing zip" or "Preparing stems for download"


ah out of memory, I'll try with a shorter track.


Oh no! Can you open a github issue with the size of track you tested? Or email the track to me at sevagh@protonmail.com

I can make the segment size smaller (right now I'm using 60s segments).
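
To make the segment arithmetic concrete, here is a minimal sketch (not the site's actual code) of how a track could be cut into fixed-length segments. The function name `segment_bounds`, the 44.1 kHz sample rate, and the non-overlapping layout are assumptions for illustration; only the 60s segment length comes from the comment above.

```python
def segment_bounds(num_samples, sample_rate=44100, segment_seconds=60):
    """Return (start, end) sample indices that cover a track in fixed-length segments.

    Hypothetical helper: the 44.1 kHz default and the non-overlapping layout
    are assumptions, not details taken from freemusicdemixer itself.
    """
    if num_samples < 0 or sample_rate <= 0 or segment_seconds <= 0:
        raise ValueError("num_samples must be >= 0; rate and segment length must be > 0")
    segment_len = sample_rate * segment_seconds
    # The last segment is simply shorter when the track length is not a multiple
    # of the segment length.
    return [
        (start, min(start + segment_len, num_samples))
        for start in range(0, num_samples, segment_len)
    ]


# A 150-second track split into 60s segments, then into 30s segments.
track_samples = 150 * 44100
bounds_60s = segment_bounds(track_samples)
bounds_30s = segment_bounds(track_samples, segment_seconds=30)
```

With smaller segments, only a shorter stretch of audio needs to be held in memory at any one time, at the cost of more inference passes per track.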


I got same, but have 40900 and 32 Gig.

On multiple files.


It is a Chrome problem; on the same machine it runs fine in Firefox.


Is this the best possible demixing you can get? Or did you have to use a smaller / lower quality model to make it run in browsers?


In my first post, quite a lot of alternatives were discussed: https://news.ycombinator.com/item?id=36707877

The model I'm using is called Open-Unmix (https://github.com/sigsep/open-unmix-pytorch). In 2021, there was an update to Open-Unmix to include new weights, UMX-L, which made it perform better than it used to on the older weights (UMXHQ).

In the grand landscape of music demixing, I don't think UMX-L is near the top anymore.

_However_, the demixing performance of freemusicdemixer.com is very close to the full PyTorch performance of Open-Unmix UMX-L, despite the tricks I needed to get it working in the browser, such as splitting up the inference to operate on segments of the song, or making the LSTM operate on streaming segments rather than holding the entire track in the LSTM memory.

In my first release, I loaded and did inference on the entire track at once (like the PyTorch model), which frequently crashed or exceeded the 4GB WASM memory for medium or large-size tracks.
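
The streaming idea described above can be sketched with a toy example. This is not the umx.cpp implementation: it assumes a one-directional, scalar LSTM cell with made-up weights (`WEIGHTS`, `lstm_step`, `run_lstm`, and `run_streaming` are all hypothetical names), and the real model's LSTM may be structured differently. It only shows that carrying the hidden and cell state from one segment to the next gives the same outputs as processing the whole sequence at once.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h, c, weights):
    """One step of a scalar LSTM cell; weights maps gate name -> (w_x, w_h, bias)."""
    def gate(name, activation):
        w_x, w_h, bias = weights[name]
        return activation(w_x * x + w_h * h + bias)

    i = gate("input", sigmoid)
    f = gate("forget", sigmoid)
    o = gate("output", sigmoid)
    g = gate("cell", math.tanh)
    c = f * c + i * g          # update the cell state
    h = o * math.tanh(c)       # new hidden state, which is also the output
    return h, c


def run_lstm(xs, weights, state=(0.0, 0.0)):
    """Run the cell over xs starting from state; return outputs and the final state."""
    h, c = state
    outputs = []
    for x in xs:
        h, c = lstm_step(x, h, c, weights)
        outputs.append(h)
    return outputs, (h, c)


def run_streaming(xs, weights, segment_len):
    """Process xs segment by segment, carrying (h, c) across segment boundaries."""
    state = (0.0, 0.0)
    outputs = []
    for start in range(0, len(xs), segment_len):
        segment_out, state = run_lstm(xs[start:start + segment_len], weights, state)
        outputs.extend(segment_out)
    return outputs


# Made-up weights and a synthetic input signal, for illustration only.
WEIGHTS = {
    "input": (0.5, -0.3, 0.1),
    "forget": (0.8, 0.2, 0.0),
    "output": (-0.4, 0.6, 0.2),
    "cell": (1.0, -0.5, 0.0),
}
signal = [math.sin(0.1 * n) for n in range(1000)]

full, _ = run_lstm(signal, WEIGHTS)
streamed = run_streaming(signal, WEIGHTS, segment_len=128)
```

Because the segmented run performs the same steps in the same order, `streamed` matches `full`, while only one segment of input needs to be processed at a time.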


It's super exciting to see people using open-unmix like this! I worked with the creator of the model to try to do the same as a university project! Our solution was... not great, to say the least, but I'm happy someone else managed to do it!


That's awesome! I first got in contact with Fabian-Robert (https://github.com/faroit) during the Music Demixing Challenge 2021, where the UMX-L weights were first unveiled.

We have since discussed my projects a couple of times! I even got the idea for a streaming LSTM from him.

I think music demixing in general owes a lot of thanks to Open-Unmix and co (https://github.com/sigsep), who have relentlessly been publishing open-source models and related code (source separation metrics, dataset loaders, etc.) for years, and who blew the industry open with their MDX 21 [1] and SDX 23 [2] AI challenges.

[1]: https://www.aicrowd.com/challenges/music-demixing-challenge-...

[2]: https://www.aicrowd.com/challenges/sound-demixing-challenge-...


Amazing work

Would love a VST frontend as well so I could drop a song file into a plugin and have it spit the stems out directly into my DAW


Labeling the 4 parts Vocal | Melody | Bass | Drums, instead of Drums | Vocals | Bass | Other would go a long way to making it seem less programmer-y.


Nice idea! The name 'other' comes from the traditional datasets used to train these models, but I'm open to trying to incorporate a better name.


Are there any tools that will add musical accompaniment to a vocal track? Basic strings/piano?



