Since Stable Diffusion and ChatGPT got popular, I'd like to ask what their equivalents are in the music field.
I understand that composing involves a lot of nuance and isn't just a matter of putting notes down and calling it a day. But I still find it a little hard to believe that making an AI that can compose tracks like https://www.youtube.com/watch?v=OSPkn-iHPWA (Pokemon Sword's title screen) is much harder than making ChatGPT.
Is Riffusion (https://www.riffusion.com/) the cutting-edge tech here? Or am I missing something?
Played with Riffusion some; not at all impressed by it. It mimics without understanding, and while it occasionally did something interesting, it has no real comprehension of time scales beyond short loops, no grasp of larger structures, and fails completely if the prompt is something it can't easily research. This is about where AI-generated music has been for a decade now and can't seem to push past.

Part of this is probably because AI was integrated into composition a good long while ago: composers tend to treat it more like an instrument or a filter than something that writes music, so most of the work on AI in music goes toward those ends, not toward getting AI good at composition.
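For what it's worth, the trick behind Riffusion is fairly simple: it's Stable Diffusion fine-tuned on spectrogram images, and the generated images are converted back to audio with Griffin-Lim phase reconstruction. That's also part of why it has no sense of longer time scales: each image tile only covers a few seconds of audio. Here's a minimal sketch of the inversion step, assuming librosa (the sample rate, hop length, and tile size are illustrative guesses, not Riffusion's actual settings):

    # Turn a spectrogram "image" back into audio via Griffin-Lim.
    # All parameter values here are illustrative assumptions.
    import numpy as np
    import librosa

    def spectrogram_to_audio(mel_db, sr=22050, n_fft=2048, hop_length=512):
        # Undo the dB scaling to recover a power mel spectrogram.
        mel_power = librosa.db_to_power(mel_db)
        # Invert the mel filterbank and estimate phase with Griffin-Lim.
        return librosa.feature.inverse.mel_to_audio(
            mel_power, sr=sr, n_fft=n_fft, hop_length=hop_length)

    # A 256-frame tile at hop 512 spans 256 * 512 / 22050 ≈ 6 seconds,
    # so the diffusion model never "sees" more music than that at once.
    tile = np.random.uniform(-80.0, 0.0, size=(128, 256)).astype(np.float32)
    audio = spectrogram_to_audio(tile)

Anything that needs to reason about a whole song's structure would have to do it across many of these tiles, and the image model has no mechanism for that.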