What's inconceivable about the lyrics being generated?
A single song's lyrics easily fit into the context window of virtually any LLM, and with some of the bigger and better ones (Opus) you could probably feed it every known song's lyrics in your desired genre before asking it to create a new set of verses.
The lyrics it generated are way too clever. I'm a fan of rap, and no LLM I've ever used can generate anything nearly as witty as some of the lines in the demo.
I'd agree; I'd imagine they had someone write the lyrics for it. But that's perfectly fine: we want drivable models where we can tell it what to say and it properly, coherently turns that shit into smooth words with transitions. Hell, you can even hear the breath sounds it's simulating on the word transitions.
I've tried Suno and it was nice. Haven't tried this, but it looks like it is also missing what I care about. What is missing IMO is the ability to control the tune. If I could record myself humming the tune to produce a cappella music, or singing a song to generate the full mp3 of the song with accompaniment, etc., that would be amazing! AI-generated tunes are hit or miss, especially if the lyrics don't rhyme, and AI-generated lyrics are uniformly bland.
If they want higher-quality work and not just trolling at the "make a song about how @octopoc is fat and needs to get a job" level, they need to allow more inputs into the process.
Anyone have a link to the tweets with examples? X is such a shitty way to showcase your product, as I cannot in fact see the thread without an X account.
I feel like these tools need a builder-style design: explain a drum beat and get that set up, then "add in X with Y style", then add in "lyrics", and overlay and mix them as you go. That way you can get each part right.
https://x.com/flavioschneide/status/1788654503790628935