Yes, it uses solenoid-based actuators. I'm not sure if it controls the dynamics too, but one way to do that would be to drive the actuators with a curve modulated by high-frequency PWM; probably quite hard to calibrate to MIDI key velocity since those actuators are essentially on/off devices, but doable.
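Something like this is the kind of velocity-to-duty mapping I have in mind (a minimal sketch; the duty floor and curve shape are made-up numbers, and every solenoid would need its own calibration):

```python
def velocity_to_duty(velocity, min_duty=0.35, max_duty=1.0, gamma=0.6):
    """Map MIDI velocity (1-127) to a PWM duty cycle for a key solenoid.

    min_duty is a floor below which the solenoid wouldn't reliably move the key;
    gamma bends the curve so quiet notes still get a usable amount of force.
    Both values are hypothetical and would need per-actuator calibration.
    """
    v = max(1, min(127, velocity)) / 127.0
    return min_duty + (max_duty - min_duty) * (v ** gamma)

# e.g. velocity_to_duty(20) -> ~0.56, velocity_to_duty(127) -> 1.0
```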
A very nice project, I'd love to see some more information about it.
Thanks! So MIDI velocity translates to PWM and hence solenoid force? Cool. You can always add some kind of mapping function if lower velocities produce too little force. Nice work.
(Old pre-electronics player pianos could do dynamics, but most relied heavily on input from the operator to do so, with various limitations; it's probably a good plan to avoid directly imitating them :-)
Tangentially (apologies if this is the wrong thread for this): I know this is distinct from AI-generated music (the melody here is a one-to-one mapping between the input song and the output piano), but I'm curious when folks here think the first totally generated artist will go mainstream, with chart hits and millions of "followers", with both an AI-generated avatar and AI-generated music. And no, not like Gorillaz, who simply had fake cartoon personas.
If you believe the argument floating around that the content we consume in the future will be hyper-individualised, to the point that music/TV will be generated just for us, then maybe never. IMO that runs counter to the shared idolatry people seem to crave, so to me it seems like just a matter of time.
I think the answer is "never", but mostly because the target is both vague and impossibly strict.
Let's take Hatsune Miku. You could argue that "she" fits the bill (3D character, algorithmic voice, lots of followers), but of course there are humans writing those songs and music. If you automated that away (AI lyrics, AI music) you would still have humans checking that the music fits "her" style, that the lyrics make sense, and that the result is not just a racist tirade due to 4chan training data. And as long as those humans are there you can't really say it's 100% AI, can you?
One could argue (wrongly, IMHO) that doing the selection and filtering is not the same as making music, but that criterion would then classify DJs as "not musicians", and I know they hate that.
> And as long as those humans are there you can't really say it's 100% AI, can you?
I don't know how real artists do things, but I suspect that at least some of them rely on other people's opinions about a new piece of work before going public with it. Basically for the same reasons: it is hard to objectively judge your own creative work, especially because your fans do not judge your work objectively. You need to listen to some voice of sanity so as not to lose your connection with reality.
If that doesn't work as an argument, I can propose a thought experiment. Imagine a human artist with a mental disability who sometimes allows himself to do some really strange things. To protect him from big mistakes, there is a small group of mentally healthy and competent people who filter his works. The question is: can we say that our disabled artist's works are not his works? Or not 100% his works?
> And as long as those humans are there you can't really say it's 100% AI, can you?
I believe it depends. How much filtering do those humans do?
Let a million Mikus bloom unfiltered. The ones that spew out racist tirades or subpar music-noises would presumably be rejected by the consumers (although you never know these days).
I think you will see an approach like mine, with heavy use of AI assistants in creating the various elements of the songs first (I will compare 10 melodies I've created with an AI assistant I made to human hit melodies: https://www.youtube.com/playlist?list=PLoCzMRqh5SkFPG0-RIAR8..., https://osf.io/9nd6x). Most existing tools support this approach, and doing a whole song at once does not seem necessary and would be much less flexible.
I believe there is a potential market where contemporary celebrity-laden media is personalized with automated actor replacement. Fans can have the fantasy of being in the media: in the fantasy, sci-fi, superhero, and pop music media they already consume, side by side with their idols, plus further personalizing treatments such as localization and product placements. This may sound Orwellian, but it is also wildly open-ended and a creative uncharted territory for narrative storytelling. The potential for education is profound.
I've been working on fully automated actor replacement for over a decade now, exploring the aspects and potential of such personalization. I think it has the potential to be an entire recognized medium, separate from traditional storytelling.
Not until the AI has some degree of agency. So long as it's a human selecting the AI and pushing the "run" button (and deciding when, and even whether, to push it), that generation machinery is just a tool of the human. Pretty much by definition, to be totally generated, it has to have the agency to determine what, when, and even if it should generate.
This is pretty neat, but are the results actually playable by a single human pianist? There are parts of the demo video with 10 simultaneous notes spread over 4 octaves, which doesn't seem humanly possible. Identifying the notes and chords being played is a big step, but you've also got to adapt to the limitations of the instrument and figure out how to simplify the part so it stays playable while retaining the same essence. That's a big part of what makes arrangement difficult.
I didn't catch this part, but having 10 notes over 4 octaves is not out of the question. If all of those notes are rhythmic, that could be a challenge. All of the fairly wide chords I saw could be played either with sustain or could easily be substituted with spread chords.
Pop pianists don't usually play what's exactly on the page.
Very cool. Note that it outputs MIDI and the demo chose to use a terrible piano sound; there are better free samples they could have used. I threw a FLAC of the weirdest song I could think of (read: most nonsensical melody, nowhere close to pop), Mupp's "vendetta", into the Colab and it seemed to do pretty well capturing the melody. Composer 1 lost some of the nuance of the melody at 0:50, but composer 14 got much closer.
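For anyone else put off by the demo sound: since the output is MIDI, you can re-render it yourself. A minimal sketch using the FluidSynth CLI from Python (the soundfont path is a placeholder; any free GM soundfont works, and FluidSynth has to be installed separately):

```python
import subprocess

def render_midi(midi_path, soundfont_path, out_wav):
    """Render a MIDI file to WAV with the FluidSynth command-line tool."""
    subprocess.run(
        [
            "fluidsynth",
            "-ni",            # non-interactive, no MIDI input driver
            soundfont_path,   # e.g. a free GM soundfont such as FluidR3_GM.sf2
            midi_path,
            "-F", out_wav,    # fast-render to this audio file instead of playing
            "-r", "44100",    # sample rate
        ],
        check=True,
    )

render_midi("pop2piano_output.mid", "FluidR3_GM.sf2", "cover.wav")
```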
Having been strongly influenced by Marx & Cannel's "How to Play the Piano Despite Years of Lessons", I suspect there's a much simpler harmony that would also back the melody.
(that said, IIUC what pop2piano does, it's style transfer for arrangements — so my prediction is it would track the complex harmony, with appropriate realisations of each chord?)
[Edit: > Pop2Piano uses only four-beat length audio for the context of input. Therefore, features such as melody contour or texture of accompaniment have less consistency when generating longer than four-beat. Also, time quantization based on eighth note beats prevents the model from generating piano covers with other rhythms such as triplets, 16th notes, and trills.]
2. How would one go about practicing IRL to be able to play in the style of the piano covers?
This is the style of piano I would love to be able to perform. I can read lead sheets and I know music theory, but I just don't have the hand chops. Every piano instruction book or tutorial I run across is based on developing progressive skills for classical performance. Pop-style playing is distinct, and I've never known how to progress my skills.
Sk8er Boi is an impressive song to demo. The chorus is in a different key than the verse, and one of the verse chords is out-of-key, borrowed from the chorus.
Does it generate human-playable arrangements? I don't even know exactly what that means, but I assume there's some maximum width fingers can span or number of keys that can be held down at the same time. This is an awesome thing and I can't wait to play with its output in my DAWs.
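As a rough sanity check on the MIDI it produces, something like this could flag chords that are too wide for one hand (a sketch only: the 10-semitone span and the simple low/high split of notes between hands are assumptions, not real fingering analysis):

```python
import pretty_midi

MAX_HAND_SPAN = 10  # semitones, roughly a comfortable ninth/tenth; assumption

def flag_unplayable_chords(midi_path, split_pitch=60):
    """Group notes that start together and flag hand spans wider than MAX_HAND_SPAN.

    Notes below split_pitch go to the "left hand", the rest to the "right";
    this is a crude heuristic, not a real playability model.
    """
    pm = pretty_midi.PrettyMIDI(midi_path)
    notes = sorted((n for inst in pm.instruments for n in inst.notes),
                   key=lambda n: n.start)
    problems = []
    i = 0
    while i < len(notes):
        # Treat notes starting within 50 ms of each other as one "chord".
        j = i + 1
        while j < len(notes) and notes[j].start - notes[i].start < 0.05:
            j += 1
        chord = notes[i:j]
        for hand in ([n for n in chord if n.pitch < split_pitch],
                     [n for n in chord if n.pitch >= split_pitch]):
            if hand and max(n.pitch for n in hand) - min(n.pitch for n in hand) > MAX_HAND_SPAN:
                problems.append((notes[i].start, sorted(n.pitch for n in hand)))
        i = j
    return problems

for t, pitches in flag_unplayable_chords("cover.mid"):
    print(f"{t:6.2f}s  span too wide for one hand: {pitches}")
```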
This is continued proof to me that the future is art on demand: instantly created and shared, with no need to be captured in a static form like a YouTube video.
I'd say with nearly 100% certainty these are cherry-picked. Almost all "music generation" approaches are, which is a canary in the coal mine. Good music generation is much harder than most non-musician AI researchers assume.
I wonder if this would yield even better results if the input were split into multiple tracks in the pre-processing stage with something like Demucs.[1]
[1] : https://www.youtube.com/watch?v=atJ_YsPFDjQ
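For what it's worth, a minimal sketch of what that pre-processing step could look like, calling the Demucs CLI and then handing whichever stem you care about to pop2piano (the output directory layout below is what recent Demucs versions use by default, so treat it as an assumption):

```python
import subprocess
from pathlib import Path

def separate_stems(song_path, out_dir="separated"):
    """Split a song into vocals / accompaniment with Demucs (pip install demucs)."""
    subprocess.run(
        ["demucs", "--two-stems", "vocals", "-o", out_dir, str(song_path)],
        check=True,
    )
    # Recent Demucs versions write to <out_dir>/<model_name>/<song_stem>/,
    # e.g. separated/htdemucs/my_song/vocals.wav and no_vocals.wav.
    stem_dir = next(Path(out_dir).glob(f"*/{Path(song_path).stem}"))
    return stem_dir / "vocals.wav", stem_dir / "no_vocals.wav"

vocals, accompaniment = separate_stems("my_song.mp3")
# Then run pop2piano on the stem you want it to focus on, e.g. just the vocals.
```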