I wonder how well the learned policy generalizes to other environments: places like an art gallery, the outdoors, or a cave. Could the network have learned something fundamental about monocular vision?
It would also be interesting to see if the learned policy corrects for perturbations. If we tilt the drone by hitting it, will the policy stabilize it again?
While this is a really cool result, I suspect this approach might not be the best way to control UAVs. Dragonflies are ready to fly, avoid obstacles, perch on things, and hunt down prey right after warming up their wings for the first time, which implies that a good amount of flight behavior is 'hard-coded.'
That said, I really can't wait until someone expands on this approach. Instead of outputting left or right, the network could output 'stick vectors' that translate to control stick commands. Maybe even have the network take in some sensor data and a 'move in this direction' vector. Add a pinch of sufficiently fast video processing and we could probably learn to fly through an FPV course or do aggressive maneuvers through someone's windows [0].
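Just to make that concrete, here's a rough sketch (in PyTorch; every layer size, input shape, and name is my own guess for illustration, not anything from the paper) of a policy that maps a camera frame plus a goal direction to four stick axes:

    import torch
    import torch.nn as nn

    class StickPolicy(nn.Module):
        """Hypothetical policy: camera frame + goal direction -> stick commands."""
        def __init__(self):
            super().__init__()
            # Small conv stack over a 128x128 grayscale frame
            self.encoder = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
            # Concatenate image features with a 3-D "move in this direction" vector
            self.head = nn.Sequential(
                nn.Linear(64 * 14 * 14 + 3, 256), nn.ReLU(),
                nn.Linear(256, 4), nn.Tanh(),  # throttle/yaw/pitch/roll in [-1, 1]
            )

        def forward(self, frame, goal_dir):
            z = self.encoder(frame)
            return self.head(torch.cat([z, goal_dir], dim=1))

    policy = StickPolicy()
    frame = torch.rand(1, 1, 128, 128)      # camera image
    goal = torch.tensor([[1.0, 0.0, 0.0]])  # "fly forward"
    sticks = policy(frame, goal)            # four stick axes

How you'd actually train it (RL, imitation of FPV pilots, more crash data) is the hard part, of course; the sketch is just the input/output plumbing.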
> If we tilt the drone by hitting it, will the policy stabilize it again?
My understanding is that the output of the machine learning model is already a simple "left", "right", or "straight on", so it isn't really responsible for stabilization anyway.

That side of things is most likely handled by the drone's flight control software, which takes those high-level inputs, translates them into the attitude the airframe needs to hold, and then into the corresponding rotor speeds. If you hit the drone, the gyroscope will pick up that it's at the wrong inclination, feed that information into the control software, and the control software will adjust the rotor speeds to correct.
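To put it concretely, that inner loop is usually a PID controller per axis running on the flight controller, completely independent of the learned policy. A toy version of a single axis (the gains, numbers, and mixer lines are made-up placeholders, not any real firmware's values):

    class PID:
        """Minimal PID controller for one attitude axis."""
        def __init__(self, kp, ki, kd):
            self.kp, self.ki, self.kd = kp, ki, kd
            self.integral = 0.0
            self.prev_error = 0.0

        def update(self, setpoint, measured, dt):
            error = setpoint - measured
            self.integral += error * dt
            derivative = (error - self.prev_error) / dt
            self.prev_error = error
            return self.kp * error + self.ki * self.integral + self.kd * derivative

    # "Left" from the model becomes a roll setpoint; the gyro closes the loop.
    roll_pid = PID(kp=4.0, ki=0.5, kd=0.05)
    setpoint_deg = -10.0  # desired roll angle for a gentle left turn
    measured_deg = 25.0   # gyro reading right after a knock
    correction = roll_pid.update(setpoint_deg, measured_deg, dt=0.002)
    # A mixer then maps the correction onto per-motor speeds, e.g.:
    #   motor_left  = base_throttle - correction
    #   motor_right = base_throttle + correction

So a knock just shows up as a large error, the loop drives the drone back to the setpoint within a few control cycles, and the learned policy never even sees the disturbance.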
[0] https://www.youtube.com/watch?v=MvRTALJp8DM