Drone Uses AI and 11,500 Crashes to Learn How to Fly
“Learning to Fly by Crashing,” a paper from CMU roboticists Dhiraj Gandhi, Lerrel Pinto, and Abhinav Gupta, has such a nice abstract that I’ll just let them explain what this research is all about:
[T]he gap between simulation and real world remains large especially for perception problems. The reason most research avoids using large-scale real data is the fear of crashes! In this paper, we propose to bite the bullet and collect a dataset of crashes itself! We build a drone whose sole purpose is to crash into objects [. . .] We use all this negative flying data in conjunction with positive data sampled from the same trajectories to learn a simple yet powerful policy for UAV navigation.
Cool, let’s get crashing!
One way to think of flying (or driving or walking or any other form of motion) is that success is simply a continual failure to crash. From this perspective, the most effective way of learning how to fly is by getting a lot of experience crashing so that you know exactly what to avoid, and once you can reliably avoid crashing, you by definition know how to fly. Simple, right? We tend not to learn this way, however, because crashing has consequences that are usually quite bad for both robots and people.
The CMU roboticists wanted to see if there are any benefits to using the crash approach instead of the not crash approach, so they sucked it up and let an AR Drone 2.0 loose in 20 different indoor environments, racking up 11,500 collisions over the course of 40 hours of flying time. As the researchers point out, “since the hulls of the drone are cheap and easy to replace, the cost of catastrophic failure is negligible.” Each collision is random, with the drone starting at a random location in the space and then flying slowly forward until it runs into something. After it does, it goes back to its starting point, and chooses a new direction. Assuming it survives, of course.
During this process, the drone’s forward-facing camera is recording images at 30 Hz. Once a collision happens, the images from the trajectory are split into two parts: the part where the drone was doing fine, and the part just before it crashes. These two sets of images are fed into a deep convolutional neural network (with ImageNet-pretrained weights as initialization for the network), which uses them to learn, essentially, whether a given camera image means that going straight is a good idea or not. After 11,500 collisions, the resulting algorithm is able to fly the drone autonomously, even in narrow, cluttered environments, around moving obstacles, and in the midst of featureless white walls and even glass doors. The algorithm that controls the drone is simple: It splits the image from the AR Drone’s forward camera into a left image and a right image, and if one of those two images looks less collision-y than going straight, the drone turns in that direction. Otherwise, it continues moving forward.
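For the code-minded, here is a rough Python sketch of the two ideas described above: labeling the frames of one crash trajectory, and the left/right/straight steering rule. It is not the authors' code. The names `label_trajectory`, `choose_action`, and `toy_safe_prob` are made up for illustration, `n_crash_frames` is an assumed parameter (the article doesn't say how many frames count as "just before the crash"), and `safe_prob` stands in for the trained network, which is assumed to return a higher score when going straight looks safer.

```python
from typing import Callable, List, Sequence, Tuple

# An image is modeled as rows of pixel values (an assumption for this sketch).
Image = Sequence[Sequence[float]]


def label_trajectory(frames: Sequence, n_crash_frames: int) -> List[Tuple[object, int]]:
    """Split one crash trajectory into positive (1) and negative (0) samples.

    The last `n_crash_frames` frames, recorded just before the collision,
    are labeled negative; everything earlier is labeled positive.
    `n_crash_frames` is a hypothetical knob, not a value from the paper.
    """
    cut = max(len(frames) - n_crash_frames, 0)
    positives = [(frame, 1) for frame in frames[:cut]]
    negatives = [(frame, 0) for frame in frames[cut:]]
    return positives + negatives


def choose_action(image: Image, safe_prob: Callable[[Image], float]) -> str:
    """Pick 'left', 'right', or 'forward' from a single camera image.

    The image is split into left and right halves. If either half scores
    as safer than the full (straight-ahead) image, turn toward the safer
    half; otherwise keep moving forward.
    """
    half = len(image[0]) // 2
    left = [row[:half] for row in image]
    right = [row[half:] for row in image]

    p_straight = safe_prob(image)
    p_left = safe_prob(left)
    p_right = safe_prob(right)

    if max(p_left, p_right) > p_straight:
        return "left" if p_left >= p_right else "right"
    return "forward"


def toy_safe_prob(image: Image) -> float:
    """Toy stand-in for the trained network (NOT a real model).

    Treats each pixel as 'obstacle-ness' in [0, 1] and returns
    1 minus the mean, so emptier images score as safer.
    """
    pixels = [value for row in image for value in row]
    return 1.0 - sum(pixels) / len(pixels)


if __name__ == "__main__":
    # Ten frames, with the last three treated as "just before the crash".
    print(label_trajectory(list(range(10)), n_crash_frames=3))
    # Obstacle fills the right half of the view.
    print(choose_action([[0, 0, 1, 1], [0, 0, 1, 1]], toy_safe_prob))
```

In a real system `safe_prob` would wrap the trained network's forward pass on the camera frame. The toy scorer is only there so the steering rule can be exercised without a model.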
How well does this work? It’s usually not as good as a human pilot, except in relatively complex environments, like narrow hallways or hallways with chairs. But compared to a baseline approach using monocular depth estimation, it’s massively better, somewhere between 2x and 10x the performance (in both time in the air and distance flown), depending on the environment. The biggest benefit comes from navigating around featureless walls and glass doors, both of which are notoriously challenging for depth estimation.
The obvious question to ask is whether this method is actually more effective than the alternative, which is teaching a drone to fly through not crashing instead. I’m not sure what the answer is, but the point is that if you allow crashing, the entire learning process can be self-supervised: Just set the drone up in a room and let it do its thing. You’ll have to change the batteries (and the hull, on occasion) but otherwise all of the data collection and learning is completely autonomous. If, on the other hand, you try to teach a drone to fly through not crashing, you have to find a way to make sure that it doesn’t crash. You can do that by learning from a human pilot, or putting it in some environment with a motion capture system and some 3D maps of obstacles and whatnot, but that adds cost and complexity. Crashing is so much easier.
“Learning to Fly by Crashing,” by Dhiraj Gandhi, Lerrel Pinto, and Abhinav Gupta from the Robotics Institute at Carnegie Mellon University, can be read in its entirety at the link below.
[ Paper ]