Aerial Robotics

 

At FRPG, we study perception and control of autonomous aerial robots. We also develop novel aerial platforms to solve challenging real-world problems.


Aerial Perception: State-of-the-art deep neural network (DNN)-based methods can solve a number of perception problems such as detection, classification and identification. We develop novel DNN-based methods that can be deployed on small computing hardware suitable for aerial platforms and can run on board in real time. To this end, our methods leverage communication between multiple aerial robots as well as contextual knowledge, such as mutual observation of the same target.

 

Novel Aerial Platforms: Certain real-world problems, such as monitoring animals in their natural habitat, require aerial platforms that are relatively silent and can operate for long durations without needing to recharge. To this end, we develop small autonomous airships (<10 m) that satisfy these needs while being able to carry payloads of up to 1 kg, sufficient for an on-board camera and a computer with a GPU.


Multi-Robot Systems

 

At FRPG, multi-robot systems form the backbone of most of our research. We study both centralized and distributed perception and control of robots in a team performing cooperative tasks. We especially focus on the intertwining of perception and control in this context.

Formation control: To achieve a required degree of perception quality, usually quantified as estimation uncertainty, a team of robots/agents needs to coordinate its actions and share independently obtained information among its members. To achieve this, we develop classical (model predictive control) and AI-based (reinforcement learning) methods that explicitly account for the perceptual gain in their objective function or rewards, respectively. At the same time, the formation control methods must respect environmental and task-related constraints such as collision avoidance and maintenance of angular configurations. To this end, we have introduced a novel force-function-based approach that keeps the formation objective convex. In other methods, we have focused on heterogeneous robot teams.
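
As a rough, self-contained illustration of such a perception-driven objective (a minimal single-step sketch with placeholder weights and a soft collision penalty, not our exact formulation):

    import numpy as np

    def formation_step_cost(control, robot_positions, predicted_target_cov,
                            w_ctrl=0.1, w_coll=10.0, min_dist=0.5):
        """Illustrative single-step, perception-driven formation cost."""
        # control:              (n_robots, 2) candidate velocity commands
        # robot_positions:      (n_robots, 2) predicted robot positions under 'control'
        # predicted_target_cov: (2, 2) predicted covariance of the fused target estimate
        cost = np.trace(predicted_target_cov)        # perceptual term: expected uncertainty
        cost += w_ctrl * np.sum(control ** 2)        # control effort
        n = len(robot_positions)
        for i in range(n):                           # soft inter-robot collision penalty
            for j in range(i + 1, n):
                d = np.linalg.norm(robot_positions[i] - robot_positions[j])
                cost += w_coll * max(0.0, min_dist - d) ** 2
        return cost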

 

Cooperative Perception: Fusion of information among the robots/agents in a team is fundamental for a coherent understanding of the environment and, consequently, for reliable decision making as a team. In this regard, we investigate and develop a range of methods for cooperative target tracking and cooperative localization of robots, as well as unified methods that perform tracking and localization simultaneously. Furthermore, we also investigate cooperative human and animal pose estimation methods. Our methods include both parametric (extended Kalman filter, EKF) and non-parametric (particle filter) approaches.
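
For intuition, a minimal sketch of fusing teammates' target observations with a Kalman filter update (the linear special case of an EKF; the constant-velocity state and identical noise model are simplifying assumptions, not our exact filter):

    import numpy as np

    def cooperative_kf_update(x, P, measurements, R_meas):
        """Sequentially fuse target position measurements received from teammates."""
        # x: (4,) target state [px, py, vx, vy]; P: (4, 4) state covariance
        # measurements: list of (2,) position observations, one per robot
        # R_meas: (2, 2) measurement noise covariance (assumed identical for all robots)
        H = np.array([[1., 0., 0., 0.],
                      [0., 1., 0., 0.]])     # each robot observes the target position only
        for z in measurements:
            y = z - H @ x                    # innovation
            S = H @ P @ H.T + R_meas         # innovation covariance
            K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
            x = x + K @ y
            P = (np.eye(4) - K @ H) @ P
        return x, P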


Animal Behavior and Motion Capture

 

Motion capture (MoCap) means estimating the pose and shape of a subject. Pose refers to the skeletal configuration (the positions of the major joints in 3D space), and shape can be visualized as a mesh of vertices on the subject's body. At FRPG, our core interest in this context is to study and develop vision-based methods for animal and human MoCap in outdoor, unknown and unstructured environments. These include, for example, wild animals in their natural habitat.
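
A minimal sketch of how such a MoCap output can be represented as a data structure (the joint and vertex counts are placeholders, chosen to resemble SMPL-like body models):

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class MoCapResult:
        """Illustrative MoCap output: skeletal pose plus a surface mesh."""
        joints: np.ndarray    # (J, 3) 3D positions of the major body joints
        vertices: np.ndarray  # (V, 3) mesh vertices describing the body shape
        faces: np.ndarray     # (F, 3) vertex indices forming the mesh triangles

    # Placeholder dimensions (e.g., 24 joints and 6890 vertices, as in SMPL-like models)
    result = MoCapResult(joints=np.zeros((24, 3)),
                         vertices=np.zeros((6890, 3)),
                         faces=np.zeros((13776, 3), dtype=int))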


Animal MoCap: Animal MoCap can help us understand (and quantitatively estimate) their behavior. Behaviors include primitive ones such as walking, running, foraging and complex ones such as interaction among peers and with the environment. By studying how their behavior varies over days, seasons and environmental conditions, it is likely that we can map behavioral changes in species to changes in the environment. Subsequently, this can allow policy makers and ecologists us to develop scientific evidence-based roadmaps for biodiversity preservation. To perform animal MoCap, we are developing multi-view methods that fuse information from aerial images of animals, obtained simultaneously from cameras mounted on multiple aerial vehicles that track and follow the animals.
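
The geometric core of the multi-view idea can be sketched as follows: a joint detected in images from two or more aerial cameras with known poses can be triangulated by a linear least-squares (DLT) solve. This illustrates the geometry only, not our full fusion pipeline:

    import numpy as np

    def triangulate_point(projection_matrices, pixel_points):
        """Linear (DLT) triangulation of one 3D point from multiple aerial views."""
        # projection_matrices: list of (3, 4) camera matrices P_i = K_i [R_i | t_i]
        # pixel_points:        list of (2,) pixel coordinates of the same joint
        A = []
        for P, (u, v) in zip(projection_matrices, pixel_points):
            A.append(u * P[2] - P[0])   # each view contributes two linear constraints
            A.append(v * P[2] - P[1])
        _, _, Vt = np.linalg.svd(np.asarray(A))
        X = Vt[-1]                      # homogeneous solution: last right singular vector
        return X[:3] / X[3]             # dehomogenize to a 3D point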
       
Animal Behavior: While one of the downstream goals of animal MoCap is to estimate behaviors, such as walking, running and drinking, automatically from images, we are first studying the effect of UAV noise (sound) on animal behavior. It is important to establish the distances/altitudes and aerial robot configurations that affect the natural behavior of the animals -- the very same behavior we wish to estimate using aerial robots. To this end, we perform extensive flight experiments around various species in their natural habitat and manually log changes in their behavior.


Animal Datasets: To develop vision-based methods for animal MoCap, we take a deep learning approach. Animals need to be detected, classified and identified in camera images. Training networks for these tasks requires a large amount of annotated data of real animals. To this end, we perform data collection at various locations, including the Wilhelma zoo in Stuttgart, Hortobagy National Park in Hungary and the Mpala conservancy in Kenya.
       

 


Human Motion Capture

 

Human MoCap: We have developed various methods for human MoCap from aerial images acquired simultaneously by multiple aerial robots. The methods range from optimization-based to end-to-end learning-based. Typically, we employ 2D joint detectors that provide measurements of the joints in the images. A body model, learned from a large number of human body scans, is used as a prior to predict these measurements, assuming arbitrary camera extrinsics. Thereafter, the body model parameters and the camera extrinsics are jointly optimized to explain the 2D measurements in the least-squares sense. The end-to-end method learns to directly predict the body model parameters using as input only the images from one aerial robot and compact measurements communicated to it by its teammates. We also develop methods for real-time execution of these MoCap methods on the aerial robot's on-board computer.
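
A minimal sketch of the optimization-based variant, assuming a generic parametric body model (body_model_joints, theta and beta are placeholders) and pinhole cameras; the squared sum of these residuals is what a least-squares solver would minimize jointly over body and camera parameters:

    import numpy as np

    def reprojection_residuals(theta, beta, extrinsics, intrinsics, detections,
                               body_model_joints):
        """Residuals for jointly fitting body model parameters and camera extrinsics."""
        # theta, beta:       pose and shape parameters of a parametric body model (placeholder)
        # extrinsics:        list of (R, t) per aerial camera
        # intrinsics:        list of (3, 3) camera matrices K
        # detections:        detections[i] is a (J, 2) array of 2D joint detections in camera i
        # body_model_joints: function (theta, beta) -> (J, 3) 3D joint positions
        joints_3d = body_model_joints(theta, beta)
        residuals = []
        for (R, t), K, dets in zip(extrinsics, intrinsics, detections):
            cam_pts = joints_3d @ R.T + t            # transform joints into the camera frame
            proj = cam_pts @ K.T                     # pinhole projection
            proj = proj[:, :2] / proj[:, 2:3]
            residuals.append((proj - dets).ravel())  # 2D reprojection error per joint
        return np.concatenate(residuals)             # e.g., fed to scipy.optimize.least_squares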

 


Reinforcement Learning

 

At FRPG, we investigate reinforcement learning (RL)-based methods for aerial robot control, ranging from low-level control to high-level guidance. A special emphasis is put on developing hybrid approaches that combine RL with classical control to preserve stability properties. Another key focus is the interpretability of the RL policy and of hyper-parameter selection.


RL-based Formation Control: For many complex perception problems, such as human or animal MoCap, it is hard to derive a reliable objective function that captures the perception uncertainty. Such models can be not only difficult to derive but also highly non-convex. We therefore take a deep RL (DRL) approach to the perception-driven formation control problem. The key idea is to cast it as a sequential decision-making problem and to learn policies for each robot/agent in the team that maximize the joint MoCap accuracy -- embedded as rewards only during training. For human MoCap, we learned navigation policies for multiple aerial robots purely in simulation and deployed them on real robots. Here, the observations were joint detections on the 2D images obtained by the robots. Ongoing work involves end-to-end DRL methods that overcome the need for joint detections, and a DRL approach for animal MoCap.
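
As a rough illustration of what "accuracy embedded as a training-only reward" can look like (the error metric, weights and collision penalty are placeholders, not our actual reward):

    import numpy as np

    def training_reward(estimated_joints, ground_truth_joints, collided,
                        w_acc=1.0, w_coll=5.0):
        """Illustrative per-step reward for perception-driven formation control."""
        # Ground-truth joints are available only in simulation during training;
        # at deployment the policy acts from its observations alone.
        errors = np.linalg.norm(estimated_joints - ground_truth_joints, axis=-1)
        reward = w_acc * np.exp(-np.mean(errors))   # higher reward for lower joint error
        if collided:
            reward -= w_coll                        # penalize inter-robot or obstacle collisions
        return reward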

 

RL-based Guidance and Control: In this context, we develop various RL- and DRL-based methods for tasks including airship control, autonomous landing of multi-rotors on moving platforms, autonomous soaring of gliders, and collision avoidance in large teams of robots. For airship control, we have developed a DRL method that combines a classical PID controller with learning and leverages the PID's stability properties. For autonomous landing, our RL approach learns an optimal policy for the 2D horizontal motion of the robot, with a special focus on interpretable hyper-parameter derivation. For autonomous soaring, we develop DRL-based approaches that maximize the exploitation of updrafts while balancing it against a simultaneous waypoint navigation task. In the context of collision avoidance, we explore attention networks that allow scalability to very large numbers of robots and interpretability of the RL policy.
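
One common way to combine a classical PID controller with a learned policy is to let the policy add a bounded residual to the PID command, so that the PID's stabilizing behavior remains dominant. The sketch below illustrates that general scheme under assumed gains and bounds; it is not our specific airship controller:

    import numpy as np

    class HybridPIDPolicy:
        """Classical PID command plus a bounded, learned residual (illustrative)."""

        def __init__(self, policy, kp=1.0, ki=0.1, kd=0.5, residual_limit=0.2, dt=0.02):
            self.policy = policy                  # learned policy: observation -> residual action
            self.kp, self.ki, self.kd = kp, ki, kd
            self.residual_limit = residual_limit  # bound on the learned correction
            self.dt = dt
            self.integral = 0.0
            self.prev_error = 0.0

        def command(self, error, observation):
            self.integral += error * self.dt
            derivative = (error - self.prev_error) / self.dt
            self.prev_error = error
            u_pid = self.kp * error + self.ki * self.integral + self.kd * derivative
            residual = np.clip(self.policy(observation),
                               -self.residual_limit, self.residual_limit)
            return u_pid + residual               # bounded residual keeps the PID dominant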

 


Simultaneous Localization and Mapping (SLAM)

 

SLAM is one of the most fundamental and classical topics in robotics. At FRPG, we study SLAM through the lens of combining perception with action -- which in the literature is referred to as 'active SLAM'. We focus on both novel methodological (algorithmic) and system-focused approaches to active SLAM.


Active SLAM methods: In this context, we focus on how a robot can efficiently explore and map its environment while minimizing the control effort and the distance it travels to do so. We introduced a method that splits activeness over global and local scales: at the global level, the method pre-computes optimal viewpoints, which are then locally refined at every subsequent waypoint. We have also introduced novel utility functions that directly account for obstacles and path lengths.
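
As a minimal sketch of what such a utility can look like (the terms and weights are placeholders, not the exact functions we propose), a candidate viewpoint can be scored by trading off expected information gain against travel distance and obstacle proximity:

    def viewpoint_utility(info_gain, path_length, obstacle_clearance,
                          w_info=1.0, w_path=0.2, w_obs=0.5, min_clearance=0.3):
        """Illustrative utility for ranking candidate viewpoints in active SLAM."""
        # info_gain:          expected map information gain at the viewpoint
        # path_length:        planned travel distance from the current robot pose
        # obstacle_clearance: distance to the nearest obstacle along the planned path
        clearance_penalty = max(0.0, min_clearance - obstacle_clearance)
        return w_info * info_gain - w_path * path_length - w_obs * clearance_penalty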

 

Novel robots for active SLAM: Here we approach active SLAM from a systems perspective. We have developed a robotic platform with an independently moving camera that makes our previously developed active SLAM method generalizable to all kinds of robot chassis (holonomic or not), while further reducing energy consumption. A combined state estimation method for the robot and the independently moving camera was developed as part of this solution.