WildCap

Funding Source: Cyber Valley

 

Open Source Code:

 

Motivation: Inferring animal behavior, e.g., whether animals are standing, grazing, running or interacting with each other and with their environment, is a fundamental requirement for addressing the most important ecological problems today. At the same time, estimating the 3D pose and shape of animals in real time could directly address several other problems, such as disease diagnosis and health profiling, and could enable very high-resolution behavior inference. Doing both of these in the wild, without any markers or sensors on the animal, is an extremely challenging problem. State-of-the-art methods for animal behavior, pose and shape estimation either require sensors or markers on the animals (e.g., GPS collars and IMU tags), or rely on camera traps fixed in the animal's environment. Not only do these methods endanger the animals through tranquilization and physical interference, they are also difficult to scale to larger numbers of animals, vaster environments and longer time periods. In WildCap, we are developing autonomous methods for estimating the behavior, pose and shape of endangered wild animals that address the aforementioned issues. Our methods do not require any physical interference with the animals. Our novel approach is to develop a team of intelligent, autonomous, vision-based aerial robots that detect, track and follow wild animals and perform behavior, pose and shape estimation.

 

Goals and Objectives:

WildCap's goal is to achieve continuous, accurate and on-board inference of behavior, pose and shape of animal species from multiple, unsynchronized and close-range aerial images acquired in the animal's natural habitat, without any sensors or markers on the animal, and without modifying the environment. In pursuit of the above goal, the key objectives of this project are

  1. Methods for animal behavior, pose and shape estimation from multiple unsynchronized images.
  2. Development of novel aerial platforms for tracking and following animals.
  3. Formation control methods for multiple aerial robots to maximize the accuracy of animal behavior, pose and shape inference.

Methodology: Aerial robots with long autonomy time and high payload capacity are critical for continuous, long-distance tracking of animals in the wild. To this end, we are developing novel systems, in particular lighter-than-air vehicles, that address these requirements. Furthermore, we are developing formation control strategies for such vehicles that maximize the visual coverage of the animals and the accuracy of their state estimates. Finally, we are leveraging learning-in-simulation methods to develop algorithms for animal behavior, pose and shape estimation.
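
As a rough illustration of what maximizing the accuracy of the state estimates can mean geometrically, the sketch below scores a candidate placement of aerial robots by how well spread their viewing directions are around the animal; near-parallel views triangulate poorly. This is a simple hypothetical heuristic for illustration only, not the formation controller developed in the project (see [2] for the actual viewpoint-driven approach).

```python
import numpy as np

def viewpoint_score(robot_positions, animal_position):
    """Higher is better: penalize pairs of robots viewing the animal from similar directions."""
    dirs = robot_positions - animal_position
    dirs = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    penalty = 0.0
    for i in range(len(dirs)):
        for j in range(i + 1, len(dirs)):
            # cos^2 of the angle between viewing rays: 1 for parallel views, 0 for orthogonal
            penalty += float(np.dot(dirs[i], dirs[j])) ** 2
    return -penalty

# Example: three candidate aerial-robot positions around an animal at the origin.
candidates = np.array([[10.0, 0.0, 8.0], [0.0, 10.0, 8.0], [-7.0, -7.0, 8.0]])
print(viewpoint_score(candidates, np.zeros(3)))
```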

 

Publications:

 

[1] Price, E., Khandelwal P. C., Rubenstein D. I., and Ahmad, A (2023).  A Framework for Fast, Large-scale, Semi-Automatic Inference of Animal Behavior from Monocular Videos, bioRxiv 2023.07.31.551177; doi: https://doi.org/10.1101/2023.07.31.551177   

[2] Price, E., Black, M., & Ahmad, A. (2023) Viewpoint-driven Formation Control of Airships for Cooperative Target Tracking, IEEE Robotics and Automation Letters, vol. 8, no. 6, pp. 3653-3660, June 2023. doi: https://doi.org/10.1109/LRA.2023.3264727 or at https://arxiv.org/abs/2209.13040
 
[3] Bonetto, E. & Ahmad, A. (2023) Synthetic Data-based Detection of Zebras in Drone Imagery, IEEE European Conference on Mobile Robots (ECMR 2023) (Accepted June 2023). Preprint available at https://is.mpg.de/uploads_file/attachment/attachment/718/ECMR_Zebra.pdf

[4] Price, E. & Ahmad, A. (2023) Accelerated Video Annotation driven by Deep Detector and Tracker, 18th International Conference on Intelligent Autonomous Systems (IAS 18) (Accepted April 2023). Preprint available at https://arxiv.org/abs/2302.09590

[5] Zuo, Y., Liu, Y. T. and Ahmad A. (2023), "Autonomous Blimp Control Via H∞ Robust Deep Residual Reinforcement Learning," 19th IEEE International Conference on Automation Science and Engineering (CASE), Auckland, New Zealand. Preprint available at https://arxiv.org/abs/2303.13929.

[6] Liu, Y. T., Price, E., Black, M.J., Ahmad, A. (2022) Deep Residual Reinforcement Learning based Autonomous Blimp Control, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Kyoto, Japan, pp. 12566-12573. https://doi.org/10.1109/IROS47612.2022.9981182
 
[7] Saini, N., Bonetto, E., Price, E., Ahmad, A., & Black, M. J. (2022). AirPose: Multi-View Fusion Network for Aerial 3D Human Pose and Shape Estimation. IEEE Robotics and Automation Letters, 7(2), 4805–4812. https://doi.org/10.1109/LRA.2022.3145494

[8] Price, E., Liu, Y. T., Black, M.J., Ahmad, A. (2022). Simulation and Control of Deformable Autonomous Airships in Turbulent Wind. In: Ang Jr, M.H., Asama, H., Lin, W., Foong, S. (eds) Intelligent Autonomous Systems 16 (IAS 2021). Lecture Notes in Networks and Systems, vol 412. Springer, Cham. https://doi.org/10.1007/978-3-030-95892-3_46


AirCap

Funding Source: Max Planck Institute Grassroots Funding

 

Motivation: Human pose tracking and full-body pose estimation and reconstruction in outdoor, unstructured environments are highly relevant and challenging problems. The wide range of applications includes search and rescue, managing large public gatherings, and coordinating outdoor sports events. In indoor settings, similar applications usually make use of body-mounted sensors, artificial markers and static cameras. While such markers might still be usable in outdoor scenarios, dynamic ambient lighting conditions and the impossibility of having environment-fixed cameras make the overall problem difficult. Moreover, body-mounted sensors are not feasible in several situations (e.g., large crowds of people). Therefore, our approach to the aforementioned problem involves a team of micro aerial vehicles (MAVs) tracking subjects by using only on-board monocular cameras and computational units, without any subject-fixed sensor or marker.

 

Goals and Objectives:

AirCap's goal is to achieve markerless, unconstrained, human motion capture (mocap) in unknown and unstructured outdoor environments. To that end, our objectives are

  • to develop an autonomous flying motion capture system using a team of micro aerial vehicles (MAVs) with only on-board, monocular RGB cameras [1] [3] [4].
  • to use the images captured by these robots for human body pose and shape estimation with sufficiently high accuracy [2].

Methodology:

  • Control: Autonomous MoCap systems rely on robots with on-board cameras that can localize and navigate autonomously. More importantly, these robots must detect, track and follow the subject (human or animal) in real time. Thus, a key component of such a system is motion planning and control of multiple robots that ensures optimal perception of the subject while obeying other constraints, e.g., inter-robot and static obstacle collision avoidance. Our approach to this formation control problem is based on model predictive control (MPC). An important challenge is handling collision avoidance, since the constraint itself is non-convex and leads to local minima that are not easily identifiable. A common approach is to treat collision avoidance as a separate planning module that modifies the MPC-generated trajectory using potential fields; this, however, leads to sub-optimal trajectories and potential-field local minima. In our work [5] we provide a holistic solution to this problem. Instead of using repulsive potential field functions directly as constraints, we evaluate them at every MPC iteration and treat the resulting forces as external inputs in the system dynamics, so the optimization problem remains convex at every time step (a minimal sketch of this formulation follows this list). As long as a feasible solution exists for the optimization, obstacle avoidance is guaranteed. Field local minima issues remain, but they become easier to identify and resolve; we propose and validate multiple strategies for this.

    We further address the complete problem of perception-driven formation control of multiple aerial vehicles for tracking a human [3]. For this, a decentralized convex MPC is developed that generates collision-free formation motion plans while minimizing the jointly estimated uncertainty in the tracked person's position estimate. This estimation is performed using a cooperative approach [1] similar to the one developed in previous work in our group [6]. We validated the real-time efficacy of the proposed algorithm through several field experiments with three self-designed octocopters and through simulation experiments in a realistic outdoor setting with up to 16 robots.
  • Perception: The perception functionality of AirCap is split into two phases, namely, i) online data acquisition, and ii) offline pose and shape estimation. During the online data acquisition phase, the MAVs detect and track the 3D position of a subject while following them. To this end, they perform online and on-board detection using a deep neural network (DNN)-based detector. DNNs often fail at detecting small-scale objects or those that are far away from the camera, which are typical in scenarios with aerial robots. In our solution [1], mutual world knowledge about the tracked person is jointly acquired by our multi-MAV system during cooperative person tracking. Leveraging this, our method actively selects the region of interest (ROI) in each MAV's image that supplies the highest information content (a sketch of this ROI selection follows this list). Our method not only reduces the information loss incurred by down-sampling the high-resolution images, but also increases the chance of the tracked person being completely in the field of view (FOV) of all MAVs. The data acquired in the online data acquisition phase consists of images captured by all MAVs (see, for example, the left image above) and their estimated camera extrinsic and intrinsic parameters.

    In the second phase, which is offline, the human pose and shape are estimated as functions of time using only the acquired RGB images and the MAVs' self-localization (the camera extrinsics). Using state-of-the-art methods like VNect and HMR alone, one obtains only a noisy 3D estimate of the human pose. Our approach [2] instead exploits multiple noisy 2D body-joint detections and the noisy camera pose information: we optimize for body shape, body pose and camera extrinsics by fitting the SMPL body model to the 2D observations. This approach uses a strong body model to take low-level uncertainty into account and results in the first fully autonomous flying mocap system.
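
Below is a minimal, self-contained sketch of the convex-MPC formulation described in the Control item above. It is illustrative only, not the AirCap implementation: the double-integrator dynamics, horizon length, gains and the cvxpy dependency are assumptions made for the example.

```python
# Sketch: convex MPC where the repulsive potential-field force is evaluated once per
# iteration and injected as a known external input, so the optimization stays convex.
import numpy as np
import cvxpy as cp

dt, N = 0.1, 20                                  # illustrative horizon settings
A = np.block([[np.eye(2), dt * np.eye(2)], [np.zeros((2, 2)), np.eye(2)]])
B = np.vstack([0.5 * dt**2 * np.eye(2), dt * np.eye(2)])

def repulsive_force(p, obstacle, gain=2.0, radius=3.0):
    """Classic repulsive potential-field force, evaluated (not optimized over)."""
    d = p - obstacle
    dist = max(np.linalg.norm(d), 1e-6)
    if dist >= radius:
        return np.zeros(2)
    return gain * (1.0 / dist - 1.0 / radius) * d / dist**3

def mpc_step(x0, goal, obstacle):
    # Evaluate the field at the current position; hold it constant over the horizon.
    f_ext = repulsive_force(x0[:2], obstacle)
    x = cp.Variable((N + 1, 4))
    u = cp.Variable((N, 2))
    cost, cons = 0, [x[0] == x0]
    for k in range(N):
        cons += [x[k + 1] == A @ x[k] + B @ (u[k] + f_ext)]   # force as external input
        cons += [cp.norm(u[k], "inf") <= 5.0]
        cost += cp.sum_squares(x[k + 1, :2] - goal) + 0.1 * cp.sum_squares(u[k])
    cp.Problem(cp.Minimize(cost), cons).solve()
    return u.value[0]

print(mpc_step(np.zeros(4), goal=np.array([5.0, 5.0]), obstacle=np.array([2.0, 2.0])))
```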

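The following sketch illustrates the ROI-selection idea from the Perception item above: the cooperatively estimated 3D position of the person (and its uncertainty) is projected into each MAV's camera to choose a crop for the on-board detector. The pinhole projection, the uncertainty-to-pixels heuristic and the parameter values are assumptions for illustration; they are not the exact method of [1].

```python
import numpy as np

def project(K, R, t, p_world):
    """Project a 3D world point into pixel coordinates with a pinhole model."""
    p_cam = R @ p_world + t
    uvw = K @ p_cam
    return uvw[:2] / uvw[2]

def select_roi(K, R, t, p_est, sigma_est, img_shape, min_size=224):
    """Center the ROI on the projected estimate; scale it with the estimate's uncertainty."""
    center = project(K, R, t, p_est)
    # Rough pixel-space uncertainty: world-space std. dev. scaled by focal length / depth.
    depth = (R @ p_est + t)[2]
    half = max(min_size // 2, int(3 * sigma_est * K[0, 0] / depth))
    h, w = img_shape
    u0 = int(np.clip(center[0] - half, 0, w - 1))
    v0 = int(np.clip(center[1] - half, 0, h - 1))
    u1 = int(np.clip(center[0] + half, 1, w))
    v1 = int(np.clip(center[1] + half, 1, h))
    return u0, v0, u1, v1   # crop to be passed to the on-board person detector

# Example with an identity camera pose and a 1 m uncertainty on the person estimate.
K = np.array([[1200.0, 0, 960], [0, 1200.0, 540], [0, 0, 1]])
print(select_roi(K, np.eye(3), np.zeros(3), np.array([0.0, 0.0, 20.0]), 1.0, (1080, 1920)))
```
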
Publications:

 

[1] Deep Neural Network-based Cooperative Visual Tracking through Multiple Micro Aerial Vehicles, Price, E., Lawless, G., Ludwig, R., Martinovic, I., Buelthoff, H. H., Black, M. J., Ahmad, A., IEEE Robotics and Automation Letters, Robotics and Automation Letters, 3(4):3193-3200, IEEE, October 2018, Also accepted and presented in the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[2]  Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles, Saini, N., Price, E., Tallamraju, R., Enficiaud, R., Ludwig, R., Martinović, I., Ahmad, A., Black, M., Proceedings 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages: 823-832, IEEE, October 2019

[3] Active Perception based Formation Control for Multiple Aerial Vehicles, Tallamraju, R., Price, E., Ludwig, R., Karlapalem, K., Bülthoff, H. H., Black, M. J., Ahmad, A., IEEE Robotics and Automation Letters, Robotics and Automation Letters, 4(4):4491-4498, IEEE, October 2019

[4] AirCap – Aerial Outdoor Motion Capture, Ahmad, A., Price, E., Tallamraju, R., Saini, N., Lawless, G., Ludwig, R., Martinovic, I., Bülthoff, H. H., Black, M. J.
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2019), Workshop on Aerial Swarms, November 2019.

[5] Decentralized MPC based Obstacle Avoidance for Multi-Robot Target Tracking Scenarios, Tallamraju, R., Rajappa, S., Black, M. J., Karlapalem, K., Ahmad, A.
2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pages: 1-8, IEEE, August 2018

[6] An Online Scalable Approach to Unified Multirobot Cooperative Localization and Object Tracking, Ahmad, A., Lawless, G., Lima, P., IEEE Transactions on Robotics (T-RO), 33, pages: 1184 - 1199, October 2017

 


AeRoShip

Funding Source: University of Stuttgart

 

Motivation: Lighter-than-air vehicles (LTAVs), such as rigid and non-rigid dirigibles (airships), are a clearly superior choice for applications like wildlife monitoring, sports event capture and continuous patrolling of forest regions for anti-poaching tasks. LTAVs are also uniquely suited as aerial communication relays and for the wildlife monitoring and conservation tasks that we address in our project WildCap. Airships produce little noise, have high energy efficiency and long flight times, display benign collision and crash characteristics, pose low danger and cause little environmental impact. However, their comparably high handling complexity, size, lifting-gas requirements and cost create an entry barrier for researchers. Unlike for heavier-than-air drones, there are no off-the-shelf flight controllers that support autonomous dirigible flight. Therefore, guidance and control algorithms have to be implemented for each vehicle, even though various suitable control strategies can be found in the literature. Similar to rotorcraft and fixed-wing UAVs, dirigibles come in many actuator arrangements: fixed or vectoring main thrusters, differential thrust, different tail-fin arrangements and auxiliary thrusters, single or double hulls, etc. Thus, a control algorithm for a specific vehicle is not always applicable to others.

 

Goals and Objectives:

In project AeRoShip, our goal is to develop a team of autonomous airships capable of monitoring wildlife over long durations. To this end, our objectives are

  • To develop robotic airship hardware and flight control methods.
  • To develop novel planning and navigation methods for single and multiple airships performing visual monitoring in an obstacle-rich environment.

Methodology:

  • Fixed-wing and multirotor UAVs are common in the field of robotics, and solutions for simulating and controlling these vehicles are ubiquitous. This is not the case for airships, whose simulation needs to address unique properties: i) dynamic deformation in response to aerodynamic and control forces, ii) high susceptibility to wind and turbulence at low airspeed, and iii) high variability in airship designs regarding the placement, direction and vectoring of thrusters and control surfaces. In our most recent work [1], we are designing a flexible framework for modeling, simulation and control of airships, based on the Robot Operating System (ROS) and the Gazebo simulation environment, both open source, together with commercial off-the-shelf (COTS) electronics. Based on simulated wind and deformation, we predict substantial effects on controllability, which we verified in real-world flight experiments. All our code is shared as open source, for the benefit of the community and to facilitate lighter-than-air vehicle (LTAV) research; it can be found at https://github.com/robot-perception-group/airship_simulation. A toy dynamics sketch illustrating property ii) follows this list.

  • In addition to classical methods, we are also investigating reinforcement-learning-based approaches for autonomous airship control. An airship's dynamics are complex: due to its size, the aerodynamic drag on its hull is large, which leads to significant time delays and high sensitivity to disturbances. The size and shape of an airship also depend on the properties of the surrounding air, and changes in the environment often lead to hull deformation and buoyancy changes. Furthermore, an airship often has no direct control over its lateral movement; our insight is that the lateral error can be compensated by sophisticated long-term planning. One of our goals in this project is therefore to derive a robust control policy that can handle such complex dynamics while still achieving high performance (a residual-control sketch also follows below).
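
The toy model below illustrates property ii) from the first item above: at low airspeed, wind dominates the relative airflow and therefore the drag force acting on the hull. It is a deliberately crude point-mass sketch with made-up parameters, not the deformable-hull model released with [1].

```python
import numpy as np

def airship_step(pos, vel, thrust, wind, dt=0.05,
                 mass=6.0, buoyancy=58.0, weight=58.8, drag_coeff=1.2):
    """Advance position/velocity by one Euler step. All parameter values are illustrative."""
    airspeed = vel - wind                                  # velocity relative to the air
    drag = -drag_coeff * np.linalg.norm(airspeed) * airspeed
    lift = np.array([0.0, 0.0, buoyancy - weight])         # net buoyant force (slightly heavy)
    acc = (thrust + drag + lift) / mass
    return pos + vel * dt, vel + acc * dt

pos, vel = np.zeros(3), np.zeros(3)
for _ in range(100):   # hover command into a 3 m/s crosswind for 5 s
    pos, vel = airship_step(pos, vel, thrust=np.array([0.0, 0.0, 0.8]),
                            wind=np.array([3.0, 0.0, 0.0]))
print(pos)   # the vehicle drifts noticeably downwind despite the hovering thrust
```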

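The sketch below shows the residual-control idea behind [2]: a learned policy only adds a bounded correction on top of a simple baseline controller. The controller structure, observation layout and the untrained placeholder policy are assumptions made for illustration, not our released controller.

```python
import numpy as np

def baseline_controller(error, gains=(0.8, 0.2)):
    """Simple proportional-derivative command toward the waypoint (placeholder)."""
    kp, kd = gains
    return kp * error["position"] + kd * error["velocity"]

def residual_policy(observation):
    """Stand-in for the trained neural-network policy; returns a bounded correction."""
    return np.tanh(np.zeros_like(observation["error"]["position"]))  # zeros before training

def control(observation, residual_scale=0.3):
    u_base = baseline_controller(observation["error"])
    u_res = residual_policy(observation)
    return u_base + residual_scale * u_res   # the RL policy only learns the residual

obs = {"error": {"position": np.array([5.0, 0.0, 1.0]), "velocity": np.array([-0.5, 0.0, 0.0])}}
print(control(obs))
```
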
Publications:

 

[1] Price, E., Liu, Y. T., Black, M. J., & Ahmad, A. (2022). Simulation and Control of Deformable Autonomous Airships in Turbulent Wind. In M. H. Ang Jr, H. Asama, W. Lin, & S. Foong (Eds.), Intelligent Autonomous Systems 16 (pp. 608-626). Springer International Publishing. https://doi.org/10.1007/978-3-030-95892-3_46

[2] Liu, Y.T., Price, E., Black, M.J., Ahmad, A. (2022). Deep Residual Reinforcement Learning based Autonomous Blimp Control, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Oct 2022.

 


Autonomous Soaring

Funding Source: Cyber Valley

 

Motivation: Aeronautics is currently undergoing a major transition worldwide. The pandemic has shown that airline traffic can be drastically reduced, with an extremely beneficial effect on atmospheric emissions. At the same time, there are large investments and developments in the field of individual mobility, such as urban air mobility vehicles and unmanned aircraft applications, e.g., for medical transportation or rescue purposes. The majority of these aircraft are based on electric propulsion, and thus range and endurance are quite limited. In the case of fixed-wing aircraft, these parameters can be significantly improved by exploiting (harvesting) energy from the atmospheric environment. An extreme example is conventional soaring, i.e., glider flight, in which a pilot combines experience, skills, knowledge and perception in a decision-making process such that updrafts are detected and exploited to maximize flight range while keeping situational awareness at all times. These tasks are very complex and can only be accomplished by highly trained pilots. The objective of this work is to find systematic approaches to autonomously maximize the exploitation of environmental energy in flights with small fixed-wing aircraft (unmanned or manned, < 2 t), while, at the same time, minimizing the flight duration for a required distance. The underlying problem is the trade-off between short-term rewarding actions (covering some distance) and actions that are expected to pay off in the long term (mapping and exploiting atmospheric updrafts), while navigating in a complex, particularly hard-to-model environment. This constitutes a challenging decision-making problem, and autonomous soaring serves as a test scenario for it.

 

Goals and Objectives:

  • The application-related objective of this project is to find systematic approaches to autonomously maximize the exploitation of environmental energy in flight, while, at the same time, minimizing the flight duration for a required distance. This is fundamental to all-electric fixed-wing aircraft (unmanned or manned, < 2 t) and can lead to significant energy savings.
  • The associated scientific objective is to manage the trade-off between short-term rewarding actions (i.e., pursuing the prime mission objective of covering some distance) and actions that are expected to pay off in the long term (i.e., mapping and exploiting atmospheric updrafts), while navigating in a complex, particularly hard-to-model environment. A solution to this would bring autonomous capabilities close to, or even beyond, human performance.

Publications:

[1] Stefan Notter, Fabian Schimpf, Gregor Müller, and Walter Fichter, Hierarchical Reinforcement Learning Approach for Autonomous Cross-Country Soaring, Journal of Guidance, Control, and Dynamics 2023 46:1, 114-126

 


AirCapRL

Funding Source: Max Planck Institute Grassroots Funding

 

Motivation: Realizing an aerial motion capture (MoCap) system for humans or animals involves several challenges. The system's robotic front-end [1,3] must ensure that the subject is i) accurately and continuously followed by all aerial robots (UAVs), and ii) within the field of view (FOV) of the cameras of all robots. The back-end [2] of the system estimates the 3D pose and shape of the subject using the images and other data acquired by the front-end. The front-end poses a formation control problem for multiple UAVs, while the back-end requires an estimation method.

 

In existing solutions for outdoor MoCap systems, including ours [1] [2] [3], the front end and back end are developed independently. The formation control algorithms of existing aerial MoCap front ends assume that their objective should be to center the person in every UAV's camera image and to keep the person within a threshold distance of each UAV. These assumptions are intuitive and important, and experiments have shown that they lead to good MoCap estimates. However, without any feedback from the human pose estimation back-end, such a front end remains sub-optimal, because the estimated human pose depends strongly on the viewpoints of the UAVs.

 

Goals and Objectives:

In this project, our goal is to develop an end-to-end approach for human and animal motion capture: a MoCap system in which the UAV-based front-end is completely driven by the back-end's accuracy.

To this end, our objectives are

  • to develop a formation control method for multiple UAVs that is driven by the performance of the MoCap back end.
  • to develop an integrated end-to-end method, where the UAV formation control policies are learned based on the back-end's estimation accuracy, while at the same time the back-end improves its estimation accuracy using the images acquired by the front-end.

Methodology: In our work [4], we introduce a deep reinforcement learning (RL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap). We focus on vision-based MoCap, where the objective is to estimate the trajectory of body pose and shape of a single moving person using multiple micro aerial vehicles. State-of-the-art solutions to this problem are based on classical control methods, which depend on hand-crafted system and observation models. Such models are difficult to derive and to generalize across different systems. Moreover, the non-linearity and non-convexity of these models lead to sub-optimal controls. We formulate this problem as a sequential decision-making task to achieve the vision-based motion capture objectives, and solve it using a deep neural network-based RL method. We leverage proximal policy optimization (PPO) to train a stochastic decentralized control policy for formation control. The neural network is trained in a parallelized setup in synthetic environments. We performed extensive simulation experiments to validate our approach. Finally, real-robot experiments demonstrate that our policies generalize to real-world conditions.
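
As a rough illustration of the training objective, the following PyTorch sketch computes the clipped PPO surrogate loss for a small Gaussian policy on random placeholder data. The network size, observation and action dimensions, and the data itself are assumptions; the actual AirCapRL policy, observations and rewards are described in [4].

```python
import torch

def ppo_loss(policy, observations, actions, old_log_probs, advantages, clip_eps=0.2):
    """Clipped PPO policy loss for one mini-batch."""
    dist = policy(observations)                       # e.g. a Gaussian over MAV velocity commands
    log_probs = dist.log_prob(actions).sum(dim=-1)
    ratio = torch.exp(log_probs - old_log_probs)
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
    return -torch.min(ratio * advantages, clipped * advantages).mean()

class GaussianPolicy(torch.nn.Module):
    """Tiny placeholder policy: local observation in, Gaussian action distribution out."""
    def __init__(self, obs_dim=16, act_dim=2):
        super().__init__()
        self.net = torch.nn.Sequential(torch.nn.Linear(obs_dim, 64), torch.nn.Tanh(),
                                       torch.nn.Linear(64, act_dim))
        self.log_std = torch.nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs):
        return torch.distributions.Normal(self.net(obs), self.log_std.exp())

# Usage with random placeholder data (batch of 32 local observations, 2-D actions).
policy = GaussianPolicy()
obs = torch.randn(32, 16)
with torch.no_grad():
    d = policy(obs)
    act = d.sample()
    old_lp = d.log_prob(act).sum(dim=-1)
adv = torch.randn(32)                # advantages would come from MoCap-accuracy rewards
loss = ppo_loss(policy, obs, act, old_lp, adv)
loss.backward()
```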

 

Publications:

 

[1] Deep Neural Network-based Cooperative Visual Tracking through Multiple Micro Aerial Vehicles, Price, E., Lawless, G., Ludwig, R., Martinovic, I., Buelthoff, H. H., Black, M. J., Ahmad, A., IEEE Robotics and Automation Letters, Robotics and Automation Letters, 3(4):3193-3200, IEEE, October 2018, Also accepted and presented in the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[2]  Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles, Saini, N., Price, E., Tallamraju, R., Enficiaud, R., Ludwig, R., Martinović, I., Ahmad, A., Black, M., Proceedings 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pages: 823-832, IEEE, October 2019

[3] Active Perception based Formation Control for Multiple Aerial Vehicles, Tallamraju, R., Price, E., Ludwig, R., Karlapalem, K., Bülthoff, H. H., Black, M. J., Ahmad, A., IEEE Robotics and Automation Letters, Robotics and Automation Letters, 4(4):4491-4498, IEEE, October 2019

[4]  AirCapRL: Autonomous Aerial Human Motion Capture Using Deep Reinforcement Learning, Tallamraju, R., Saini, N., Bonetto, E., Pabst, M., Liu, Y. T., Black, M., Ahmad, A., IEEE Robotics and Automation Letters, IEEE Robotics and Automation Letters, 5(4):6678 - 6685, IEEE, October 2020, Also accepted and presented in the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).


ActiveSLAM

Funding Source: Max Planck Institute and the University of Stuttgart

 

Motivation: Robots capable of helping humans with everyday tasks, whether in workplaces or homes, are rapidly becoming popular. To be fully functional companions, robots should be able to navigate and map unknown environments seamlessly and efficiently: quickly, using little energy and without unnecessary navigation. Simultaneous localization and mapping (SLAM) has mainly been developed as a passive process, in which robots simply follow external control inputs, are controlled directly by humans, or exploit previous knowledge through predefined waypoints or landmarks. Active SLAM, on the other hand, refers to an approach in which robots exploit their sensor measurements to take control decisions that increase map information, while simultaneously performing other user-defined tasks in an energy-efficient way.

 

Goals and Objectives:

We envision ActiveSLAM as a long-term project in which we develop novel active SLAM methods suitable for various robotic platforms: indoor or outdoor, ground-based or aerial, and single-robot or multi-robot systems.

Methodology: In our most recent work in ActiveSLAM, we introduce an active visual SLAM approach for our omnidirectional robot 'Robotino' [1]. It focuses both on exploring the unknown parts of the environment and on re-observing already mapped areas to improve the so-called 'coverage information' and thus the overall map quality. We employ activeness at two different levels. The first acts on global planning through informative path planning: we select the best path and the camera headings at every waypoint along that path using the information provided by the global occupancy map. The second influences only the short-term movement, using the real-time local distribution of 3D visual features. Inside the utility function, we use Shannon's entropy measure and balance between exploration and coverage behaviors. By exploiting all the available information to drive the camera direction (since our robot is omnidirectional), we maximize the amount of information gathered during the robot's movement between waypoints [2].
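
As a rough illustration of the entropy-based utility, the sketch below scores candidate camera headings at a waypoint by summing the Shannon entropy of the occupancy-grid cells inside the field of view. The grid, field of view, range and candidate headings are assumptions for the example; the actual utility in [1] and [2] additionally balances coverage of visual features as described above.

```python
import numpy as np

def cell_entropy(p):
    """Shannon entropy of an occupancy probability (maximal at p = 0.5, i.e. unknown)."""
    p = np.clip(p, 1e-6, 1 - 1e-6)
    return -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

def heading_utility(grid, robot_cell, heading, fov=np.deg2rad(90), max_range=20):
    """Sum the entropy of grid cells inside the FOV cone around 'heading' (all in cells)."""
    h, w = grid.shape
    ys, xs = np.mgrid[0:h, 0:w]
    dy, dx = ys - robot_cell[0], xs - robot_cell[1]
    dist = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx)
    # Wrap the angular difference to [-pi, pi] before comparing with half the FOV.
    in_fov = (dist <= max_range) & (np.abs(np.angle(np.exp(1j * (angle - heading)))) <= fov / 2)
    return cell_entropy(grid[in_fov]).sum()

# Example: left half of the map already explored, right half unknown; the best heading
# points toward the unexplored side.
grid = np.full((100, 100), 0.5)
grid[:, :50] = 0.05
best = max(np.linspace(0, 2 * np.pi, 8, endpoint=False),
           key=lambda th: heading_utility(grid, (50, 50), th))
print(best)   # ~0.0 rad, i.e. toward the unknown half
```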

 

Publications:

 

[1] Bonetto, E., Goldschmid, P., Pabst, M., Black, M. J., & Ahmad, A. (2022). iRotate: Active visual SLAM for omnidirectional robots. Robotics and Autonomous Systems. https://doi.org/10.1016/j.robot.2022.104102

[2] Bonetto, E., Goldschmid, P., Black, M. J., & Ahmad, A. (2021). Active Visual SLAM with Independently Rotating Camera. 2021 European Conference on Mobile Robots (ECMR), 1–8. https://doi.org/10.1109/ECMR50962.2021.9568791