Pre-recorded Sessions: From 4 December 2020 | Live Sessions: 10 – 13 December 2020
4 – 13 December 2020
Pre-recorded Sessions: From 4 December 2020 | Live Sessions: 10 – 13 December 2020
4 – 13 December 2020
#SIGGRAPHAsia | #SIGGRAPHAsia2020
#SIGGRAPHAsia | #SIGGRAPHAsia2020
Date/Time:
04 – 13 December 2020
All presentations are available in the virtual platform on-demand.
Lecturer(s):
Jiayi Wang, Max-Planck-Institut für Informatik, Germany
Franziska Mueller, Max-Planck-Institut für Informatik, Germany
Florian Bernard, Max-Planck-Institut für Informatik, Technische Universität München (TUM), Germany
Suzanne Sorli, Universidad Rey Juan Carlos, Spain
Oleksandr Sotnychenko, Max-Planck-Institut für Informatik, Germany
Neng Qian, Max-Planck-Institut für Informatik, Germany
Miguel A. Otaduy, Universidad Rey Juan Carlos, Spain
Dan Casas, Universidad Rey Juan Carlos, Spain
Christian Theobalt, Max-Planck-Institut für Informatik, Germany
Bio:
Description: Tracking and reconstructing the 3D pose and geometry of two hands in interaction is a challenging problem that has a high relevance for several human-computer interaction applications, including AR/VR, robotics, or sign language recognition. Existing works are either limited to simpler tracking settings (e.g., considering only a single hand or two spatially separated hands), or rely on less ubiquitous sensors, such as depth cameras. In contrast, in this work we present the first real-time method for motion capture of skeletal pose and 3D surface geometry of hands from a single RGB camera that explicitly considers close interactions. In order to address the inherent depth ambiguities in RGB data, we propose a novel multi-task CNN that regresses multiple complementary pieces of information, including segmentation, dense matchings to a 3D hand model, and 2D keypoint positions, together with newly proposed intra-hand relative depth and inter-hand distance maps. These predictions are subsequently used in a generative model fitting framework in order to estimate pose and shape parameters of a 3D hand model for both hands. We experimentally verify the individual components of our RGB two-hand tracking and 3D reconstruction pipeline through an extensive ablation study. Moreover, we demonstrate that our approach offers previously unseen two-hand tracking performance from RGB, and quantitatively and qualitatively outperforms existing RGB-based methods that were not explicitly designed for two-hand interactions. Moreover, our method even performs on-par with depth-based real-time methods.