Video-based Motion Capture

TNT members involved in this project:
Prof. Dr.-Ing. Bodo Rosenhahn
Tom Wehrbein, M.Sc.

Motion capture is the process of analyzing the movements of objects or humans from video data. Potential application fields are animation for 3D movie production, sports science, and medical applications. Instead of using artificial markers attached to the body and expensive lab equipment, we are interested in tracking humans from video streams without special preparation of the subject. This is even more challenging for outdoor scenes, clothed people, and interactions between people.

The main goal is to reconstruct the three-dimensional pose of a person from image data only. This task can be split into multiple subtasks, e.g. people detection/tracking, 3D reconstruction, human model building, and animation. Our research in this field focuses on individual subtasks or on their combination.

RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation

Computer Vision and Pattern Recognition (CVPR)

Bastian Wandt and Bodo Rosenhahn

Abstract: This paper addresses the problem of 3D human pose estimation from single images. While for a long time human skeletons were parameterized and fitted to the observation by satisfying a reprojection error, nowadays researchers directly use neural networks to infer the 3D pose from the observations. However, most of these approaches ignore the fact that a reprojection constraint has to be satisfied and are sensitive to overfitting. We tackle the overfitting problem by ignoring 2D to 3D correspondences. This efficiently avoids a simple memorization of the training data and allows for a weakly supervised training. One part of the proposed reprojection network (RepNet) learns a mapping from a distribution of 2D poses to a distribution of 3D poses using an adversarial training approach. Another part of the network estimates the camera. This allows for the definition of a network layer that performs the reprojection of the estimated 3D pose back to 2D which results in a reprojection loss function. Our experiments show that RepNet generalizes well to unknown data and outperforms state-of-the-art methods when applied to unseen data. Moreover, our implementation runs in real-time on a standard desktop PC.
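The reprojection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a weak-perspective camera given as a 2 x 3 matrix, a pose stored as a 3 x J array of joints, and a Frobenius-norm loss. The function names and the 17-joint toy pose are hypothetical.

```python
import numpy as np


def reproject(pose_3d: np.ndarray, camera: np.ndarray) -> np.ndarray:
    """Project a 3D pose (3 x J) to 2D (2 x J) with a 2 x 3 camera matrix.

    Assumption: weak-perspective projection, i.e. a single matrix product.
    """
    return camera @ pose_3d


def reprojection_loss(pose_2d: np.ndarray, pose_3d: np.ndarray,
                      camera: np.ndarray) -> float:
    """Frobenius norm between observed 2D joints and the reprojected 3D pose."""
    return float(np.linalg.norm(pose_2d - reproject(pose_3d, camera)))


# Toy data: a random 17-joint pose seen by an orthographic camera.
rng = np.random.default_rng(0)
pose_3d = rng.normal(size=(3, 17))
camera = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
pose_2d = reproject(pose_3d, camera)

# A consistent 3D pose gives zero loss; a shifted one is penalized.
loss_exact = reprojection_loss(pose_2d, pose_3d, camera)
loss_perturbed = reprojection_loss(pose_2d, pose_3d + 0.1, camera)
```

In a network, both `pose_3d` and `camera` would be predicted outputs, so a loss of this kind ties the two branches to the observed 2D pose without requiring 3D ground truth for that image.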


3D Reconstruction of Human Motion from Monocular Image Sequences

IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Bastian Wandt, Hanno Ackermann, and Bodo Rosenhahn

Abstract: This article tackles the problem of estimating non-rigid human 3D shape and motion from image sequences taken by uncalibrated cameras. Similar to other state-of-the-art solutions we factorize 2D observations in camera parameters, base poses and mixing coefficients. Existing methods require sufficient camera motion during the sequence to achieve a correct 3D reconstruction. To obtain convincing 3D reconstructions from arbitrary camera motion, our method is based on a-priorly trained base poses. We show that strong periodic assumptions on the coefficients can be used to define an efficient and accurate algorithm for estimating periodic motion such as walking patterns. For the extension to non-periodic motion we propose a novel regularization term based on temporal bone length constancy. In contrast to other works, the proposed method does not use a predefined skeleton or anthropometric constraints and can handle arbitrary camera motion. We achieve convincing 3D reconstructions, even under the influence of noise and occlusions. Multiple experiments based on a 3D error metric demonstrate the stability of the proposed method. Compared to other state-of-the-art methods our algorithm shows a significant improvement.
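To make the idea of temporal bone length constancy concrete, the following sketch penalizes the variance of each bone's length over a sequence. This is one plausible formulation chosen for illustration, not the regularization term from the article. The joint connectivity in `bones` is a hypothetical input that fixes only which joints are linked, not how long the bones are.

```python
import numpy as np


def bone_lengths(poses: np.ndarray, bones: list[tuple[int, int]]) -> np.ndarray:
    """Return bone lengths of shape (T, B) for poses of shape (T, J, 3)."""
    parents = [p for p, _ in bones]
    children = [c for _, c in bones]
    return np.linalg.norm(poses[:, children] - poses[:, parents], axis=-1)


def bone_length_constancy(poses: np.ndarray, bones: list[tuple[int, int]]) -> float:
    """Sum over bones of the temporal variance of the bone length.

    Assumption: variance over frames is used as the penalty; it is zero
    exactly when every bone keeps the same length in all frames.
    """
    lengths = bone_lengths(poses, bones)
    return float(np.sum(np.var(lengths, axis=0)))


# Toy chain with three joints and two bones (lengths 1 and 2).
base = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [1.0, 2.0, 0.0]])
bones = [(0, 1), (1, 2)]
frames = np.arange(4)

# Rigid translation keeps all bone lengths constant.
rigid = base[None] + frames[:, None, None] * np.array([0.5, 0.0, 0.0])
# Scaling each frame differently changes the bone lengths over time.
stretched = base[None] * (1.0 + 0.1 * frames)[:, None, None]

rigid_penalty = bone_length_constancy(rigid, bones)
stretched_penalty = bone_length_constancy(stretched, bones)
```

The penalty needs no reference lengths: it only asks that whatever length a bone has stays the same from frame to frame.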


Metric Regression Forests for Human Pose Estimation

British Machine Vision Conference (BMVC)

Gerard Pons-Moll, Jonathan Taylor, Jamie Shotton, Aaron Hertzmann, and Andrew Fitzgibbon

Abstract: We present a new method for inferring dense data to model correspondences, focusing on the application of human pose estimation from depth images. Recent work proposed the use of regression forests to quickly predict correspondences between depth pixels and points on a 3D human mesh model. That work, however, used a proxy forest training objective based on the classification of depth pixels to body parts. In contrast, we introduce Metric Space Information Gain (MSIG), a new decision forest training objective designed to directly optimize the entropy of distributions in a metric space. When applied to a model surface, viewed as a metric space defined by geodesic distances, MSIG aims to minimize image-to-model correspondence uncertainty. A naïve implementation of MSIG would scale quadratically with the number of training examples. As this is intractable for large datasets, we propose a method to compute MSIG in linear time. Our method is a principled generalization of the proxy classification objective, and does not require an extrinsic isometric embedding of the model surface in Euclidean space. Our experiments demonstrate that this leads to correspondences that are considerably more accurate than state of the art, using far fewer training images.


  • Conference Contributions
    • Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt
      Utilizing Uncertainty in 2D Pose Detectors for Probabilistic 3D Human Mesh Recovery
      IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), IEEE, February 2025
    • Tom Wehrbein, Bodo Rosenhahn, Iain Matthews, Carsten Stoll
      Personalized 3D Human Pose and Shape Refinement
      International Conference on Computer Vision Workshops (ICCVW), IEEE, Paris, France, October 2023
    • Tom Wehrbein, Marco Rudolph, Bodo Rosenhahn, Bastian Wandt
      Probabilistic Monocular 3D Human Pose Estimation with Normalizing Flows
      International Conference on Computer Vision (ICCV), IEEE, October 2021
    • Bastian Wandt, Bodo Rosenhahn
      RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation
      Computer Vision and Pattern Recognition (CVPR), IEEE, June 2019
    • Bastian Wandt, Hanno Ackermann, Bodo Rosenhahn
      A Kinematic Chain Space for Monocular Motion Capture
      ECCV Workshops, September 2018
    • Thiemo Alldieck, Marc Kassubeck, Bastian Wandt, Bodo Rosenhahn, Marcus Magnor
      Optical Flow-based 3D Human Motion Estimation from Monocular Video
      German Conference on Pattern Recognition (GCPR), September 2017
    • Petrissa Zell, Bastian Wandt, Bodo Rosenhahn
      Joint 3D Human Motion Capture and Physical Analysis from Monocular Videos
      The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, July 2017
    • Bastian Wandt, Hanno Ackermann, Bodo Rosenhahn
      3D Human Motion Capture from Monocular Image Sequences
      IEEE Conference on Computer Vision and Pattern Recognition Workshops, IEEE, June 2015
    • Gerard Pons-Moll+, Jonathan Taylor+, Jamie Shotton, Aaron Hertzmann, Andrew Fitzgibbon
      Metric Regression Forests for Human Pose Estimation
      British Machine Vision Conference (BMVC) (+ denotes equal contribution)
      Best Science Paper Award, September 2013
    • Gerard Pons-Moll, Andreas Baak, Juergen Gall, Laura Leal-Taixe, Meinard Müller, Hans-Peter Seidel, Bodo Rosenhahn
      Outdoor Human Motion Capture using Inverse Kinematics and von Mises-Fisher Sampling
      IEEE International Conference on Computer Vision (ICCV), November 2011
    • Andreas Baak, Thomas Helten, Meinard Müller, Gerard Pons-Moll, Bodo Rosenhahn, Hans-Peter Seidel
      Analyzing and Evaluating Markerless Motion Tracking Using Inertial Sensors
      European Conference on Computer Vision (ECCV Workshops), September 2010
    • Gerard Pons-Moll, Andreas Baak, Thomas Helten, Meinard Müller, Hans-Peter Seidel, Bodo Rosenhahn
      Multisensor-Fusion for 3D Full-Body Human Motion Capture
      IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2010
    • Nils Hasler, Thorsten Thormählen, Bodo Rosenhahn, Hans-Peter Seidel
      Learning Skeletons for Shape and Pose
      ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games, Washington, February 2010
    • Gerard Pons-Moll, Bodo Rosenhahn
      Ball Joints for Marker-less Human Motion Capture
      IEEE Workshop on Applications of Computer Vision (WACV), Snow Bird, Utah, USA, December 2009
    • Nils Hasler, Bodo Rosenhahn, Thorsten Thormählen, Michael Wand, Hans-Peter Seidel
      Markerless Motion Capture with Unsynchronized Moving Cameras
      IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, USA, 2009
    • Nils Hasler, Carsten Stoll, Bodo Rosenhahn, Thorsten Thormählen, H.-P. Seidel
      Estimating Body Shape of Dressed Humans
      Shape Modeling International, Beijing, 2009
    • B. Rosenhahn, C. Schmaltz, T. Brox, J. Weickert, D. Cremers, H.-P. Seidel
      Markerless Motion Capture of Man-Machine Interaction
      IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, Alaska, 2008
    • J. Gall, B. Rosenhahn, H.-P. Seidel
      Drift-free Tracking of Rigid and Articulated Objects
      IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, Alaska, 2008
    • B. Rosenhahn, T. Brox, H.-P. Seidel
      Scaled Motion Dynamics for Markerless Motion Capture
      IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, Minnesota, USA, 2007
    • T. Brox, B. Rosenhahn, D. Cremers, H.-P. Seidel
      Nonparametric Density Estimation with Adaptive Anisotropic Kernels for Human Motion Tracking
      2nd Workshop on Human Motion, Springer-Verlag, Berlin Heidelberg, pp. 152-165, 2007, edited by Elgammal, A.; Rosenhahn, B.; Klette, R.
    • T. Brox, B. Rosenhahn, U. Kersting, D. Cremers
      Nonparametric Density Estimation for Human Tracking
      Pattern Recognition 2006, DAGM, Springer-Verlag, Berlin Heidelberg, pp. 546-555, Berlin, 2006, edited by Franke, K.; Mueller, R.; Nickolay, B.; Schaefer, R.
  • Journals
    • Timo von Marcard, Gerard Pons-Moll, Bodo Rosenhahn
      Human Pose Estimation from Video and IMUs
      Transactions on Pattern Analysis and Machine Intelligence, IEEE, Vol. 38, No. 8, pp. 1533-1547, January 2016
    • Bastian Wandt, Hanno Ackermann, Bodo Rosenhahn
      3D Reconstruction of Human Motion from Monocular Image Sequences
      Transactions on Pattern Analysis and Machine Intelligence, IEEE, Vol. 38, No. 8, pp. 1505-1516, 2016
    • Nils Hasler, Carsten Stoll, Martin Sunkel, Bodo Rosenhahn, Hans-Peter Seidel
      A Statistical Model of Human Pose and Body Shape
      Computer Graphics Forum (Proc. Eurographics 2009), Munich, Germany, 2009
    • B. Rosenhahn, U. Kersting, K. Powell, R. Klette, G. Klette, H.-P. Seidel
      A system for articulated tracking incorporating a clothing model
      Machine Vision and Applications, Springer Verlag, Berlin-Heidelberg, Vol. 18, No. 1, pp. 25-40, February 2007
  • Book Chapters
    • B. Rosenhahn, Uwe G. Kersting, K. Powell, T. Brox, Hans-Peter Seidel
      Tracking Clothed People
      Human Motion - Understanding, Modelling, Capture and Animation, Springer Verlag, Dordrecht, The Netherlands, Vol. 36, pp. 295-317, 2007, edited by Rosenhahn B.; Klette R.; Metaxas D.