Human motion capture (HMC) from multiview image
sequences constitutes an extremely difficult problem due to depth and
orientation ambiguities and the high dimensionality of the state space.
In this paper, we introduce
a novel hybrid HMC system that combines video
input with sparse inertial sensor input. Within an annealing
particle-based optimization scheme, our idea is to
use orientation cues derived from the inertial input to sample
particles from the manifold of valid poses.
Then, visual cues derived from the video input are used
to weight these particles and to iteratively derive the final
pose. As our main contribution, we propose an efficient sampling
procedure where hypotheses are derived analytically
using state decomposition and inverse kinematics on the orientation cues.
Additionally, we
introduce a novel sensor noise model, based on the von Mises-Fisher
distribution, to account for uncertainties in the orientation measurements.
In this way, orientation constraints are naturally fulfilled and
the number of required particles can be kept very small.
More generally, our method can be used to sample poses that fulfill arbitrary orientation or positional kinematic constraints.
In the experiments, we show that our system can track even
highly dynamic motions in an outdoor setting with changing
illumination, background clutter, and shadows.
Images, videos, data and code will be made available.
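To make the sensor noise model above more concrete, the following Python sketch draws perturbed direction vectors from a von Mises-Fisher distribution around a measured orientation axis; in the full system such perturbed cues would then be turned into pose hypotheses via inverse kinematics and weighted with the visual likelihood. This is only a minimal sketch under our own assumptions (a single 3D direction on the unit sphere, a hand-picked concentration kappa, and the standard closed-form sampler for the sphere); the exact parameterization used in the paper may differ.

    import numpy as np

    def sample_vmf_s2(mu, kappa, rng):
        """Draw one unit vector on the sphere S^2 from a von Mises-Fisher
        distribution with mean direction mu and concentration kappa."""
        mu = np.asarray(mu, dtype=float)
        mu = mu / np.linalg.norm(mu)
        # Closed-form inverse-CDF sample of w = cos(angle to mu).
        u = rng.uniform()
        w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
        # Uniform azimuth in the tangent plane around the z-axis.
        phi = rng.uniform(0.0, 2.0 * np.pi)
        s = np.sqrt(max(0.0, 1.0 - w * w))
        v = np.array([s * np.cos(phi), s * np.sin(phi), w])
        # Rotate the sample so that the z-axis maps onto mu (Rodrigues' formula).
        z = np.array([0.0, 0.0, 1.0])
        c = float(np.dot(z, mu))
        if c > 1.0 - 1e-10:           # mu is (almost) the z-axis
            return v
        if c < -1.0 + 1e-10:          # mu is (almost) the negative z-axis
            return np.array([v[0], -v[1], -v[2]])
        axis = np.cross(z, mu)
        axis = axis / np.linalg.norm(axis)
        angle = np.arccos(np.clip(c, -1.0, 1.0))
        K = np.array([[0.0, -axis[2], axis[1]],
                      [axis[2], 0.0, -axis[0]],
                      [-axis[1], axis[0], 0.0]])
        R = np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
        return R @ v

    # Example: 200 noisy hypotheses around one measured sensor axis.
    rng = np.random.default_rng(0)
    measured_axis = np.array([0.0, 1.0, 0.0])
    samples = [sample_vmf_s2(measured_axis, kappa=100.0, rng=rng) for _ in range(200)]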
NOTE: This page is under construction and some links might not work yet.
Video (with voice-over)
Downloads
Data
The challenging outdoor sequences used in the paper can be downloaded directly from here. The dataset consists of 8 sequences of 4 activities (walking, karate, soccer and basketball) performed by two different subjects.
Additionally, we include background images to perform background subtraction.
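As one very simple way to use those background images, a per-pixel foreground mask can be obtained by thresholding the difference to the background. The sketch below is a generic, hypothetical Python example (the file names and the threshold are our own placeholders, not part of the dataset) and assumes color images; the silhouette extraction used in the paper may be more elaborate.

    import numpy as np
    from PIL import Image

    def foreground_mask(frame_path, background_path, threshold=30.0):
        """Boolean mask of pixels that differ noticeably from the static
        background image (plain per-pixel background subtraction)."""
        frame = np.asarray(Image.open(frame_path), dtype=np.float32)
        background = np.asarray(Image.open(background_path), dtype=np.float32)
        # Absolute difference, averaged over the color channels.
        diff = np.abs(frame - background).mean(axis=-1)
        return diff > threshold

    # Hypothetical file names; adjust to the actual dataset layout.
    mask = foreground_mask("cam0_frame0001.png", "cam0_background.png")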
This MATLAB demo script shows how to read and use the data in HMODB. It loads a mesh and a sequence, plots the 3D model with the sensor orientations for the first frame, and shows the projection of the model into the first image.
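For readers who do not use MATLAB, the projection performed by the demo script boils down to standard pinhole-camera geometry. The Python sketch below is a hypothetical re-implementation of only that step, assuming the calibration is available as an intrinsic matrix K and extrinsics R, t; the demo script's actual file formats and variable names may differ.

    import numpy as np

    def project_points(vertices, K, R, t):
        """Project Nx3 world-space mesh vertices into pixel coordinates
        with a pinhole camera (intrinsics K, extrinsics R, t)."""
        cam = vertices @ R.T + t          # world -> camera coordinates
        uvw = cam @ K.T                   # camera -> homogeneous pixels
        return uvw[:, :2] / uvw[:, 2:3]   # perspective divide -> (u, v)

    # Toy example with made-up calibration values.
    K = np.array([[800.0, 0.0, 320.0],
                  [0.0, 800.0, 240.0],
                  [0.0, 0.0, 1.0]])
    R, t = np.eye(3), np.array([0.0, 0.0, 3.0])
    vertices = np.array([[0.0, 0.0, 0.0], [0.1, 0.2, 0.5]])
    print(project_points(vertices, K, R, t))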