We present WorldPose, a novel dataset for advancing research in multi-person global pose estimation in the wild, featuring footage from the 2022 FIFA World Cup. Previous datasets have primarily focused on local poses and are often limited to a single person or to constrained indoor settings. In contrast, the infrastructure deployed for this sporting event gives access to multiple fixed and moving cameras in different stadiums. We exploit the static multi-view setup of HD cameras to recover the 3D player poses and motions with unprecedented accuracy over capture areas of more than 1.75 acres. We then leverage the captured players’ motions and the field markings to calibrate a moving broadcast camera. The resulting dataset comprises 88 sequences with more than 2.5 million 3D poses and a total distance traveled of over 120 km. We then conduct an in-depth analysis of state-of-the-art (SOTA) methods for global pose estimation. Our experiments demonstrate that WorldPose challenges existing multi-person techniques, opening up new research in this area and in others, such as sports analysis. All pose annotations (in SMPL format), broadcast camera parameters, and footage will be released for academic research purposes.
Please fill out the data request form to access the dataset (which includes 3D poses and camera parameters). The footage is owned by FIFA and is subject to an additional agreement. Instructions for accessing the videos will be sent once the applications have been reviewed.
@inproceedings{jiang2024worldpose,
author = {Jiang, Tianjian and Billingham, Johsan and Müksch, Sebastian and Zarate, Juan and Evans, Nicolas and Oswald, Martin and Pollefeys, Marc and Hilliges, Otmar and Kaufmann, Manuel and Song, Jie},
title = {WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2024},
}