Item Details

  XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera

Mehta, D., Sotnychenko, O., Mueller, F., Xu, W., Elgharib, M., Fua, P., Seidel, H.-P., Rhodin, H., Pons-Moll, G., & Theobalt, C. (2019). XNect: Real-time Multi-person 3D Human Pose Estimation with a Single RGB Camera. Retrieved from http://arxiv.org/abs/1907.00837.


Basic Information

Item permalink: https://hdl.handle.net/21.11116/0000-0003-FE21-A
Version permalink: https://hdl.handle.net/21.11116/0000-0003-FE22-9
Resource type: Report

Files

File: arXiv:1907.00837.pdf (Preprint), 10MB
File permalink: https://hdl.handle.net/21.11116/0000-0003-FE23-8
File name: arXiv:1907.00837.pdf
Description: File downloaded from arXiv at 2019-07-09 10:40
OA-Status:
Visibility: Public
MIME type / checksum: application/pdf / [MD5]
Technical metadata:
Copyright date: -
Copyright info: -


Creators

Creators:
Mehta, Dushyant 1, Author
Sotnychenko, Oleksandr 1, Author
Mueller, Franziska 1, Author
Xu, Weipeng 1, Author
Elgharib, Mohamed 1, Author
Fua, Pascal 2, Author
Seidel, Hans-Peter 1, Author
Rhodin, Helge 2, Author
Pons-Moll, Gerard 3, Author
Theobalt, Christian 1, Author
Affiliations:
1 Computer Graphics, MPI for Informatics, Max Planck Society, ou_40047
2 External Organizations, ou_persistent22
3 Computer Vision and Machine Learning, MPI for Informatics, Max Planck Society, ou_1116547

Content

Keywords: Computer Science, Computer Vision and Pattern Recognition, cs.CV; Computer Science, Graphics, cs.GR
Abstract: We present a real-time approach for multi-person 3D motion capture at over 30 fps using a single RGB camera. It operates in generic scenes and is robust to difficult occlusions both by other people and objects. Our method operates in subsequent stages. The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow allowing for a drastically faster network without compromising accuracy. In the second stage, a fully-connected neural network turns the possibly partial (on account of occlusion) 2D pose and 3D pose features for each subject into a complete 3D pose estimate per individual. The third stage applies space-time skeletal model fitting to the predicted 2D and 3D pose per subject to further reconcile the 2D and 3D pose, and enforce temporal coherence. Our method returns the full skeletal pose in joint angles for each subject. This is a further key distinction from previous work that neither extracted global body positions nor joint angle results of a coherent skeleton in real time for multi-person scenes. The proposed system runs on consumer hardware at a previously unseen speed of more than 30 fps given 512x320 images as input while achieving state-of-the-art accuracy, which we will demonstrate on a range of challenging real-world scenes.
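
The abstract describes a three-stage pipeline (a per-frame CNN, a per-person fully connected lifting network, and space-time skeletal fitting). The following is a minimal sketch of that structure only: the module names (Stage1PoseCNN, Stage2LiftingMLP, stage3_temporal_smoothing), joint count, layer sizes, and the exponential smoothing used as a stand-in for stage III are assumptions made for illustration, not the released SelecSLS Net architecture or the paper's skeletal model-fitting optimizer.

```python
# Sketch of a three-stage pipeline in the spirit of XNect.
# All names, dimensions, and the stage-III smoothing are illustrative
# assumptions, not the paper's actual SelecSLS Net or fitting code.
import torch
import torch.nn as nn

NUM_JOINTS = 21  # assumed joint count for this sketch


class Stage1PoseCNN(nn.Module):
    """Stage I (placeholder): a small conv net standing in for SelecSLS Net.
    Produces 2D joint heatmaps and per-joint 3D pose feature maps."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.heatmap_head = nn.Conv2d(64, NUM_JOINTS, 1)        # 2D joint heatmaps
        self.feature3d_head = nn.Conv2d(64, 3 * NUM_JOINTS, 1)  # per-joint 3D features

    def forward(self, image):
        f = self.backbone(image)
        return self.heatmap_head(f), self.feature3d_head(f)


class Stage2LiftingMLP(nn.Module):
    """Stage II (placeholder): a fully connected net that turns possibly
    partial 2D/3D pose features of one person into a full 3D pose."""
    def __init__(self):
        super().__init__()
        in_dim = NUM_JOINTS * (2 + 3)  # 2D coords + 3D features per joint
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * NUM_JOINTS),
        )

    def forward(self, pose2d, feat3d):
        x = torch.cat([pose2d.flatten(1), feat3d.flatten(1)], dim=1)
        return self.net(x).view(-1, NUM_JOINTS, 3)


def stage3_temporal_smoothing(prev_pose, new_pose, alpha=0.8):
    """Stage III (placeholder): the paper fits a kinematic skeleton over space
    and time; exponential smoothing here merely stands in for enforcing
    temporal coherence."""
    if prev_pose is None:
        return new_pose
    return alpha * prev_pose + (1 - alpha) * new_pose


if __name__ == "__main__":
    cnn, lifter = Stage1PoseCNN(), Stage2LiftingMLP()
    frame = torch.randn(1, 3, 320, 512)      # 512x320 input, as in the paper
    heatmaps, feat3d_maps = cnn(frame)

    # Assume one detected person; stand-ins for reading 2D joint positions
    # from the heatmaps and sampling 3D features at those locations.
    pose2d = torch.randn(1, NUM_JOINTS, 2)
    feat3d = torch.randn(1, NUM_JOINTS, 3)
    pose3d = lifter(pose2d, feat3d)
    smoothed = stage3_temporal_smoothing(None, pose3d)
    print(heatmaps.shape, pose3d.shape, smoothed.shape)
```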

Details

Language: eng - English
Dates: 2019-07-01, 2019
Publication status: Published online
Pages: 18 p.
Publishing info: -
Table of contents: -
Peer review: -
Identifiers (DOI, ISBN, etc.): arXiv: 1907.00837
URI: http://arxiv.org/abs/1907.00837
BibTeX cite key: Mehta_arXiv1907.00837
Degree: -
