Manuel Kaufmann

I am a computer scientist with a background in human-centric 3D Computer Vision. I previously worked as a tech lead in the executive team of the ETH AI Center and as a postdoc at the AIT Lab at ETH Zurich, Department of Computer Science. I obtained my PhD in the AIT Lab under the supervision of Prof. Otmar Hilliges, co-advised by Prof. Markus Gross. I completed my MSc at ETH Zurich in the AIT Lab and earned my BSc in Computer Science from the University of Basel under the supervision of Prof. Thomas Vetter.

I taught Machine Perception at ETH Zurich. All lecture materials, including video recordings from 2025, are publicly available.

Check out aitviewer, an open-source interactive viewer for sequences of 3D data that we created at the AIT lab.

Scholar | LinkedIn | GitHub

Research

I am interested in multimodal vision-language models, human-centric multimodal data acquisition pipelines, 3D human pose estimation from wearable sensors (e.g., IMUs and electromagnetic sensors) and from single- or multi-view images, 3D reconstruction of human avatars from images and video, and human motion modelling. For a selection of publications, see below.

	Learning Vision-Language Alignment in Unified LLMs with 24 Text Tokens per Image Nicola Irmiger, Yixuan Xu, Raphael Kreft, Aram Davtyan, Manuel Kaufmann, Imanol Schlag IWSDS, 2026 Adapting a pre-trained LLM for flexible multimodal understanding by compressing images into 24 discrete tokens and training with a two-stage next-token prediction approach. paper
	RHINO: Reconstructing Human Interactions with Novel Objects from Monocular Videos Lixin Xue, Chengwei Zheng, Georgios Paschalidis, Chen Guo, Manuel Kaufmann, Juan Zarate, Dimitrios Tzionas CVPR, 2026 A framework that reconstructs a human, an unseen object, and the static scene in a common world frame from monocular RGB video, without requiring known 3D shapes or cameras. project page | arXiv
	Gaussian Wardrobe: Compositional 3D Gaussian Avatars for Free-Form Virtual Try-on Zhiyi Chen, Hsuan-I Ho, Tianjian Jiang, Jie Song, Manuel Kaufmann, Chen Guo 3DV, 2026 A compositional 3D Gaussian avatar representation that enables free-form virtual try-on of clothing from multiple sources. project page | arXiv | video | code
	PriorAvatar: Efficient and Robust Avatar Creation from Monocular Video Using Learned Priors Tianjian Jiang, Hsuan-I Ho, Manuel Kaufmann, Jie Song SIGGRAPH Asia (TOG), 2025 Robust and efficient monocular avatar reconstruction using a 3D human prior (a multi-person feature codebook of shapes and appearances) learned from scans to guide 3D Gaussian fitting. paper
	PHD: Personalized 3D Human Body Fitting with Point Diffusion Hsuan-I Ho, Chen Guo, Po-Chen Wu, Ivan Shugurov, Chengcheng Tang, Abhay Mittal, Sizhe An, Manuel Kaufmann, Linguang Zhang ICCV, 2025 Personalized 3D human body fitting using point diffusion for improved accuracy in body pose and shape reconstruction. project page | arXiv | video
	ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos Zetong Zhang, Manuel Kaufmann, Lixin Xue, Jie Song, Martin Oswald CVPR, 2025 An online Gaussian Splatting-based system for dense 3D reconstruction of both humans and their surrounding scenes from monocular video input. project page | video
	EgoHDM: An Online Egocentric-Inertial Human Motion Capture, Localization, and Dense Mapping System Bonan Liu, Handi Yin, Manuel Kaufmann, Jinhao He, Sammy Christen, Jie Song, Pan Hui SIGGRAPH Asia (TOG), 2024 An egocentric-inertial system that jointly performs human motion capture, localization, and dense scene mapping online. project page | arXiv | video | code
	ReLoo: Reconstructing Humans Dressed in Loose Garments from Monocular Video in the Wild Chen Guo, Tianjian Jiang, Manuel Kaufmann, Chengwei Zheng, Julien Valentin, Jie Song, Otmar Hilliges ECCV, 2024 Reconstructing humans wearing loose-fitting garments from in-the-wild monocular videos, handling challenging clothing dynamics. project page | arXiv | video | code
	WorldPose: A World Cup Dataset for Global 3D Human Pose Estimation Tianjian Jiang, Johsan Billingham, Sebastian Müksch, Juan Zarate, Nicolas Evans, Martin Oswald, Marc Pollefeys, Otmar Hilliges, Manuel Kaufmann, Jie Song ECCV, 2024 A large-scale dataset captured from 2022 FIFA World Cup broadcasts for benchmarking global 3D human pose estimation in sports scenarios. project page | paper | video | dataset
	HSR: Holistic 3D Human-Scene Reconstruction from Monocular Videos Lixin Xue, Chen Guo, Chengwei Zheng, Fangjinhua Wang, Tianjian Jiang, Hsuan-I Ho, Manuel Kaufmann, Jie Song, Otmar Hilliges ECCV, 2024 A holistic approach to jointly reconstructing humans and their surrounding 3D scene from a single monocular video. project page | paper | video | code
	MultiPly: Reconstruction of Multiple People from Monocular Video in the Wild Zeren Jiang, Chen Guo, Manuel Kaufmann, Tianjian Jiang, Julien Valentin, Otmar Hilliges, Jie Song CVPR, 2024 (Oral Presentation, Top 3.3%) Reconstructing multiple interacting people from a single monocular in-the-wild video, handling occlusions and close contact. project page | arXiv | video | dataset | code
	EMDB: The Electromagnetic Database of Global 3D Human Pose and Shape in the Wild Manuel Kaufmann, Jie Song, Chen Guo, Kaiyue Shen, Tianjian Jiang, Chengcheng Tang, Juan Zarate, Otmar Hilliges ICCV, 2023 A new dataset of global 3D human pose and shape in-the-wild, captured using electromagnetic sensors for accurate ground truth. project page | paper | video | dataset | code
	ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation Zicong Fan, Omid Taheri, Dimitrios Tzionas, Muhammed Kocabas, Manuel Kaufmann, Michael Black, Otmar Hilliges CVPR, 2023 A large-scale dataset capturing dexterous bimanual manipulation of articulated objects, useful for hand-object interaction research. project page | paper | video | code
	X-Avatar: Expressive Human Avatars Kaiyue Shen, Chen Guo, Manuel Kaufmann, Juan Zarate, Julien Valentin, Jie Song, Otmar Hilliges CVPR, 2023 Expressive human avatars that capture body pose, facial expressions, and hand gestures in a unified model. project page | arXiv | video | code
	Hi4D: 4D Instance Segmentation of Close Human Interaction Yifei Yin, Chen Guo, Manuel Kaufmann, Juan Zarate, Jie Song, Otmar Hilliges CVPR, 2023 A dataset and method for 4D instance segmentation of two people engaged in close physical interaction. project page | arXiv | video | dataset | code
	A Spatio-temporal Transformer for 3D Human Motion Prediction Emre Aksan, Manuel Kaufmann, Peng Cao, Otmar Hilliges 3DV, 2021 A spatio-temporal transformer architecture that jointly models spatial and temporal dependencies for 3D human motion prediction. project page | paper | video | code
	EM-POSE: 3D Human Pose Estimation from Sparse Electromagnetic Trackers Manuel Kaufmann, Yi Zhao, Chengcheng Tang, Lingling Tao, Christopher Twigg, Jie Song, Robert Wang, Otmar Hilliges ICCV, 2021 3D human pose estimation from sparse wireless electromagnetic trackers. project page | paper | video | dataset | code
	Convolutional Autoencoders for Human Motion Infilling Manuel Kaufmann, Emre Aksan, Jie Song, Fabrizio Pece, Remo Ziegler, Otmar Hilliges 3DV, 2020 Convolutional autoencoders for filling in missing frames in human motion sequences, enabling cleanup of noisy or incomplete motion capture data. project page | arXiv | video | code
	Structured Prediction Helps 3D Human Motion Modelling Emre Aksan, Manuel Kaufmann, Otmar Hilliges ICCV, 2019 Incorporating the skeletal structure of the human body into the output layer improves 3D human motion modelling. project page | paper | video | code
	Deep Inertial Poser: Learning to Reconstruct Human Pose from Sparse Inertial Measurements in Real Time Yinghao Huang, Manuel Kaufmann, Emre Aksan, Michael Black, Otmar Hilliges, Gerard Pons-Moll ACM Transactions on Graphics (Proc. SIGGRAPH Asia), 2018 Real-time 3D human pose reconstruction from only six sparse IMU sensors, learning-based for the first time. project page | paper | video | dataset | code

Academic Service

Reviewing

2026	CVPR, ECCV
2025	CVPR, ICCV, SIGGRAPH
2024	CVPR, ECCV, SIGGRAPH, SIGGRAPH Asia
2023	CVPR, ICCV, 3DV
2022	CVPR, ECCV, 3DV, SIGGRAPH Asia, ACM UIST
2021	ICCV, SIGGRAPH, SIGGRAPH Asia
2020	ACM UIST, CVPR, ECCV, Eurographics, WACV
2019	Biocybernetics and Biomedical Engineering

Area Chair

2025	3DV

Outstanding Reviewer Awards

2025	CVPR
2022	ECCV

Design inspired by Jon Barron's website. Thanks for providing the source code.