Real Virtual Humans: The Path from Statistical Models to Neural Avatars that Act and Behave

Gerard Pons-Moll

University of Tübingen, DE

Abstract

The future of virtual humans lies in their ability to seamlessly interact with both real humans and 3D virtual worlds. Early models in the field focused on the fundamental task of human appearance modeling, using statistical models of 3D humans that learn body shape and pose from 3D scans. These foundational models paved the way for more advanced representations of human appearance, including clothing and finer details. However, humans are much more than their bodies. Recent breakthroughs have shifted the focus toward modeling human behavior. This includes systems that can capture human-object interactions from video data and synthesize human behavior within complex 3D environments. In the first part of this course, we will delve into the mathematical foundations of designing and learning statistical models. Students will explore advanced techniques for modeling human appearance, including clothing, based on neural implicit fields and Gaussian Splatting. Additionally, we will cover methods for extracting 3D information using diffusion-based image foundation models. The second part of the course will focus on the dynamics of human behavior. Students will learn how to capture and synthesize human interactions with objects and environments, and how to create text-driven avatars that reason and adapt within 3D worlds. The ultimate goal is to develop virtual humans that are indistinguishable from real ones: the Real Virtual Humans.