Neural Capture & Synthesis
Imitator: Personalized Speech-driven 3D Facial Animation
We present Imitator, a speech-driven facial expression synthesis method, which learns identity-specific details from a short input video and produces novel facial expressions matching the identity-specific speaking style and facial idiosyncrasies of the target actor. Specifically, we train a style-agnostic transformer on a large facial expression dataset which we use as a prior for audio-driven facial expressions. Based on this prior, we optimize for identity-specific speaking style based on a short reference video.
2023, Aug 14 — 1 minute read [Paper] [Video] [Bibtex]

CaPhy: Capturing Physical Properties for Animatable Human Avatars
We present CaPhy, a novel method for reconstructing animatable human avatars with realistic dynamic properties for clothing. Specifically, we aim to capture the geometric and physical properties of the clothing from real observations. This allows us to apply novel poses to the human avatar with physically correct deformations and wrinkles of the clothing.
2023, Aug 14 — 1 minute read [Paper] [Video] [Bibtex]

TADA: Text to Animatable Digital Avatars
We introduce TADA, a simple yet effective approach that takes textual descriptions and produces expressive 3D avatars with high-quality geometry and lifelike textures that can be animated and rendered with traditional graphics pipelines.
2023, Aug 13 — 1 minute read [Paper] [Video] [Bibtex]

TeCH: Text-guided Reconstruction of Lifelike Clothed Humans
TeCH reconstructs the 3D human by leveraging 1) descriptive text prompts (e.g., garments, colors, hairstyles), which are automatically generated via a garment parsing model and Visual Question Answering (VQA), and 2) a personalized fine-tuned Text-to-Image diffusion model (T2I), which learns the "indescribable" appearance.
2023, Aug 13 — 1 minute read [Paper] [Video] [Bibtex]

High-Res Facial Appearance Capture from Polarized Smartphone Images
We propose a novel method for high-quality facial texture reconstruction from RGB images using a new capturing routine based on a single smartphone, which we equip with an inexpensive polarization foil.
2023, Mar 30 — 1 minute read [Paper] [Video] [Bibtex]

ClipFace: Text-guided Editing of Textured 3D Morphable Models
ClipFace is a novel self-supervised approach for text-guided editing of textured 3D morphable models of faces. Controllable editing and manipulation are driven by language prompts that adapt the texture and expression of the 3D morphable model.
2023, Mar 29 — 1 minute read [Paper] [Video] [Bibtex]

MIME: Human-Aware 3D Scene Generation
Humans constantly interact with their environment. They walk through a room, touch objects, rest on a chair, or sleep in a bed. All these interactions contain information about the scene layout and object placement which we leverage to generate scenes from human motion.
2023, Mar 29 — 1 minute read [Paper] [Video] [Bibtex]INSTA: Instant Volumetric Head Avatars
Instead of relying on prerecorded avatars, we aim to instantaneously reconstruct the subject's actual appearance during a meeting. To this end, we propose INSTA, which enables the reconstruction of an avatar within a few minutes (~10 min) and can be driven at interactive frame rates.
2023, Mar 28 — 1 minute read [Paper] [Video] [Bibtex]

DINER: Depth-aware Image-based NEural Radiance Fields
Given a sparse set of RGB input views, we predict depth and feature maps to guide the reconstruction of a volumetric scene representation that allows us to render 3D objects under novel views.
2023, Mar 28 — 1 minute read [Paper] [Video] [Bibtex]

Neural Shape Deformation Priors
We present Neural Shape Deformation Priors, a novel method for shape manipulation that predicts mesh deformations of non-rigid objects from user-provided handle movements.
2022, Dec 01 — 1 minute read [Paper] [Video] [Bibtex]

MICA: Towards Metrical Reconstruction of Human Faces
Face reconstruction and tracking is a building block of numerous applications in AR/VR, human-machine interaction, and medicine. Most of these applications rely on a metrically correct prediction of the shape, especially when the reconstructed subject is put into a metrical context. Thus, we present MICA, a novel metrical face reconstruction method that combines face recognition with supervised face shape learning.
2022, Jul 04 — 2 minute read [Paper] [Video] [Bibtex]

Texturify: Generating Textures on 3D Shape Surfaces
Texturify learns to generate geometry-aware textures for untextured collections of 3D objects. Our method trains from only a collection of images and a collection of untextured shapes, which are both often available, without requiring any explicit 3D color supervision or shape-image correspondence. Textures are created directly on the surface of a given 3D shape, enabling generation of high-quality, compelling textured 3D shapes.
2022, Jul 04 — 1 minute read [Paper] [Video] [Bibtex]

Neural Head Avatars from Monocular RGB Videos
We present Neural Head Avatars, a novel neural representation that explicitly models the surface geometry and appearance of an animatable human avatar using a deep neural network. Specifically, we propose a hybrid representation consisting of a morphable model for the coarse shape and expressions of the face, and two feed-forward networks, predicting vertex offsets of the underlying mesh as well as a view- and expression-dependent texture.
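As a rough illustration of this hybrid design, the sketch below pairs a coarse morphable-model mesh with two small feed-forward networks, one for vertex offsets and one for view- and expression-dependent color. All dimensions, layer sizes, and names are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class HybridHeadAvatar(nn.Module):
    """Minimal sketch: morphable-model mesh + offset and texture networks."""

    def __init__(self, n_verts=5023, expr_dim=100):
        super().__init__()
        # Predicts per-vertex offsets that refine the coarse mesh.
        self.offset_net = nn.Sequential(
            nn.Linear(expr_dim, 256), nn.ReLU(),
            nn.Linear(256, n_verts * 3))
        # Predicts a view- and expression-dependent color.
        self.texture_net = nn.Sequential(
            nn.Linear(expr_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 3))

    def forward(self, coarse_verts, expr, view_dir):
        # coarse_verts: (V, 3) mesh from the morphable model
        # expr: (expr_dim,) expression code, view_dir: (3,) viewing direction
        offsets = self.offset_net(expr).view(-1, 3)
        color = self.texture_net(torch.cat([expr, view_dir], dim=-1))
        return coarse_verts + offsets, color
```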
2022, Mar 22 — 1 minute read [Paper] [Video] [Bibtex]

Neural RGB-D Surface Reconstruction
We demonstrate how depth measurements can be incorporated into the neural radiance field formulation to produce more detailed and complete reconstruction results than using methods based on either color or depth data alone.
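A minimal sketch of how a depth measurement can enter the optimization, assuming a NeRF-style training loop that renders per-ray colors and expected depths; the paper itself goes further and optimizes a truncated signed distance representation, and all names here are illustrative.

```python
import torch

def rgbd_loss(pred_rgb, pred_depth, gt_rgb, gt_depth, lambda_depth=0.1):
    """Photometric loss plus a depth term for rays with valid sensor depth."""
    color_loss = ((pred_rgb - gt_rgb) ** 2).mean()
    valid = gt_depth > 0  # sensor depth maps typically contain holes
    depth_loss = ((pred_depth[valid] - gt_depth[valid]) ** 2).mean()
    return color_loss + lambda_depth * depth_loss
```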
2022, Mar 22 — 1 minute read [Paper] [Video] [Bibtex]

Mover: Human-Aware Object Placement for Visual Environment Reconstruction
We demonstrate that human-scene interactions (HSIs) can be leveraged to improve the 3D reconstruction of a scene from a monocular RGB video. Our key idea is that, as a person moves through a scene and interacts with it, we accumulate HSIs across multiple input images, and optimize the 3D scene to reconstruct a consistent, physically plausible and functional 3D scene layout.
2022, Mar 22 — 1 minute read [Paper] [Video] [Bibtex]

Advances in Neural Rendering
This state-of-the-art report on advances in neural rendering focuses on methods that combine classical rendering principles with learned 3D scene representations, often now referred to as neural scene representations. A key advantage of these methods is that they are 3D-consistent by design, enabling applications such as novel viewpoint synthesis of a captured scene.
2022, Jan 01 — 2 minute read [Paper] [Bibtex]

3DV 2021: Tutorial on the Advances in Neural Rendering
In this tutorial, we will talk about the advances in neural rendering, especially the underlying 2D and 3D representations that allow for novel viewpoint synthesis, controllability and editability. Specifically, we will discuss neural rendering methods based on 2D GANs, techniques using 3D Neural Radiance Fields or learnable sphere proxies. Besides methods that handle static content, we will talk about dynamic content as well.
2021, Nov 29 — 1 minute read [Video]

SIGGRAPH 2021: Course on the Advances in Neural Rendering
This course covers the advances in neural rendering over the years 2020-2021.
2021, Aug 08 — 1 minute read [Video] [Bibtex]

TransformerFusion: Monocular RGB Scene Reconstruction using Transformers
We introduce TransformerFusion, a transformer-based 3D scene reconstruction approach. From an input monocular RGB video, the video frames are processed by a transformer network that fuses the observations into a volumetric feature grid representing the scene; this feature grid is then decoded into an implicit 3D scene representation.
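A minimal sketch of attention-based frame fusion in this spirit: for each voxel, features contributed by the video frames are fused by a transformer-style attention layer. The learned query, dimensions, and single attention layer are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class VoxelFrameFusion(nn.Module):
    """Minimal sketch: fuse per-frame features into one feature per voxel."""

    def __init__(self, feat_dim=64, heads=4):
        super().__init__()
        # Learned query that "asks" the frames for the voxel's content.
        self.query = nn.Parameter(torch.randn(1, 1, feat_dim))
        self.attn = nn.MultiheadAttention(feat_dim, heads, batch_first=True)

    def forward(self, frame_feats):
        # frame_feats: (V, T, C) image features that T frames contribute to
        # each of V voxels (obtained by projecting voxels into the frames).
        q = self.query.expand(frame_feats.shape[0], -1, -1)
        fused, weights = self.attn(q, frame_feats, frame_feats)
        # `weights` (V, 1, T) indicate which frames were most informative.
        return fused.squeeze(1)  # (V, C) fused per-voxel feature
```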
2021, Jul 12 — 1 minute read [Paper] [Video] [Bibtex]

Dynamic Surface Function Networks for Clothed Human Bodies
We present a novel method for temporally coherent reconstruction and tracking of clothed humans using dynamic surface function networks, which can be trained with a monocular RGB-D sequence.
2021, Apr 12 — 1 minute read [Paper] [Video] [Bibtex]

Neural Parametric Models for 3D Deformable Shapes
We propose Neural Parametric Models (NPMs), a novel, learned alternative to traditional parametric 3D models, which does not require hand-crafted, object-specific constraints.
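A minimal sketch of such a learned parametric model, assuming a latent shape code decoded to a signed distance field and a latent pose code decoded to a deformation field; dimensions and layer choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NeuralParametricModel(nn.Module):
    """Minimal sketch: latent shape and pose codes decoded by two MLPs."""

    def __init__(self, shape_dim=256, pose_dim=256, hidden=512):
        super().__init__()
        # Shape code -> implicit surface (signed distance) in canonical space.
        self.shape_mlp = nn.Sequential(
            nn.Linear(3 + shape_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        # Shape and pose codes -> deformation of canonical points.
        self.pose_mlp = nn.Sequential(
            nn.Linear(3 + shape_dim + pose_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3))

    def forward(self, pts, shape_code, pose_code):
        # pts: (N, 3) canonical query points; codes are 1-D latent vectors.
        s = shape_code.expand(len(pts), -1)
        p = pose_code.expand(len(pts), -1)
        sdf = self.shape_mlp(torch.cat([pts, s], dim=-1))
        offsets = self.pose_mlp(torch.cat([pts, s, p], dim=-1))
        return sdf, pts + offsets  # canonical SDF and posed point positions
```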
2021, Apr 12 — 1 minute read [Paper] [Video] [Bibtex]

RetrievalFuse: Neural 3D Scene Reconstruction with a Database
In this paper, we introduce a new method that directly leverages scene geometry from the training database. It is able to reconstruct a high-quality scene from point cloud or low-resolution inputs using geometry patches from a database and attention-based refinement.
2021, Apr 12 — 1 minute read [Paper] [Video] [Bibtex]

NerFACE: Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction
We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face. To handle the dynamics of the face, we combine our scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions. We use volumetric rendering to generate images from this hybrid representation and demonstrate that such a dynamic neural scene representation can be learned from monocular input data only, without the need of a specialized capture setup.
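A minimal sketch of the conditioning mechanism described above: the morphable model's expression coefficients are concatenated to the encoded sample position before the radiance-field MLP, so density and color become expression-dependent. Layer sizes here are illustrative, and the full model additionally conditions on the viewing direction and per-frame latent codes.

```python
import torch
import torch.nn as nn

class DynamicRadianceField(nn.Module):
    """Minimal sketch: a NeRF-style MLP conditioned on expression codes."""

    def __init__(self, pos_dim=63, expr_dim=76, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + expr_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # RGB + density for each sample
        )

    def forward(self, encoded_pos, expr_code):
        # encoded_pos: (N, pos_dim) positionally encoded 3D ray samples
        # expr_code:   (N, expr_dim) morphable-model expression coefficients
        return self.mlp(torch.cat([encoded_pos, expr_code], dim=-1))
```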
2021, Mar 03 — 1 minute read [Paper] [Video] [Bibtex]

Neural Deformation Graphs for Globally-consistent Non-rigid Reconstruction
We introduce Neural Deformation Graphs for globally-consistent deformation tracking and 3D reconstruction of non-rigid objects. Specifically, we implicitly model a deformation graph via a deep neural network. This neural deformation graph does not rely on any object-specific structure and, thus, can be applied to general non-rigid deformation tracking.
2021, Mar 03 — 1 minute read [Paper] [Video] [Bibtex]

SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
We present a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations by learning to infer unobserved scene geometry and color in a self-supervised fashion.
2021, Mar 02 — 1 minute read [Paper] [Video] [Bibtex]

Neural Non-Rigid Tracking
We introduce a novel, end-to-end learnable, differentiable non-rigid tracker that enables state-of-the-art non-rigid reconstruction. By enabling gradient back-propagation through a non-rigid as-rigid-as-possible optimization solver, we are able to learn correspondences in an end-to-end manner such that they are optimal for the task of non-rigid tracking.
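To make the end-to-end idea concrete, here is a minimal sketch of back-propagating through an unrolled deformation solver: a few differentiable descent steps on a simplified ARAP-style energy keep the whole solve in the autograd graph, so a loss on the solved positions propagates back to the predicted correspondences. Note the paper uses a Gauss-Newton solver on a deformation graph; plain gradient descent on vertices here is a deliberate simplification.

```python
import torch

def unrolled_arap_solver(verts, corres, edges, steps=5, lr=0.1, w_reg=1.0):
    """Differentiably deform `verts` toward predicted correspondences.

    verts:  (V, 3) source vertices, corres: (V, 3) predicted target positions
    edges:  (E, 2) long tensor of mesh edges for the rigidity regularizer
    """
    x = verts.detach().clone().requires_grad_(True)
    for _ in range(steps):  # unrolled iterations stay in the autograd graph
        data = ((x - corres) ** 2).sum()  # pull vertices to correspondences
        rest = verts[edges[:, 0]] - verts[edges[:, 1]]
        cur = x[edges[:, 0]] - x[edges[:, 1]]
        reg = ((cur - rest) ** 2).sum()  # crude as-rigid-as-possible proxy
        (g,) = torch.autograd.grad(data + w_reg * reg, x, create_graph=True)
        x = x - lr * g  # descent step; gradients still flow into `corres`
    return x
```

A downstream loss on the returned positions then back-propagates into whatever network produced `corres`.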
2020, Sep 29 — 1 minute read [Paper] [Video] [Bibtex]

Egocentric Videoconferencing
We introduce a method for egocentric videoconferencing that enables hands-free video calls, for instance by people wearing smart glasses or other mixed-reality devices.
2020, Sep 28 — 2 minute read [Paper] [Video] [Bibtex]

Learning Adaptive Sampling and Reconstruction for Volume Visualization
We introduce a novel neural rendering pipeline, which is trained end-to-end to generate a sparse adaptive sampling structure from a given low-resolution input image, and reconstructs a high-resolution image from the sparse set of samples.
2020, Jul 22 — 1 minute read [Paper] [Bibtex]

Intrinsic Autoencoders for Joint Neural Rendering and Intrinsic Image Decomposition
We propose an autoencoder for joint generation of realistic images from synthetic 3D models while simultaneously decomposing real images into their intrinsic shape and appearance properties.
2020, Jun 23 — 1 minute read [Paper] [Bibtex]

CVPR 2020: Tutorial on Neural Rendering
Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. This tutorial summarizes the recent trends and applications of neural rendering.
2020, Apr 08 — 1 minute read [Paper] [Video] [Bibtex]

State of the Art on Neural Rendering
Neural rendering is a new and rapidly emerging field that combines generative machine learning techniques with physical knowledge from computer graphics, e.g., by the integration of differentiable rendering into network training. This state-of-the-art report summarizes the recent trends and applications of neural rendering.
2020, Apr 08 — 2 minute read [Paper] [Bibtex]

Adversarial Texture Optimization from RGB-D Scans
We present a novel approach for color texture generation using a conditional adversarial loss obtained from weakly-supervised views. Specifically, we propose an approach to produce photorealistic textures for approximate surfaces, even from misaligned images, by learning an objective function that is robust to these errors.
2020, Mar 19 — 1 minute read [Paper] [Video] [Bibtex]

Image-guided Neural Object Rendering
We propose a new learning-based novel view synthesis approach for scanned objects that is trained based on a set of multi-view images, where we directly train a deep neural network to synthesize a view-dependent image of an object.
2020, Jan 15 — 2 minute read [Paper] [Video] [Bibtex]

Neural Voice Puppetry: Audio-driven Facial Reenactment
Given an audio sequence of a source person or digital assistant, we generate a photo-realistic output video of a target person that is in sync with the audio of the source input.
2020, Jan 08 — 1 minute read [Paper] [Video] [Bibtex]

Deferred Neural Rendering: Image Synthesis using Neural Textures
Deferred Neural Rendering is a new paradigm for image synthesis that combines the traditional graphics pipeline with learnable Neural Textures. Both the neural textures and the deferred neural renderer are trained end-to-end, enabling us to synthesize photo-realistic images even when the original 3D content was imperfect.
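A minimal sketch of the pipeline, assuming the classical rasterizer provides a per-pixel UV map: a learnable neural texture is sampled at those UVs and translated to RGB by a small network. The paper uses hierarchical textures and a U-Net renderer; channel counts and resolutions here are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralTextureRenderer(nn.Module):
    """Minimal sketch: sample a learnable texture at rasterized UVs."""

    def __init__(self, tex_res=512, feat_dim=16):
        super().__init__()
        # Learnable feature map in texture space, trained end-to-end.
        self.neural_texture = nn.Parameter(
            torch.randn(1, feat_dim, tex_res, tex_res) * 0.01)
        # Stand-in for the deferred neural renderer (a U-Net in the paper).
        self.renderer = nn.Sequential(
            nn.Conv2d(feat_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1))

    def forward(self, uv):
        # uv: (B, H, W, 2) per-pixel texture coordinates in [-1, 1],
        # produced by rasterizing the (possibly imperfect) 3D geometry.
        tex = self.neural_texture.expand(uv.shape[0], -1, -1, -1)
        feats = F.grid_sample(tex, uv, align_corners=True)
        return self.renderer(feats)  # (B, 3, H, W) output image
```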
2019, Apr 28 — 2 minute read [Paper] [Video] [Bibtex]

DeepVoxels: Learning Persistent 3D Feature Embeddings
In this work, we address the lack of 3D understanding of generative neural networks by introducing a persistent 3D feature embedding for view synthesis. To this end, we propose DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D object without having to explicitly model its geometry.
2019, Apr 11 — 1 minute read [Paper] [Video] [Bibtex]

Research Highlight: Face2Face
Research highlight of the Face2Face approach, featured on the cover of Communications of the ACM in January 2019. Face2Face is an approach for real-time facial reenactment of a monocular target video. The method had significant impact in the research community and far beyond: it won several awards, e.g., the SIGGRAPH ETech Best in Show Award, was featured in countless media articles (e.g., NYT, WSJ, Spiegel), and reached millions of views on social media.
2019, Jan 01 — 1 minute read [Paper] [Video] [Bibtex]

ECCV 2018: Tutorial on Face Tracking and its Applications
This invited tutorial covers monocular face tracking techniques and discusses their possible applications. It is based on our Eurographics state-of-the-art report.
2018, Sep 08 — 1 minute read [Paper] [Bibtex]

Deep Video Portraits
Our novel approach enables photo-realistic re-animation of portrait videos using only an input video. The core of our approach is a generative neural network with a novel space-time architecture. The network takes as input synthetic renderings of a parametric face model, based on which it predicts photo-realistic video frames for a given target actor.
2018, May 29 — 1 minute read [Paper] [Video] [Bibtex]

HeadOn: Real-time Reenactment of Human Portrait Videos
HeadOn is the first real-time reenactment approach for complete human portrait videos that enables transfer of torso and head motion, face expression, and eye gaze. Given a short RGB-D video of the target actor, we automatically construct a personalized geometry proxy that embeds a parametric head, eye, and kinematic torso model. A novel reenactment algorithm employs this proxy to map the captured motion from the source to the target actor.
2018, May 29 — 1 minute read [Paper] [Video] [Bibtex]

InverseFaceNet: Deep Monocular Inverse Face Rendering
We introduce InverseFaceNet, a deep convolutional inverse rendering framework for faces that jointly estimates facial pose, shape, expression, reflectance and illumination from a single input image. This enables advanced real-time editing of facial imagery, such as appearance editing and relighting.
2018, May 16 — 1 minute read [Paper] [Video] [Bibtex]

State of the Art on Monocular 3D Face Reconstruction, Tracking, and Applications
This report summarizes recent trends in monocular facial performance capture and discusses its applications, which range from performance-based animation to real-time facial reenactment. We focus on methods where the central task is to recover and track a three-dimensional model of the human face using optimization-based reconstruction algorithms.
2018, Apr 24 — 1 minute read [Paper] [Bibtex]

Eurographics 2018: State of the Art on Monocular 3D Face Reconstruction, Tracking, and Applications
This state-of-the-art report session summarizes recent trends in monocular facial performance capture and discusses its applications, which range from performance-based animation to real-time facial reenactment. We focus on methods where the central task is to recover and track a three-dimensional model of the human face using optimization-based reconstruction algorithms.
2018, Apr 24 — 1 minute read [Paper] [Bibtex]

FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality
We propose FaceVR, a novel image-based method that enables video teleconferencing in VR based on self-reenactment. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD).
2018, Mar 21 — 1 minute read [Paper] [Video] [Bibtex]

Dissertation: Face2Face - Facial Reenactment
This dissertation summarizes work in the field of markerless motion tracking, face reconstruction, and their applications. In particular, it presents real-time facial reenactment, which enables the transfer of facial expressions from one video to another.
2017, Oct 16 — 2 minute read [Paper] [Bibtex]

FaceForge: Markerless Non-Rigid Face Multi-Projection Mapping
In this paper, we introduce FaceForge, a multi-projection mapping system that is able to alter the appearance of a non-rigidly moving human face in real time.
2017, Oct 10 — 1 minute read [Paper] [Video] [Bibtex]

SIGGRAPH Emerging Technologies: Demo of FaceVR
We present a novel method for the interactive markerless reconstruction of human heads using a single commodity RGB-D sensor. Our entire reconstruction pipeline is implemented on the graphics processing unit and allows us to obtain high-quality reconstructions of the human head using an interactive and intuitive reconstruction paradigm.
2017, Aug 03 — 1 minute read [Paper] [Video] [Bibtex]

FaceInCar Demo at the National IT Summit 2016
We demonstrate the capabilities of the dense face fitting proposed in Face2Face in the challenging scenario of face tracking in a car, including occlusions and strongly varying lighting conditions.
2016, Nov 17 — 2 minute read

SIGGRAPH Emerging Technologies: Real-time Face Capture and Reenactment of RGB Videos
We show a demo of real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). Our goal is to animate the facial expressions of a target video by a source actor and re-render the manipulated output video in a photo-realistic fashion.
2016, Jul 28 — 1 minute read [Paper] [Video] [Bibtex]

GPU Technology Conference: Interactive Demo of Face2Face
Nvidia invited us to show a demo of our real-time facial reenactment system, Face2Face. Our goal is to animate the facial expressions of a target video by a source actor and re-render the manipulated output video in a photo-realistic fashion.
2016, Apr 07 — 1 minute read [Paper] [Video] [Bibtex]